https://bugs.exim.org/show_bug.cgi?id=1848

            Bug ID: 1848
           Summary: pcregrep outputs duplicate matches
           Product: PCRE
           Version: 8.38
          Hardware: x86
                OS: Linux
            Status: NEW
          Severity: bug
          Priority: medium
         Component: Code
          Assignee: p...@hermes.cam.ac.uk
          Reporter: d.f.fisc...@web.de
                CC: pcre-dev@exim.org

Created attachment 895
  --> https://bugs.exim.org/attachment.cgi?id=895&action=edit
test case

Attached is an input file for pcretest. With newlines expanded for better
readability, it searches for the multiline pattern

    match (\d+):
     (.)

in the text

    match 1:
     a
    match 2:
     b
    match 3:
     c
    match 4:
     d
    match 5:
     e

Note that pattern and text both end with a newline. The bug only appears when
this is the case. When the input file is run through pcretest, five matches are
found as expected.

    $ pcretest input
    PCRE version 8.38 2015-11-23
    ~match (\d+):\n (.)\n~Gm
    match 1:\n a\nmatch 2:\n b\nmatch 3:\n c\nmatch 4:\n d\nmatch 5:\n e\n
     0: match 1:\x0a a\x0a
     1: 1
     2: a
     0: match 2:\x0a b\x0a
     1: 2
     2: b
     0: match 3:\x0a c\x0a
     1: 3
     2: c
     0: match 4:\x0a d\x0a
     1: 4
     2: d
     0: match 5:\x0a e\x0a
     1: 5
     2: e
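
As an independent cross-check of the expected result, here is a minimal sketch
that runs the same pattern over the same text with Python's `re` module. Python
is only a stand-in for PCRE here (an assumption for illustration, not part of
the attached test case), and the variable names are made up.

```python
import re

# Pattern and subject copied from the report; Python's `re` stands in for PCRE.
pattern = re.compile(r"match (\d+):\n (.)\n", re.MULTILINE)

# Build "match 1:\n a\nmatch 2:\n b\n..." so the text ends with a newline.
subject = "".join(f"match {n}:\n {c}\n" for n, c in zip(range(1, 6), "abcde"))

# Collect the two capture groups of every non-overlapping match.
matches = [m.groups() for m in pattern.finditer(subject)]

for number, letter in matches:
    print(f"/{number}/{letter}")
```

Each of the five records is reported once, which agrees with the pcretest
output above.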

But when the same is attempted using pcregrep instead, the second match is
printed twice, the third three times, the fourth four times, et cetera.

    $ tail -n1 input | sed 's/\\n/\n/g' | \
        pcregrep --om-separator / -Mo0 -o1 -o2 \
        "$(pcregrep -o1 '~(.+)~' input)"
    match 1:
     a
    /1/a
    match 2:
     b
    /2/b
    match 3:
     c
    /3/c
    match 4:
     d
    /4/d
    match 5:
     e
    /5/e
    match 2:
     b
    /2/b
    match 3:
     c
    /3/c
    match 4:
     d
    /4/d
    match 5:
     e
    /5/e
    match 3:
     c
    /3/c
    match 4:
     d
    /4/d
    match 5:
     e
    /5/e
    match 4:
     d
    /4/d
    match 5:
     e
    /5/e
    match 5:
     e
    /5/e

Instead, pcregrep should output each correctly found match only a single time.
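
To make the duplication pattern easy to verify, here is a small sketch that
tallies the "/N/x" lines from the pcregrep output quoted above. The list is
transcribed by hand from this report; nothing here calls pcregrep itself.

```python
from collections import Counter

# The "/N/x" separator lines from the pcregrep output in this report, in order.
reported = (
    "/1/a /2/b /3/c /4/d /5/e "
    "/2/b /3/c /4/d /5/e "
    "/3/c /4/d /5/e "
    "/4/d /5/e "
    "/5/e"
).split()

# Count how often each match was printed.
counts = Counter(reported)

for line, n in sorted(counts.items()):
    print(line, n)
```

Match N is printed N times, for 15 output records in total where 5 are
expected.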
