[bug #16421] regexp bug in grep -P or libpcre

Tony Abou-Assaleh Sun, 07 Oct 2007 21:10:52 -0700

Update of bug #16421 (project grep):

                  Status:                    None => Confirmed


    _______________________________________________________

Follow-up Comment #1:

I see similar behaviour in grep-2.5.3 and libpcre 6.7.

Also:

$ printf "a\nb\n" | src/grep -P '[^a]'
a
b
$ printf "a\nb\n" | src/grep -P '^[^a]'
b
$ printf "a\nb\n" | src/grep -P '[^a]$'
b
$ printf "a\nb\n" | src/grep -P '[^a][^b]'
b
$ printf "a\nb\n" | src/grep -P '[\n]'
a
b
$ printf "a\nb\n" | src/grep -P '[^a][\n]'
b

It appears that grep passes the end-of-line character to PCRE as part of the
string, where [^a] matches it. This agrees with PCRE's documented behaviour.
From the pcrepattern man page:

"The newline character is never treated in any special way in character
classes, whatever the setting of the PCRE_DOTALL or PCRE_MULTILINE options is.
A class such as [^a] will always match a newline."

The question is: should grep pass the end-of-line character of each line to PCRE?
To be consistent, I think the answer is no.

Care must be taken to handle -z and binary files properly.
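
To illustrate the difference between the two approaches, here is a rough
sketch in Python. It rests on several assumptions: Python's re module stands
in for PCRE (both let a negated class such as [^a] match a newline), the
function name matching_lines and its parameters strip_eol and eol are made up
for this example, and matching one record at a time is a simplification, not
a description of how grep actually hands its buffer to PCRE.

```python
import re


def matching_lines(pattern, data, strip_eol, eol="\n"):
    """Return the records of `data` that match `pattern`, grep-style.

    strip_eol=False hands each record to the matcher with its terminator
    still attached; strip_eol=True removes the terminator first.
    `eol` is the record terminator: "\n" normally, "\0" to imitate -z.
    This sketch assumes every record, including the last, is terminated.
    """
    regex = re.compile(pattern)
    records = data.split(eol)
    # Drop the empty piece that follows the final terminator.
    if records and records[-1] == "":
        records.pop()
    hits = []
    for record in records:
        subject = record if strip_eol else record + eol
        if regex.search(subject):
            hits.append(record)
    return hits


if __name__ == "__main__":
    data = "a\nb\n"
    # Terminator kept: [^a] matches the "\n" after "a", so both lines hit.
    print(matching_lines("[^a]", data, strip_eol=False))  # ['a', 'b']
    # Terminator stripped: only "b" contains a character other than "a".
    print(matching_lines("[^a]", data, strip_eol=True))   # ['b']
```

With eol="\0" the same function imitates -z, where the record terminator is a
NUL byte and any newline inside a record is ordinary data that [^a] may still
match.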

    _______________________________________________________

Reply to this item at:

  <http://savannah.gnu.org/bugs/?16421>

_______________________________________________
  Message sent via/by Savannah
  http://savannah.gnu.org/
