URL:
<http://savannah.gnu.org/bugs/?36703>
Summary: Change of behaviour
Project: grep
Submitted by: None
Submitted on: Thu 21 Jun 2012 22:42:19 UTC
Category: None
Severity: 3 - Normal
Item Group: None
Status: None
Privacy: Public
Assigned to: None
Open/Closed: Open
Discussion Lock: Any
_______________________________________________________
Details:
Before 7a0ad00f634237b753572378289d76fa8f1c5942:
$ grep 'foo.*\<bar' /tmp/t
fooxxxxx_bar
$
7a0ad00f634237b753572378289d76fa8f1c5942 and after:
$ src/grep 'foo.*\<bar' /tmp/t
$
(/tmp/t contains only that one line.)
The grep that I built for bisecting links against Ubuntu 10.10's libpcre3,
which "apt-cache show libpcre3" reports as:
Architecture: amd64
Source: pcre3
Version: 8.02-1
I'm not sure how relevant pcre is, as I'm not using -P.
I don't feel qualified to determine which of the behaviours is correct, but
I hunted down this change in behaviour because the underscore not forming
part of the "word" surprised me.
It also turns out that before 7a0ad00, the output depended on whether I set
LC_ALL=C or LC_ALL=en_ZA.utf8; from 7a0ad00 onwards, the output is empty
regardless of which of those two locales I set.
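A minimal sketch of how the locale comparison could be scripted. The helper
name check_locale and the use of mktemp are illustrative assumptions, not part
of the original report. The counts printed depend on which grep is first on
PATH, and en_ZA.utf8 has to be installed for the second run to use it (if it is
not, grep will typically fall back to the C locale).

```shell
#!/bin/sh
# Hypothetical helper: run the pattern from this report under one locale
# and print "<locale>: <number of matching lines>".
check_locale() {
    loc=$1
    file=$2
    # grep -c prints the count even when nothing matches (exit status 1),
    # so the exit status is deliberately ignored here.
    count=$(env LC_ALL="$loc" grep -c 'foo.*\<bar' "$file" 2>/dev/null) || true
    printf '%s: %s\n' "$loc" "$count"
}

# Recreate the one-line test file described in the report.
tmp=$(mktemp)
printf 'fooxxxxx_bar\n' > "$tmp"

# Compare the two locales mentioned in the report.
for loc in C en_ZA.utf8; do
    check_locale "$loc" "$tmp"
done

rm -f "$tmp"
```

To test a freshly built binary rather than the installed one, replace grep
inside check_locale with the path to that binary, for example src/grep.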
In any case, this change of behaviour seems unrelated to the commit message:
commit 7a0ad00f634237b753572378289d76fa8f1c5942
Author: Paolo Bonzini
Date: Mon Apr 19 14:50:23 2010 +0200
dfa: optimize UTF-8 period
* NEWS: Document improvement.
* src/dfa.c (struct dfa): Add utf8_anychar_classes.
(add_utf8_anychar): New.
(atom): Simplify if/else nesting. Call add_utf8_anychar for ANYCHAR
in UTF-8 locales.
(dfaoptimize): Abort on ANYCHAR.
Submitter: Bernd Jendrissek (not logged in)