URL:
<http://savannah.gnu.org/bugs/?27275>
Summary: Wrong behaviour with GREP_USE_DFA=0
Project: grep
Submitted by: None
Submitted on: Tue 18 Aug 2009 09:03:34 AM UTC
Category: None
Severity: 3 - Normal
Item Group: None
Status: None
Privacy: Public
Assigned to: None
Open/Closed: Open
Discussion Lock: Any
_______________________________________________________
Details:
From Debian bug #538737 (http://bugs.debian.org/538737), which I confirmed
with grep version 2.5.4:
"Setting the (un-documented) environment variable GREP_USE_DFA to 0 causes
grep to generate incorrect results.
Start with a three line text file:
$ cat file.txt
 foo <= space
	foo <= tab
sfoo
Now use grep with GREP_USE_DFA set to 1. Notice correct match:
$ GREP_USE_DFA=1 grep '^\s*foo' file.txt
sfoo
Now try with GREP_USE_DFA set to 0.
$ GREP_USE_DFA=0 grep '^\s*foo' file.txt
 foo <= space
	foo <= tab
The program now matches the other two lines, but not the one that
should match.
This is a serious issue since disabling DFA is the default for
multi-byte encodings such as UTF-8.
You can get the same (wrong) results by setting LANG:
$ LANG=C grep '^\s*foo' file.txt
sfoo
$ LANG=en_US.UTF-8 grep '^\s*foo' file.txt
 foo <= space
	foo <= tab
"
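The steps in the quoted report can be collected into one script for retesting. This is a minimal sketch, not part of the original report. It assumes the first two lines of file.txt begin with a literal space and a literal tab, as their "<= space" and "<= tab" annotations suggest. The temporary directory and the run_case helper are hypothetical names added for illustration. GREP_USE_DFA is undocumented, so other grep versions may ignore it, and the lines selected will depend on the grep version and on the locales installed.

```shell
#!/bin/sh
# Reproduction sketch for the quoted report (assumptions are noted inline).

# Assumption: a scratch directory stands in for the reporter's working directory.
workdir=$(mktemp -d)
sample="$workdir/file.txt"

# Assumption: line 1 starts with a space and line 2 with a tab.
# printf writes them explicitly so they survive copy and paste.
printf ' foo <= space\n\tfoo <= tab\nsfoo\n' > "$sample"

# run_case ASSIGNMENT (hypothetical helper): run the pattern from the report
# with one environment variable set, and print the lines grep selects.
# "|| true" keeps grep's no-match exit status (1) from being treated as an error.
run_case() {
    echo "== $1 =="
    env "$1" grep '^\s*foo' "$sample" || true
}

run_case GREP_USE_DFA=1
run_case GREP_USE_DFA=0
run_case LANG=C
run_case LANG=en_US.UTF-8
```

If LC_ALL is set in the calling environment it takes precedence over LANG, so unset it before comparing the last two cases.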