URL:
<http://savannah.gnu.org/bugs/?32337>
Summary: [:alnum:] depends on locale
Project: grep
Submitted by: None
Submitted on: Thu 03 Feb 2011 12:07:11 UTC
Category: None
Severity: 3 - Normal
Item Group: None
Status: None
Privacy: Public
Assigned to: None
Open/Closed: Open
Discussion Lock: Any
_______________________________________________________
Details:
The grep man page states: "For example, [[:alnum:]]
means [0-9A-Za-z], except the latter form depends upon the C
locale and the ASCII character encoding, whereas the former is
independent of locale and character set."
However, in my UTF-8 environment (LANG=de_DE.UTF-8) I get:
$ echo -e "Haus\nHäuser\n" | LANG=de_DE.UTF-8 ./grep -E '^[[:alnum:]]+$'
Haus
Häuser
$ echo -e "Haus\nHäuser\n" | LANG=C ./grep -E '^[[:alnum:]]+$'
Haus
From the man page, I expected both invocations to match only the first line.
grep version: 2.7 (also confirmed on 2.5.3 and 2.5.4)
OS: Debian lenny
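
For anyone who needs ASCII-only matching in the meantime, here is a minimal sketch of a workaround. It assumes the installed grep is on PATH (the report above uses a locally built ./grep), and the function name ascii_alnum_lines is hypothetical, chosen only for illustration. It sets LC_ALL rather than LANG because LC_ALL takes precedence over both LANG and LC_CTYPE, so the result should not depend on what else is exported in the caller's environment.

```shell
# Hypothetical helper: print only the input lines that consist entirely
# of characters that [[:alnum:]] accepts in the C locale.
# LC_ALL=C applies to this one grep invocation only and overrides
# any LANG or LC_CTYPE setting inherited from the environment.
ascii_alnum_lines() {
    LC_ALL=C grep -E '^[[:alnum:]]+$'
}

# Same test data as in the report, without the trailing empty line.
printf 'Haus\nHäuser\n' | ascii_alnum_lines
```

With a current GNU grep this should print only "Haus", matching the LANG=C run shown above.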
_______________________________________________________
Reply to this item at:
<http://savannah.gnu.org/bugs/?32337>
_______________________________________________
Message sent via/by Savannah
http://savannah.gnu.org/