Follow-up Comment #1, bug #19637 (project grep):
Confirmed. So this is two bugs:
1) The man page should not say that \w equals [[:alnum:]], as the former
also includes the underscore.
2) \w does not match accented characters in a UTF-8 locale.
$ export LC_ALL=nl_NL
$ echo -e " ee\n ëë\n __\n" | src/grep -E '\w'
ee
ëë
__
$ echo -e " ee\n ëë\n __\n" | src/grep -E '[[:alnum:]]'
ee
ëë
[switch the Konsole's encoding from iso-8859-1 to utf8 and retype the lines
instead of recalling history]
$ export LC_ALL=nl_NL.utf8
$ echo -e " ee\n ëë\n __\n" | src/grep -E '\w'
ee
__
$ echo -e " ee\n ëë\n __\n" | src/grep -E '[[:alnum:]]'
ee
ëë
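
The comparison above can be automated with a small sketch. It assumes
that \w is meant to be equivalent to [_[:alnum:]] (which is what point 1
suggests the man page should say) and that a GNU grep is on the PATH as
"grep"; substitute src/grep to test a build tree. The function name
compare_w_alnum is hypothetical and not part of grep.

```shell
# Hypothetical helper, not part of grep.
# Usage: compare_w_alnum LOCALE TEXT
# Prints "same" if \w and [_[:alnum:]] select the same lines of TEXT
# under LOCALE, and "differ" otherwise.
compare_w_alnum() {
    locale_name=$1
    text=$2
    # \w is a GNU extension; the bracket expression is plain POSIX.
    with_w=$(printf '%s\n' "$text" | LC_ALL=$locale_name grep -E '\w')
    with_class=$(printf '%s\n' "$text" | LC_ALL=$locale_name grep -E '[_[:alnum:]]')
    if [ "$with_w" = "$with_class" ]; then
        echo same
    else
        echo differ
    fi
}

# ASCII-only sample in the C locale.
compare_w_alnum C ' ee
 __
 --'
```

To probe bug 2, pass nl_NL.utf8 (if that locale is installed) and add a
line with accented letters to the text; on an affected build the two
patterns would be expected to differ.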
_______________________________________________________
Reply to this item at:
<http://savannah.gnu.org/bugs/?19637>