Follow-up Comment #1, bug #19637 (project grep):

Confirmed.  These are two separate bugs:

1) The man page should not say that \w is equivalent to [[:alnum:]],
because \w also matches the underscore.
2) \w does not match accented characters in a UTF-8 locale.
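
The first point can be checked without any particular locale installed.
The sketch below assumes GNU grep (\w is a GNU extension) and uses a
hypothetical helper, count_matches, that is not part of grep.  It runs in
the C locale, where \w should behave like [[:alnum:]_] rather than
[[:alnum:]]:

```shell
# count_matches is a hypothetical helper for this illustration only.
# It counts how many of the two test lines match the given ERE in the
# C locale, so the result does not depend on which locales are installed.
count_matches() {
    printf ' ee\n __\n' | LC_ALL=C grep -c -E "$1"
}

count_matches '\w'            # 2: both " ee" and " __" match
count_matches '[[:alnum:]]'   # 1: only " ee" matches
count_matches '[[:alnum:]_]'  # 2: same result as \w
```

The underscore-only line is what separates the two classes, which is why
the man page wording is misleading.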


$ export LC_ALL=nl_NL
$ echo -e " ee\n ëë\n __\n" | src/grep -E '\w'
 ee
 ëë
 __
$ echo -e " ee\n ëë\n __\n" | src/grep -E '[[:alnum:]]'
 ee
 ëë

[Switch Konsole's encoding from ISO-8859-1 to UTF-8 and retype the commands
instead of recalling them from history, so that the ë is entered as UTF-8.]

$ export LC_ALL=nl_NL.utf8
$ echo -e " ee\n ëë\n __\n" | src/grep -E '\w'
 ee
 __
$ echo -e " ee\n ëë\n __\n" | src/grep -E '[[:alnum:]]'
 ee
 ëë
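
To take the terminal encoding out of the reproduction, the test input can
be generated from explicit bytes instead of typed characters.  This is
only a sketch: the helper names utf8_input and latin1_input are made up
for this example, and the locale name in the commented-out line is taken
from the transcript above and may not exist on every system (check with
"locale -a").

```shell
# Hypothetical helpers: emit the three test lines with "ë" written as
# explicit octal byte escapes, so the terminal encoding plays no role.
utf8_input()   { printf ' ee\n \303\253\303\253\n __\n'; }  # ë = C3 AB (UTF-8)
latin1_input() { printf ' ee\n \353\353\n __\n'; }          # ë = EB (ISO-8859-1)

# Show the raw bytes to confirm which encoding is being fed to grep.
utf8_input | od -An -tx1

# Then rerun the failing case (assumes this locale is installed):
# utf8_input | LC_ALL=nl_NL.utf8 src/grep -E '\w'
```

Feeding utf8_input to grep under the UTF-8 locale and latin1_input under
the ISO-8859-1 locale should reproduce the two transcripts without having
to retype anything.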


    _______________________________________________________

Reply to this item at:

  <http://savannah.gnu.org/bugs/?19637>

_______________________________________________
  Message sent via/by Savannah
  http://savannah.gnu.org/


