Can anyone enlighten me as to why, for certain characters, \W behaves differently depending on whether it's inside or outside a character class?
This sample program:
use encoding 'utf8';
$x = 'Großbritannien';
$\ = "\n";
print '1 ', $x =~ /(\W+)/;
print '2 ', $x =~ /([\W]+)/;
print '3 ', $x =~ /(\w+)/;
...prints:
1 
2 ß
3 Großbritannien
I do not understand why the Eszett matches [\W] in #2. The behavior is the same if I replace the Eszett with another non-ASCII "letter", e.g. "Ã".
-- Eric Cholet