Hi list,

Can anyone enlighten me as to why \W behaves differently depending
on whether it's inside or outside of a character class, for certain
characters?

This sample program:

use encoding 'utf8';
$x = 'Großbritannien';
$\ = "\n";
print '1 ', $x =~ /(\W+)/;
print '2 ', $x =~ /([\W]+)/;
print '3 ', $x =~ /(\w+)/;

...prints:

1
2 ß
3 Großbritannien

I do not understand why the Eszett matches [\W] in #2. Same behavior
if I replace the Eszett with another non-ASCII "letter".

--
Eric Cholet


