Andreas J Koenig wrote in perl.unicode :
>>>>>> On Wed, 31 Dec 2003 16:21:36 +0100, Eric Cholet <[EMAIL PROTECTED]> said:
>
> > Can anyone enlighten me as to why \W behaves differently depending
> > on wether it's inside or outside of a character class, for certain
> > characters:
>
> I have reported this as bug 18281
>
> http://guest:[EMAIL PROTECTED]/rt3/Ticket/Display.html?id=18281
>
> I don't think that it is documented by now and I cannot spot a good
> place where it needs to be documented. perlre.pod and perlunicode.pod
> seem the natural places.

And apparently fixing it is not trivial.

Does something like this suit you? This can at least make its way into 5.8.3.

Change 22031 by [EMAIL PROTECTED] on 2004/01/01 16:30:13

	Document that /[\W]/ doesn't work, unicode-wise
	(see bug #18281)

Affected files ...

... //depot/perl/pod/perlunicode.pod#130 edit

Differences ...

==== //depot/perl/pod/perlunicode.pod#130 (text) ====

@@ -166,6 +166,10 @@
 Unicode properties database.  C<\w> can be used to match a Japanese
 ideograph, for instance.
 
+(However, and as a limitation of the current implementation, using
+C<\w> or C<\W> I<inside> a C<[...]> character class will still match
+with byte semantics.)
+
 =item *
 
 Named Unicode properties, scripts, and block ranges may be used like
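
For anyone who wants to check how a given perl build behaves, here is a minimal sketch that compares \W outside and inside a bracketed character class. The helper name compare_nonword and the sample characters are assumptions for illustration only; they come neither from the bug report nor from the patch. No expected output is given, since the result depends on the perl version.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the patch): report whether \W matches
# a one-character string outside and inside a [...] character class.
sub compare_nonword {
    my ($char) = @_;
    utf8::upgrade($char);    # force the UTF-8 internal representation
    return {
        outside => ($char =~ /\W/   ? 1 : 0),
        inside  => ($char =~ /[\W]/ ? 1 : 0),
    };
}

# Sample characters are arbitrary picks: an ASCII letter, ASCII
# punctuation, a Latin-1 letter, and a Japanese hiragana letter.
for my $char ("a", "!", "\x{e9}", "\x{3042}") {
    my $r = compare_nonword($char);
    printf "U+%04X  /\\W/=%d  /[\\W]/=%d%s\n",
        ord($char), $r->{outside}, $r->{inside},
        ($r->{outside} == $r->{inside} ? "" : "  <-- differs");
}
```

Any line flagged with "differs" would indicate a character for which the two forms disagree on that build.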