Just replying to my own message, as I have had no answer at all... Was it a completely silly question? Is there something I missed?
Help!

> Hi list,
>
> I am running perl 5.6.1 on a redhat box, and I have come across this
> weird (bug|feature|annoying thing). If this problem has been raised
> before please give me a reference to the "F" manual :-)
>
> TEST SCRIPT:
> ============
>
> use strict;
> use utf8;
>
> main();
> sub main
> {
>     # \x{A9} is the copyright sign
>     #
>     my $data = "Copyright \x{A9} 2001-2002 MKDoc Ltd";
>     my $dlm = '(?:\p{IsSpace}|\p{IsPunct})';
>     my $re = 'MKDoc';
>
>     print "BEFORE: $data\n";
>     my @split = $data =~ /^(.*?$dlm)($re)($dlm.*?)$/ism;
>     $data = join '', @split;
>     print "AFTER : $data\n";
> }
>
> 1;
>
>
> And here is what I get
>
> [jhiver@frogette mkdoc]$ perl -w test2.pl
> BEFORE: Copyright © 2001-2002 MKDoc Ltd
> AFTER : Copyright © 2001-2002 MKDoc Ltd
> [jhiver@frogette mkdoc]$
>
>
> My terminal doesn't support UTF-8, which in this case is good because I
> can see all the characters... surprise, using regex captures seems to
> remove the string's utf8ness although the string IS utf8 and 'use utf8' is
> there...

--
IT'S TIME FOR A DIFFERENT KIND OF WEB
================================================================
Jean-Michel Hiver - Software Director
[EMAIL PROTECTED]
+44 (0)114 221 4968
================================================================
VISIT HTTP://WWW.MKDOC.COM