Just replying to my own message, as I have had no answer at all... Was it a
completely silly question? Is there something I missed?

Help!

> Hi list,
> 
>   I am running Perl 5.6.1 on a Red Hat box, and I have come across this
>   weird (bug|feature|annoying thing). If this problem has been raised
>   before, please give me a reference to the "F" manual :-)
> 
> TEST SCRIPT:
> ============
> 
> use strict;
> use utf8;
> 
> main();
> sub main
> {
>     # \x{A9} is the copyright sign
>     #
>     my $data = "Copyright \x{A9} 2001-2002 MKDoc Ltd";
>     my $dlm = '(?:\p{IsSpace}|\p{IsPunct})';
>     my $re = 'MKDoc';
> 
>     print "BEFORE: $data\n";
>     my @split = $data =~ /^(.*?$dlm)($re)($dlm.*?)$/ism;
>     $data = join '', @split;
>     print "AFTER : $data\n";
> }
> 
> 1;
> 
> 
> And here is what I get:
> 
> [jhiver@frogette mkdoc]$ perl -w test2.pl
> BEFORE: Copyright © 2001-2002 MKDoc Ltd
> AFTER : Copyright © 2001-2002 MKDoc Ltd
> [jhiver@frogette mkdoc]$ 
> 
> 
> My terminal doesn't support UTF-8, which in this case is good because I
> can see all the characters... Surprise: using regex captures seems to
> remove the string's utf8ness, although the string IS utf8 and 'use utf8'
> is there...

-- 
IT'S TIME FOR A DIFFERENT KIND OF WEB
================================================================
  Jean-Michel Hiver - Software Director
  [EMAIL PROTECTED]
  +44 (0)114 221 4968
================================================================
                                      VISIT HTTP://WWW.MKDOC.COM
