I have found out how to create a UTF-8 string: insert a character with a
code above 255 (a BOM should do it) and then strip it off later. Hacky,
but it works.
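A minimal sketch of that hack, for illustration only. It assumes a Perl recent enough to provide utf8::is_utf8 (5.8.1 or later, so not the 5.6.0 mentioned below), and force_utf8 is a made-up name, not an existing function:

```perl
use strict;
use warnings;

# Hypothetical helper: force the internal UTF-8 representation by
# prepending U+FEFF (the BOM) and then removing it again.
sub force_utf8 {
    my ($str) = @_;               # work on a copy; the caller's string is untouched
    $str = "\x{FEFF}" . $str;     # joining with a char > 255 upgrades the string
    substr($str, 0, 1, '');       # drop the BOM; the UTF-8 flag stays set
    return $str;
}

my $latin1 = "caf\xe9";           # contains one character in the 128..255 range
my $forced = force_utf8($latin1);

# Report the internal flag before and after; the characters are unchanged.
printf "flag before: %d, flag after: %d, same text: %s\n",
    (utf8::is_utf8($latin1) ? 1 : 0),
    (utf8::is_utf8($forced) ? 1 : 0),
    ($forced eq $latin1 ? "yes" : "no");
```

The string compares equal to the original because only the internal representation changes, not the characters it holds.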

But how do I change the way a string is interpreted?

use utf8;

# other code

sub pretty
{
    my ($str) = @_;

#    $str =~ tr///CC;    # This crashes Perl 5.6.0 (ActivePerl)
#    use bytes;          # This does nothing
    $str =~ s/([\xc0-\xff][\x80-\xbf]+)/'\x{'.sprintf("%04x", unpack("U", $1)).'}'/oge;
    $str;
}

$str is interpreted as UTF-8 (SvUTF8 is set).

Any suggestions?

And a follow-up question:

How do I make a UTF-8 string containing codes 127 < x < 256 without having
to insert a BOM at the front and then strip it off?

Martin Hosken

PS. Apologies for the vague previous question.
