On Fri, Jan 20, 2006 at 10:14:36AM +0800, Orlando Andico wrote:
> On 1/19/06, Sherwin Daganato <[EMAIL PROTECTED]> wrote:
> >
> > Have you tried utf8::upgrade()?
> 
> my understanding is that the utf8 module is deprecated in perl 5.8+

That's not what 'perldoc perlunicode' says. It only states that "use utf8"
is no longer necessary in Perl 5.8+ to declare the operations in the
current block Unicode-aware; the pragma is now needed only when the
script source itself is written in UTF-8. So you can still use the
functions provided by the utf8 module (like utf8::is_utf8()).
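
A minimal sketch of that point, assuming Perl 5.8.1 or later. The
variable names are only for illustration; the utf8:: functions are
called fully qualified and the script never says "use utf8":

```perl
use strict;
use warnings;

# No "use utf8" here; the utf8:: functions are called fully qualified.
my $s = "\xE7\xB9\x81";                    # three bytes, UTF8 flag off

my $flag_before = utf8::is_utf8($s) ? 1 : 0;
utf8::upgrade($s);                         # same characters, flag turned on
my $flag_after  = utf8::is_utf8($s) ? 1 : 0;

print "flag before: $flag_before, after: $flag_after, length: ",
      length($s), "\n";
```

Note that utf8::upgrade() only changes the internal representation; the
string still holds the same three characters.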

> a completely "dumb" implementation would be, if the sequence \xDD is seen,
> replace it with chr(hex(DD)). doesn't appeal to my sense of elegance though.

That will work for '\xDD' but not for "\xDD", since Perl interpolates the
latter into a single byte before your regex ever sees it. Either way, the
UTF8 flag of the resulting scalar is still off, so you will still need
to convert it to utf8.
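
To illustrate the difference, here is a rough sketch of the chr(hex())
substitution applied to a single-quoted string. The regex and the
variable names are my own assumptions about how the "dumb"
implementation might look, not something taken from the earlier post:

```perl
use strict;
use warnings;

my $literal      = '\xE7\xB9\x81';   # single quotes: 12 characters, backslashes kept
my $interpolated = "\xE7\xB9\x81";   # double quotes: already 3 bytes

# Replace each literal \xDD sequence with the byte it names.
(my $bytes = $literal) =~ s/\\x([0-9A-Fa-f]{2})/chr(hex($1))/ge;

# The substitution yields the same bytes Perl produces by interpolation,
# and the UTF8 flag on the result is still off.
my $same    = ($bytes eq $interpolated) ? 1 : 0;
my $flag_on = utf8::is_utf8($bytes)     ? 1 : 0;

print "lengths: ", length($literal), " -> ", length($bytes), "\n";
print "same as interpolated: $same, UTF8 flag: $flag_on\n";
```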

'perldoc perlunicode' suggests the use of unpack and pack to convert to
utf8, e.g.

pack('U*', unpack('C*', "\xE7\xB9\x81\xE9\xAB\x94\xE4\xB8\xAD\xE6\x96\x87"))
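
It may help to see what that expression actually returns. In the sketch
below, the pack/unpack line is the one above; the Encode::decode() call
is not part of the suggestion, I only add it as an assumption for
comparison, since it interprets the bytes as UTF-8 rather than just
turning the flag on:

```perl
use strict;
use warnings;
use Encode qw(decode);

my $bytes = "\xE7\xB9\x81\xE9\xAB\x94\xE4\xB8\xAD\xE6\x96\x87";

# Each byte becomes one character with the same code point (0xE7, 0xB9, ...);
# the result has the UTF8 flag on but still holds 12 characters.
my $upgraded = pack('U*', unpack('C*', $bytes));

# Comparison only: decode the 12 bytes as UTF-8, giving 4 characters.
my $decoded = decode('UTF-8', $bytes);

printf "upgraded: %d chars, decoded: %d chars, first code point: U+%04X\n",
       length($upgraded), length($decoded), ord($decoded);
```

Which of the two you want depends on whether the rest of the program
expects one character per byte or one character per code point.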

For the literal '\xDD' form, as an alternative to chr(hex(DD)) + regex,
you may also use unpack and pack, e.g.

pack('H2' x 12, unpack('x2A2' x 12, 
'\xE7\xB9\x81\xE9\xAB\x94\xE4\xB8\xAD\xE6\x96\x87'));
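
The hardcoded 12 is just the number of \xDD groups in that string. A
sketch that computes the count instead, assuming the input consists of
nothing but 4-character \xDD groups (the variable names are only for
illustration):

```perl
use strict;
use warnings;

my $literal = '\xE7\xB9\x81\xE9\xAB\x94\xE4\xB8\xAD\xE6\x96\x87';

# Assumption: the string is made up solely of 4-character \xDD groups.
my $n = length($literal) / 4;

# unpack: skip the 2 characters "\x", then take the 2 hex digits, $n times.
# pack:   turn each pair of hex digits back into one byte.
my $bytes = pack('H2' x $n, unpack('x2A2' x $n, $literal));

print "groups: $n, bytes: ", length($bytes), "\n";
```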


HTH
-- 
$_=q:; # SHERWIN #
70;72;69;6e;74;20;
27;4a;75;73;74;20;
61;6e;6f;74;68;65;
72;20;50;65;72;6c;
20;6e;6f;76;69;63;
65;27;:;;s=~?(..);
?=pack q$C$,hex$1;
;;;=egg;;;;eval;;;
_________________________________________________
Philippine Linux Users' Group (PLUG) Mailing List
[email protected] (#PLUG @ irc.free.net.ph)
Read the Guidelines: http://linux.org.ph/lists
Searchable Archives: http://archives.free.net.ph
