While I realise this diverges slightly from the original posting, I think some background is useful when dealing with Japanese text. There are several text encodings in use, the most widely used being Shift_JIS and EUC-JP. Without going into too many details, Shift_JIS, which Microsoft had a hand in creating to its usual exacting (lack of) standards, is ticklish to deal with: the second byte of a two-byte character can fall in the ASCII range (0x5C, the backslash, for example), which trips up byte-oriented pattern matching. So in the past, when processing Japanese text, Japanese perlers used a four-step conversion:

(1) input converted from Shift_JIS to EUC_JP
(2) EUC_JP encoded data processed
(3) EUC_JP data converted back to Shift_JIS
(4) output
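
As a rough sketch of those four steps: the code below uses the Encode module that ships with perl 5.8 as a stand-in for whatever conversion library was actually used at the time, so treat it as an illustration only. The helper name via_euc_jp and its callback argument are made up for this example.

```perl
use strict;
use warnings;
use Encode qw(from_to);

# Hypothetical helper: takes Shift_JIS bytes and a code ref that
# works on EUC-JP bytes, and returns Shift_JIS bytes for output.
sub via_euc_jp {
    my ($sjis_bytes, $process) = @_;
    my $data = $sjis_bytes;                 # from_to converts in place, so copy first
    from_to($data, 'shiftjis', 'euc-jp');   # (1) Shift_JIS -> EUC-JP
    $data = $process->($data);              # (2) process the EUC-JP bytes
    from_to($data, 'euc-jp', 'shiftjis');   # (3) EUC-JP -> Shift_JIS
    return $data;                           # (4) caller prints this
}
```

The callback still sees bytes rather than characters, but in EUC-JP every byte of a Japanese character has its high bit set, so it cannot be mistaken for an ASCII character.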


Perl 5.8.0 has built-in Unicode support; however, the same four-step process is still required for Shift_JIS data:



(1) input converted from Shift_JIS to UTF8 (Unicode)
(2) UTF8 encoded data processed
(3) UTF8 data converted back to Shift_JIS
(4) output
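
The Unicode variant could look something like the following, assuming perl 5.8 or later with the bundled Encode module (I can't say whether this runs under MacPerl). The name process_sjis_line and the callback are hypothetical, chosen just for this sketch.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical helper: decode Shift_JIS bytes into a Perl character
# string, let the caller work on characters, then re-encode.
sub process_sjis_line {
    my ($sjis_bytes, $process) = @_;
    my $text = decode('shiftjis', $sjis_bytes);   # (1) bytes -> Unicode characters
    $text = $process->($text);                    # (2) process characters, not bytes
    return encode('shiftjis', $text);             # (3) characters -> Shift_JIS bytes
}

# (4) output, e.g.:
#   print $outhandle process_sjis_line($_, sub { $_[0] });
```

Inside the callback, length, substr and regular expressions operate on whole characters. perl 5.8 also lets you attach the conversion to a filehandle with an I/O layer, e.g. open(my $in, "<:encoding(shiftjis)", $infilename), which does steps (1) and (3) for you on read and write.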


MacPerl itself has historically not been aware of locales beyond the ASCII-based ones (I'm not sure about the latest version), which is of course why MacJPerl exists:


http://world.std.com/~habilis/macjperl



HTH

Robin





On Wednesday, March 19, 2003, at 05:58 am, Scott R. Godin wrote:

Jon Reinsch wrote:

I use a simple MacPerl program to archive my email: I save each message to
a text file, then run the program to append the messages to a text file in
date/time order. Omitting some details, the heart of the program is just:


if (open(inhandle, $infilename))
{
    while(<inhandle>)
    { print $outhandle $_; }
}

My problem is that some of my email contains Japanese text. I'm running OS
9.2.1 with the Japanese Language Kit installed. But when Japanese text
goes through the program it comes out as garbage like
"bÉvÇ•ñŽéwÇ…ÇÝǻLJÇÃÉiÉrÉQÅ[É^Å". Obviously the encoding is being lost,
but I don't have the slightest idea how to fix this. Is there a module out
there that would provide a simple answer to this problem? Maybe it's just
a fantasy, but I'm hoping for something simple like
print $outhandle convertJapaneseText($_);


This might seem very simple but have you looked into

use locale

at all? Try looking at perldoc perllocale for some informative text.

dunno if this will help but it's where my instincts pointed me...



