Hello Jon,

Japanese email generally uses the encoding called iso-2022-jp (or "JIS" code) to carry text over the Internet, while Mac OS (classic) uses Shift-JIS encoding to display Japanese. So you will have to convert your email text from iso-2022-jp to Shift-JIS. There is a Japanese code converter script named "jcode.pl" that you can get from:
<ftp://ftp.iij.ad.jp/pub/IIJ/dist/utashiro/perl/>


Sample code:
require "jcode.pl";

# Convert each input line to Shift-JIS; jcode.pl guesses the input encoding.
while (defined($line = <>)) {
    $res = &jcode::sjis($line);
    print $res;
}

I have not tested this code, so I am not sure if it will work, but it should at least give you some idea...
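
For comparison, here is a rough sketch of the same iso-2022-jp to Shift-JIS conversion using the Encode module instead of jcode.pl. It assumes a Perl that ships Encode (5.8 or later), which may not be true of your MacPerl. The helper name jis_to_sjis and the sample string are made up for illustration only.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical helper (not part of any module): take one line of
# iso-2022-jp bytes and return the same text as Shift-JIS bytes.
sub jis_to_sjis {
    my ($line) = @_;
    my $chars = decode('iso-2022-jp', $line);   # bytes -> Perl characters
    return encode('shiftjis', $chars);          # characters -> Shift-JIS bytes
}

# Sample input: the two kanji of "Nihon" as they appear in a mail body,
# wrapped in the JIS escape sequences ESC $ B ... ESC ( B.
my $jis  = "\e\$BF|K\\\e(B\n";
my $sjis = jis_to_sjis($jis);

# Show the converted bytes in hex so they can be checked without a
# Japanese-capable viewer.
print unpack('H*', $sjis), "\n";
```

In an archiving loop, the idea would be to print jis_to_sjis($_) to the output handle instead of $_. Converting one line at a time should be safe for mail, because iso-2022-jp text is expected to switch back to ASCII before each line ends.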

Best regards,

Nobumi Iyanaga
Tokyo,
Japan

On Tuesday, March 18, 2003, at 03:22 PM, Jon Reinsch wrote:

Thanks for your reply. The intended ultimate viewer app is Clarisworks, but
I'd be happy with SimpleText. The Japanese displays correctly in Outlook
Express. When I save the message to a text file, it displays fine in
SimpleText. But after the script appends the message to the archive text
file, it's just garbage. I guess I should look at these files with a hex
editor, but I don't have one on this machine right now.


on 3/17/03 9:53 PM, Joshua Juran at [EMAIL PROTECTED] wrote:

I have a different problem with receiving emails with Japanese text, but my
Bayesian spam filter takes care of it quite nicely. :-)


More constructive response follows, interleaved.

--On Monday, March 17, 2003 8:50 PM -0800 Jon Reinsch
<[EMAIL PROTECTED]> wrote:

I use a simple MacPerl program to archive my email: I save each message
to a text file, then run the program to append the messages to a text
file in date/time order. Omitting some details, the heart of the program
is just:


if (open (inhandle,"$infilename"))
{
while(<inhandle>)
{ print $outhandle $_; }
}

Basically works like 'cat'. Or 'Catenate', if you use MPW.


My problem is that some of my email contains Japanese text. I'm running OS
9.2.1 with the Japanese Language Kit installed. But when Japanese text
goes through the program it comes out as garbage like
"bÉvÇ?ñ?éwÇ?Ç?Ç»Ç?ÇÃÉiÉrÉQÅ[É^Å". Obviously the encoding is being lost,
but I don't have the slightest idea how to fix this. Is there a module
out there that would provide a simple answer to this problem? Maybe it's
just a fantasy, but I'm hoping for something simple like
print $outhandle convertJapaneseText($_);

Convert it to what? I suspect that the problem appears intractable because
in fact, it doesn't really exist. Your script does a byte-for-byte copy,
so nothing is getting lost.


The real problem (as far as I can tell) is that the Japanese text is not
being *displayed* properly. It's multi-byte data -- no conversion will adapt
it to single-byte display. You'd have the same problem with HTML or
base64-encoded content -- you'll see the raw message, not the
rendered/decoded content.


I don't know what app you're using to ultimately view the messages
(MacPerl?), but that's where your problem (as well as its solution) lies.


Let me know if this helps, or if I'm completely off-base. :-)

Josh