I use a simple MacPerl program to archive my email: I save each message to a text file, then run the program to append the messages to a text file in date/time order. Omitting some details, the heart of the program is just:
if (open(INHANDLE, $infilename)) { while (<INHANDLE>) { print $outhandle $_; } }
My problem is that some of my email contains Japanese text. I'm running OS 9.2.1 with the Japanese Language Kit installed. But when Japanese text goes through the program it comes out as garbage like "bÉvÇ•ñŽéwÇ…ÇÝǻLJÇÃÉiÉrÉQÅ[É^Å".
This looks like garbage, but it is not.
Obviously the encoding is being lost, but I don't have the slightest idea how to fix this.
Nothing lost, no need to fix anything.
Is there a module out there that would provide a simple answer to this problem?
There is not one simple answer but many of them, depending on what you intend to do with the result:
- Paste *as text only* (provided your text editor/word processor provides that function) into an empty document, *while the current input method is Japanese* (i.e. kotoeri, ATOK, or whatever you use).
- Paste the "garbage" into a word processor, select all, and force a Japanese font onto the pseudo-roman text by holding down option, shift, or command (depending on the software you are using) while selecting a Japanese font from the font menu. NisusWriter and TexEdit Plus, for instance, do this; RagTime does not.
- Make the default output font a Japanese font in MacPerl before printing to stdout, and so on.
The bottom line is that, at least afaik, MacPerl doesn't handle text styles, i.e. font information. The SJIS encoding itself, however, is preserved: the bytes pass through the program untouched. (Note that RegEx gets a bit complicated under 5.2.0; there was a Japanized version out there somewhere, which could do Japanese RegEx.)
BTW, the "garbage" you sent us reads "b bu o mezasu anata no nabigêta" which of course isn't very meaningful (albeit not nonsensical), but only due to your arbitrary clipping of mojibake.
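To see why nothing is lost, here is a minimal sketch of the round trip. It assumes a modern Perl with the core Encode module (5.8 or later), which I would not expect MacPerl 5.2.0 to have, so treat it as an illustration rather than something to run on your OS 9 machine. The sample word ナビゲータ is my reading of the tail of your "garbage"; the variable names are made up for the example.

```perl
use strict;
use warnings;
use utf8;                          # this source file contains UTF-8 literals
use Encode qw(encode decode);

# Assumed sample text: the katakana word at the end of the quoted "garbage".
my $japanese = "ナビゲータ";

# What is actually stored in the archive file: Shift-JIS bytes.
my $sjis_bytes = encode("shiftjis", $japanese);

# What a roman font shows: the same bytes read as MacRoman characters.
my $mojibake = decode("MacRoman", $sjis_bytes);

# Going back: turn the roman characters into bytes again, then read as SJIS.
my $restored = decode("shiftjis", encode("MacRoman", $mojibake));

binmode(STDOUT, ":encoding(UTF-8)");
print "$mojibake\n";               # prints ÉiÉrÉQÅ[É^
print $restored eq $japanese ? "round trip ok\n" : "round trip failed\n";
```

The bytes are the same at every step; only the interpretation changes, which is exactly what applying a Japanese font does in the editor.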
Cheers, --
___ Peter Hartmann ________