On December 2, 2002 at 10:57, Koichi Nakatani wrote:

> >>You're right, and current MHonArc cannot treat this correctly.
> >
> > Can someone provide me with a sample message that shows this problem?
> > I'd like to have it as a test case.
>
> I think messages from Mr. Ogawa are broken, and I cannot see any
> correct methods to handle incorrect messages.
>
> According to RFC 2047, you have to use encoded-words to embed
> non ASCII characters in message headers.
The message is legal. As you noted in a later message, it uses the
iso-2022-jp encoding. MHonArc's mail address detection does not
consider encoding; it works at the raw octet level.

The newer MHonArc::CharEnt in CVS and in the snapshot builds converts
iso-2022-jp to Unicode character entity references. When it is used,
the mailto linking works as expected. The From field:

  =?iso-2022-jp?B?GyRCPi5AbhsoQg==?= <[EMAIL PROTECTED]>

is converted to the following:

  小川<<a href="mailto:[EMAIL PROTECTED]"
  >[EMAIL PROTECTED]</a>>

  [line break added for readability]

I'm assuming the Unicode values are correct since I cannot read
Japanese.

Therefore, the question is, "Should MHonArc::CharEnt replace
iso2022jp.pl as the default converter for iso-2022-jp data?"
(I asked the question on the mhonarc-dev list, but since the number
of subscribers there is small, I'll ask on this list as well.)

MHonArc::CharEnt is written in pure Perl and does not depend on any
non-standard modules, so it should work under any version of Perl 5
(this matters mainly for versions older than 5.6.1).

I tested with Mozilla (via Galeon) on Linux and with Mozilla and IE 6
on Windows. All of these browsers load the proper font glyphs for
Unicode character entity references, provided the fonts are installed
(it took me some time to find fonts for Windows that I could install),
regardless of the actual document character set. I have not tested
text browsers like w3m or lynx.

--ewh
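
For anyone who wants to check the Unicode values independently of MHonArc, here is a minimal sketch in Python. It is not MHonArc::CharEnt's Perl code and makes no claim about how that module is implemented. It only illustrates the conversion described above: decode the RFC 2047 encoded-word from the From field, then write each non-ASCII character as a decimal character reference. The helper names `decode_encoded_word` and `to_char_refs` are made up for this example.

```python
from email.header import decode_header


def decode_encoded_word(word):
    """Decode an RFC 2047 encoded-word into a Unicode string."""
    parts = []
    for data, charset in decode_header(word):
        # decode_header yields raw bytes plus the declared charset
        # for encoded-words; plain segments may already be str.
        if isinstance(data, bytes):
            data = data.decode(charset or "ascii")
        parts.append(data)
    return "".join(parts)


def to_char_refs(text):
    """Replace each non-ASCII character with a decimal character reference."""
    return "".join(ch if ord(ch) < 128 else "&#%d;" % ord(ch) for ch in text)


# The encoded-word from the From field quoted in this message.
name = decode_encoded_word("=?iso-2022-jp?B?GyRCPi5AbhsoQg==?=")
print(to_char_refs(name))  # &#23567;&#24029;
```

The two references correspond to U+5C0F and U+5DDD, which together read "Ogawa". That agrees with the sender's name mentioned in the quoted text.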
