Egmont Koblinger
Wed, 01 Dec 2004 16:01:33 -0800
This mail is an automated notification from the bugs tracker of the project: MHonArc.

/**************************************************************************/
[bugs #11187] Full Item Snapshot:

URL: <http://savannah.nongnu.org/bugs/?func=detailitem&item_id=11187>
Project: MHonArc
Submitted by: Egmont Koblinger
On: Thu 12/02/04 at 00:04

Category: Character Sets
Severity: 5 - Average
Item Group: Incorrect Behavior
Resolution: None
Privacy: Public
Assigned to: None
Status: Open
Platform Version: Linux
Perl Version: 5.8.5
Component Version: 2.6.10
Fixed Release:

Summary: incorrectly parsing UTF-8 encoded messages

Original Submission:

I use mhonarc without any configuration file, simply with the command "mhonarc -outdir outdir indir", where "indir" contains only one file with a single message encoded in UTF-8. (Both the subject and the body contain UTF-8 encoded accented letters; the subject uses quoted-printable, and the body's transfer encoding is 8-bit.)

The output HTML files are quite strange. For each UTF-8 byte sequence, only the first byte is taken into account, and it is converted to an HTML escape. For example, the Euro sign (U+20AC, UTF-8: E2 82 AC) appears in the HTML output as "&#E2;"; the bytes 82 and AC are then skipped, and processing continues with the next Unicode character.

In MHonArc/CharEnt.pm, line 153, there is a switch that checks whether perl is new enough to support UTF-8. If it is not, UTF-8 characters are processed manually. Forcing the "non-UTF-8-aware perl" branch of the "if" statement (that is, changing "if ($] >= 5.006)" to "if (0)") repairs the problem: in this case the output is the expected "AC;".

I don't think it matters, but I have LANG=hu_HU (latin2 locale) and no other LC_* variables set. However, UTF-8 locales are also available on my system.
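To illustrate the manual decoding that the report describes as working, here is a minimal sketch in Python. It is not the code in CharEnt.pm: the function name `utf8_to_char_refs` is hypothetical, and the hexadecimal "&#x...;" output form is an assumption chosen for the illustration. It shows how the lead byte and its continuation bytes combine into one code point, so that E2 82 AC yields U+20AC rather than a reference built from E2 alone.

```python
def utf8_to_char_refs(data: bytes) -> str:
    """Decode UTF-8 by hand; emit hex character references for non-ASCII.

    Illustrative only: the name and the "&#x...;" output form are
    assumptions, not MHonArc's actual behavior. Overlong encodings and
    surrogates are not rejected, to keep the sketch short.
    """
    out = []
    i = 0
    while i < len(data):
        lead = data[i]
        if lead < 0x80:
            # Plain ASCII byte: copy through unchanged.
            out.append(chr(lead))
            i += 1
            continue
        # The lead byte announces how many continuation bytes follow
        # and carries the topmost bits of the code point.
        if lead >> 5 == 0b110:
            n, cp = 1, lead & 0x1F
        elif lead >> 4 == 0b1110:
            n, cp = 2, lead & 0x0F
        elif lead >> 3 == 0b11110:
            n, cp = 3, lead & 0x07
        else:
            raise ValueError(f"invalid lead byte 0x{lead:02X} at offset {i}")
        tail = data[i + 1 : i + 1 + n]
        if len(tail) != n or any(b >> 6 != 0b10 for b in tail):
            raise ValueError(f"truncated or invalid sequence at offset {i}")
        # Each continuation byte contributes its low six bits.
        for b in tail:
            cp = (cp << 6) | (b & 0x3F)
        out.append(f"&#x{cp:X};")
        i += 1 + n  # consume the whole sequence, not just the lead byte
    return "".join(out)


if __name__ == "__main__":
    # Euro sign from the report: bytes E2 82 AC form the single code point U+20AC.
    print(utf8_to_char_refs(b"price: \xe2\x82\xac"))
```

The symptom in the report corresponds to building the reference from the lead byte alone and discarding the continuation bytes instead of folding their bits into the code point.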
For detailed info, follow this link: <http://savannah.nongnu.org/bugs/?func=detailitem&item_id=11187>

_______________________________________________
Message sent via/by Savannah http://savannah.nongnu.org/