>>>>> "Mikhail" == Mikhail Zabaluev <[EMAIL PROTECTED]> writes:

    Mikhail> Hello, I'd like to see some I18n issues in Mailman to be
    Mikhail> addressed prior to the 2.1 release. Basically, it's some
    Mikhail> bugs or misfeatures related to transformation of
    Mikhail> MIME-encoded messages.

I am working actively with Barry now on Mailman's i18n issues. See my
recent patches in the archives.

    Mikhail> The most serious bug I see here is that messages encoded
    Mikhail> in base64 still get decorated with plaintext.

Headers or bodies? Are you talking about the footer tacked on to the
end of messages? If so, it would be simple with the new message
structure to make the footer a separate text part. Though, I don't
see how adding some plain text after the end of the boundary could
corrupt anything; could you put an example corrupted message up?

    Mikhail> No, wait -- there still is an implicit assumption that
    Mikhail> message bodies and the decoration text share the same
    Mikhail> character set. Thus the decorations should be recoded
    Mikhail> from what character set they are assumed to be in (ASCII?
    Mikhail> ISO8859-1? UTF-8? Selectable per list?) into the
    Mikhail> character set of the message.

I'll work on addressing this now that we have some code that actually
deals with character set issues.

    Mikhail> Another problem is encoded messages in archives. Heck,
    Mikhail> look at this list's archive to see what I'm talking
    Mikhail> about. Those should also be decoded and have character
    Mikhail> set converted to some uniform one. I'd suggest UTF-8, but
    Mikhail> many browsers and text viewers still don't grok this
    Mikhail> charset, so it'd better be selectable as well.

I talked with Barry about this today. My solution is to "guess" the
character set based on whichever is most common in the archives, and
use that as the charset specified in the HTML. For any messages with
multi-language subjects or bodies, the main language will be left in
the normal character set, and the multi-language parts will be encoded
with the UTF-8 HTML entity.
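
A rough sketch of the guess-and-fall-back idea described above, in
present-day Python. This is not Mailman's archiver code: the function
names guess_archive_charset and render_for_archive are made up for
illustration, and it assumes each message body has already been decoded
to a text string and that "HTML entity" means a numeric character
reference of the form &#NNNN;.

```python
import html
from collections import Counter


def guess_archive_charset(charsets, default="us-ascii"):
    """Return the charset name that occurs most often in the archive.

    `charsets` is an iterable of charset names, one per archived
    message; None or empty entries (no declared charset) are skipped.
    """
    counts = Counter(name.lower() for name in charsets if name)
    if not counts:
        return default
    return counts.most_common(1)[0][0]


def render_for_archive(text, page_charset):
    """Encode already-decoded text for an HTML page in `page_charset`.

    Markup characters are escaped first. Any character the page charset
    cannot represent is then written as a numeric character reference
    by the standard "xmlcharrefreplace" error handler.
    """
    escaped = html.escape(text, quote=False)
    return escaped.encode(page_charset, errors="xmlcharrefreplace")


# Hypothetical archive: mostly Latin-1, one KOI8-R message, one message
# with no declared charset.
charsets = ["iso-8859-1", "koi8-r", "ISO-8859-1", None, "iso-8859-1"]
page_charset = guess_archive_charset(charsets)

# The German text stays in Latin-1; the Russian word becomes references.
body = render_for_archive("Grüße, Привет", page_charset)
```

Note that the text has to be decoded from its original charset before
any of this can happen, so a codec for every charset that shows up in
the archive is still needed.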
This will require Python unicode codecs for all our languages, which
do not exist for KOI-8, Big5, or GB, as far as I know.

Ben
--
Brought to you by the letters B and S and the number 14.
"It is sad. *Campers* cannot *dance*. Not even a *party*."
Debian GNU/Linux maintainer of Gimp and Nethack -- http://www.debian.org/

_______________________________________________
Mailman-Developers mailing list
[EMAIL PROTECTED]
http://mail.python.org/mailman/listinfo/mailman-developers