Mark,

> I realize everybody on this list probably knows this already,
> but email in 3.X not only doesn't support the Unicode/bytes
> dichotomy, it was also broken by it.

Yes, it's a shame that it has worked out that way. I think it's because email is an almost uniquely hard problem when you try to make a sharp distinction between text and bytes.

When you receive an email, what have you got? It's supposed to be ASCII, but of course it often isn't. What character set should you assume that those eight-bit characters are in? The program that's using the module probably does want to try to guess, since it probably wants to make as much sense as possible out of an incorrectly formed email. The same goes for mis-specified encodings, both in headers and in MIME parts.

So you probably need to provide multiple ways of getting at headers and the MIME parts that claim to be text. You'll want to be able to get at the original data (probably as bytes for safety) and the text version if one can be created. And so forth.

Happily passing eight-bit strings around, on the assumption that the user would make the correct sense of them, mapped onto email really well. Trying to make a strict distinction between bytes and text turns out to be a bit of a mess in this context.

But you probably already knew all that as well.

Regards,
Matt

_______________________________________________
Email-SIG mailing list
Email-SIG@python.org
Your options: http://mail.python.org/mailman/options/email-sig/archive%40mail-archive.com
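
To make the "original bytes plus a text version if one can be created" idea concrete, here is a minimal sketch in Python 3. It is only an illustration under stated assumptions: `DecodedValue` and `decode_best_effort` are hypothetical names invented for this example and are not part of the email package, and the fallback charsets are an arbitrary choice rather than a recommendation.

```python
from typing import NamedTuple, Optional, Sequence


class DecodedValue(NamedTuple):
    """Hypothetical container: the raw octets plus a best-effort text view."""
    raw: bytes                # original data, always preserved untouched
    text: Optional[str]       # decoded text, or None if nothing worked
    charset: Optional[str]    # the charset that produced `text`, if any


def decode_best_effort(
    raw: bytes,
    declared: Optional[str] = None,
    fallbacks: Sequence[str] = ("utf-8", "latin-1"),
) -> DecodedValue:
    """Try the declared charset first, then each fallback in order.

    Assumption: guessing is acceptable to the caller. Note that
    latin-1 maps every byte to a character, so with the default
    fallbacks some text is always produced, even if it is wrong.
    """
    candidates = ([declared] if declared else []) + list(fallbacks)
    for charset in candidates:
        try:
            return DecodedValue(raw, raw.decode(charset), charset)
        except (UnicodeDecodeError, LookupError):
            # UnicodeDecodeError: the bytes are not valid in this charset.
            # LookupError: the charset name itself is unknown (mis-specified).
            continue
    return DecodedValue(raw, None, None)


# A header value with an undeclared eight-bit character.
value = decode_best_effort(b"caf\xe9")
print(value.raw, value.text, value.charset)
```

The caller always keeps `raw`, so a wrong guess in `text` never destroys the original data; passing `fallbacks=()` turns guessing off and leaves `text` as `None` when the declared charset fails.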