On Sun, Oct 30, 2016 at 9:17 PM, Chris Angelico <ros...@gmail.com> wrote:
> On Sun, Oct 30, 2016 at 4:17 AM, Martin Karlgren <ma...@roxen.com> wrote:
>> Adding charset decoding to MIME.Message sounds good to me, perhaps with a
>> flag to enable it on decoding?
>> (A compat problem I can think of is that applications may assume that
>> decoded data is 8bit strings and fail to apply proper encoding before
>> writing to file, causing an exception.)
>
> I agree about backward compat, and that's a bit problematic. So here's
> my thinking: MIME.UnicodeMessage will be a subclass of MIME.Message
> with the express goal of making everything use 21-bit strings. Any
> time it returns an eight-bit string, that is a bug to be fixed. So
> future incompatibility won't be a problem, as it's expressly
> documented that way; and past compatibility is fine, because
> MIME.Message itself isn't changing. Methods like
> MIME.Message()->get_filename, which currently do the decoding at that
> late point, can simply be overridden in UnicodeMessage.
>
> Does that seem like a reasonable API?
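For anyone who wants the shape of that proposal spelled out, here is a rough sketch of the subclass-override idea. It is written in Python rather than Pike, purely as an analogy: the classes `Message` and `UnicodeMessage` and the accessor `get_subject` are hypothetical stand-ins, not the Pike MIME module's API. The only real library call is Python's `email.header.decode_header`. The base class keeps returning the raw value, so existing callers see no change, while the subclass overrides the accessor to return decoded text.

```python
from email.header import decode_header


class Message:
    """Hypothetical stand-in for the unchanged base class: values stay raw."""

    def __init__(self, headers):
        # headers: dict of lower-cased header names to raw header strings
        self.headers = dict(headers)

    def get_subject(self):
        # Returns the raw value, RFC 2047 encoded-words and all.
        return self.headers.get("subject", "")


class UnicodeMessage(Message):
    """Hypothetical subclass: same data, but the accessor returns decoded text."""

    def get_subject(self):
        raw = super().get_subject()
        parts = []
        for chunk, charset in decode_header(raw):
            # decode_header yields bytes for encoded-words, str for plain text
            if isinstance(chunk, bytes):
                chunk = chunk.decode(charset or "ascii")
            parts.append(chunk)
        return "".join(parts)


headers = {"subject": "=?utf-8?q?caf=C3=A9?="}
old = Message(headers)
new = UnicodeMessage(headers)

# The compat concern from the quoted text: a caller that used to treat the
# result as 8-bit data now has to encode explicitly before writing it out.
encoded = new.get_subject().encode("utf-8")
```

Code that keeps using the base class is unaffected. Only callers that opt in to the subclass need to add the explicit encode step before writing to a byte-oriented sink.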
I've pushed a change to 8.1 that ought to be 100% backward compatible.
If there's a problem, I can revert it, but there shouldn't be. (Just
in case, it's not in 8.0.) The two notable features are:

1) MIME.UnicodeMessage, as described above
2) MIME.parse_headers() now takes an additional parameter 'unicode'.

Everything else should be completely invisible to most programs, and
both of these can be ignored.

ChrisA