Re: MIME.Message and RFC 2047 encodings

Chris Angelico Sun, 30 Oct 2016 05:05:02 -0700

On Sun, Oct 30, 2016 at 9:17 PM, Chris Angelico <ros...@gmail.com> wrote:
> On Sun, Oct 30, 2016 at 4:17 AM, Martin Karlgren <ma...@roxen.com> wrote:
>> Adding charset decoding to MIME.Message sounds good to me, perhaps with a 
>> flag to enable it on decoding?
>> (A compat problem I can think of is that applications may assume that 
>> decoded data is 8bit strings and fail to apply proper encoding before 
>> writing to file, causing an exception.)
>
> I agree about backward compat, and that's a bit problematic. So here's
> my thinking: MIME.UnicodeMessage will be a subclass of MIME.Message
> with the express goal of making everything use 21-bit strings. Any
> time it returns an eight-bit string, that is a bug to be fixed. So
> future incompatibility won't be a problem, as it's expressly
> documented that way; and past compatibility is fine, because
> MIME.Message itself isn't changing. Methods like
> MIME.Message()->get_filename, which currently do the decoding at that
> late point, can simply be overridden in UnicodeMessage.
>
> Does that seem like a reasonable API?


I've pushed a change to 8.1 that ought to be 100% backward compatible.
If there's a problem, I can revert it, but there shouldn't be. (Just
in case, it's not in 8.0.) The two notable features are:

1) MIME.UnicodeMessage, as described above
2) MIME.parse_headers() now takes an additional parameter 'unicode'.

Everything else should be invisible to most programs, and existing
code that uses neither of these two features can ignore them.
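
For anyone unsure what the wide-string concern raised earlier in the
thread looks like in practice, here is a minimal sketch. It does not
use the new MIME.UnicodeMessage class or the new 'unicode' parameter
at all. It only uses the long-standing
MIME.decode_words_text_remapped(), String.width() and
string_to_utf8(), and the sample header value is made up for
illustration.

```pike
int main()
{
  // Hypothetical RFC 2047 encoded-word: UTF-8, Q-encoding.
  string raw = "=?utf-8?q?=C5=81=C3=B3d=C5=BA?=";

  // Decode the encoded-word and remap its charset to Unicode.
  string decoded = MIME.decode_words_text_remapped(raw);

  // The result contains characters above 255, so it is no longer
  // an 8-bit string. String.width() reports the character width.
  write("width: %d\n", String.width(decoded));

  // Wide strings have to be encoded before they go to a file or
  // socket; skipping this step is the compat problem mentioned above.
  write("%s\n", string_to_utf8(decoded));
  return 0;
}
```

Code that was written against 8-bit results needs an explicit
encoding step like the string_to_utf8() call once it starts
receiving decoded Unicode strings, which is why the Unicode
behaviour lives in a separate class rather than in MIME.Message.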

ChrisA
