On Thursday, 14 February 2008, Nando wrote:
> Gentlemen, please consider the following ipython session:
>
>
> In [98]: m = email.message_from_file(f)
>
> In [99]: print m["subject"]
> =?utf-8?b?W291aS5jb20uYnJdIENhcnTDo28gZGUgY3LDqWRpdG8gdGVyw6EgbGVn?=
>         =?utf-8?b?aXNsYcOnw6NvIGVzcGVjw61maWNh?=
>
>
> It gives me the raw subject header value. Now of course I just wanted
> the header in unicode. So I have to do:
>
>
> In [100]: from email.header import decode_header
>
> In [101]: decode_header(m["subject"])
> Out[101]:
> [('[oui.com.br] Cart\xc3\xa3o de cr\xc3\xa9dito ter\xc3\xa1
> legisla\xc3\xa7\xc3\xa3o espec\xc3\xadfica',
>   'utf-8')]

Nando, you're just a lucky camper in that case. How would you handle a
mixture of, say, big5, euc_jp, koi8_r _and_ utf-8 encodings? Please don't
claim that this is unlikely. Sure it is, but nevertheless it happens,
and does your code get this pathological case right?
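
To make the pathological case concrete, here is a minimal sketch in current
Python 3, assuming every chunk is labelled with a charset Python knows and
contains valid bytes. The names encoded_word and decode_mixed are
hypothetical helpers for illustration, not part of the email package; only
decode_header is the real API.

```python
import base64
from email.header import decode_header


def encoded_word(text, charset):
    """Hypothetical helper: build one RFC 2047 base64 encoded-word."""
    payload = base64.b64encode(text.encode(charset)).decode("ascii")
    return "=?%s?b?%s?=" % (charset, payload)


def decode_mixed(raw):
    """Decode each chunk with its own declared charset and join the results."""
    parts = []
    for chunk, charset in decode_header(raw):
        if isinstance(chunk, bytes):
            # Unlabelled byte chunks are assumed to be plain ASCII here.
            chunk = chunk.decode(charset or "ascii")
        parts.append(chunk)
    return "".join(parts)


# One header value mixing four charsets. Whitespace between adjacent
# encoded-words is dropped on decoding, so the spaces live inside the text.
samples = [
    ("中文 ", "big5"),
    ("日本語 ", "euc_jp"),
    ("Привет ", "koi8_r"),
    ("Cartão", "utf-8"),
]
raw = " ".join(encoded_word(text, charset) for text, charset in samples)
print(decode_mixed(raw))
```

This only covers the well-behaved case; a wrong charset label or invalid
bytes would raise an exception here.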

Wait, let's normalize them - but how do we handle encoding failures?
Remember, there are way too many MUAs, mailing list managers, email
gateways, autoresponders, etc. out there that get this wrong!
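
One possible policy for such failures, sketched in current Python 3: try the
declared charset and, if the label is unknown or the bytes are invalid, fall
back to a fixed charset with replacement characters. The function name
decode_lenient and the latin-1 fallback are assumptions made for this
illustration; any such fallback is a guess and may silently produce the
wrong text.

```python
from email.header import decode_header


def decode_lenient(raw, fallback="latin-1"):
    """Decode a header value chunk by chunk, guessing when the label is wrong."""
    parts = []
    for chunk, charset in decode_header(raw):
        if isinstance(chunk, str):
            # Header without any encoded-words: already text.
            parts.append(chunk)
            continue
        try:
            parts.append(chunk.decode(charset or "ascii"))
        except (LookupError, UnicodeDecodeError):
            # LookupError: charset label unknown to Python.
            # UnicodeDecodeError: bytes not valid in the declared charset.
            parts.append(chunk.decode(fallback, errors="replace"))
    return "".join(parts)


# A bogus charset label, and a latin-1 byte mislabelled as utf-8.
print(decode_lenient("=?x-no-such-charset?q?caf=E9?="))
print(decode_lenient("=?utf-8?q?caf=E9?="))
```

Whether to guess, replace, or reject is exactly the kind of decision that
differs from one application to the next.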

Next you ask for email.Message to reparse email addresses to conform to RFC
2822, and voilà, you have created an unmanageable creature called Frankenstein.

If you think about the consequences, you will understand that Barry and
friends will do _everything_ to keep this can o' worms closed in this
context.

Pete
_______________________________________________
Email-SIG mailing list
[email protected]
Your options: 
http://mail.python.org/mailman/options/email-sig/archive%40mail-archive.com
