Nando wrote:
> Gentlemen, please consider the following ipython session:
>
>
> In [98]: m = email.message_from_file(f)
>
> In [99]: print m["subject"]
> =?utf-8?b?W291aS5jb20uYnJdIENhcnTDo28gZGUgY3LDqWRpdG8gdGVyw6EgbGVn?=
> =?utf-8?b?aXNsYcOnw6NvIGVzcGVjw61maWNh?=
>
>
> It gives me the raw subject header value. Now of course I just wanted
> the header in unicode. So I have to do:
>
>
> In [100]: from email.header import decode_header
>
> In [101]: decode_header(m["subject"])
> Out[101]:
> [('[oui.com.br] Cart\xc3\xa3o de cr\xc3\xa9dito ter\xc3\xa1
> legisla\xc3\xa7\xc3\xa3o espec\xc3\xadfica',
> 'utf-8')]
>
> In [102]: print decode_header(m["subject"])[0][0]
> [oui.com.br] Cartão de crédito terá legislação específica
>
>
> My questions are:
> 1) Why doesn't it currently return the *decoded* header?
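Because you often need access to the raw header. Also, not all headers are
encoded the same way. While what you have works for Subject:, it doesn't work
for address headers such as To:, Reply-To: and From:, which have to be split
into individual addresses before any encoded display names are decoded.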
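To illustrate the address-header case, here is a minimal sketch, written for
Python 3 rather than the Python 2 session quoted above. It parses the header
into (name, address) pairs first and only then decodes each display name. The
helper names decode_words and decode_address_header and the example.org
addresses are made up for illustration; they are not part of the email package.

```python
from email.header import decode_header, make_header
from email.utils import getaddresses


def decode_words(value):
    """Decode any RFC 2047 encoded words in value and return text."""
    return str(make_header(decode_header(value)))


def decode_address_header(value):
    """Split an address header first, then decode each display name."""
    return [(decode_words(name), addr) for name, addr in getaddresses([value])]


# Hypothetical To: header with one encoded and one plain display name.
raw_to = "=?utf-8?b?Sm/Do28=?= <joao@example.org>, Ana <ana@example.org>"
for name, addr in decode_address_header(raw_to):
    print(name, addr)
```

Parsing before decoding matters because a decoded display name may itself
contain characters such as commas that the address parser would otherwise
treat as separators.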
> 2) Would it break too many apps if we changed it?
Yes. Particularly apps that need to log or report broken email headers that
cannot be decoded.
> 2.1) If it would, can we add a function such as
> message.getheader("subject") for this?
> 2.1.1) Would you like me to propose a patch with the obvious implementation?
I'd love to see things become more Unicode aware.
Perhaps return an object implementing __str__() and __unicode__() (or
decode()). The cast-to-unicode conversion would decode headers with known
encodings and raise an exception on headers with unknown encodings.
Similarly, setting headers using Unicode strings would use the known
encodings to perform the reverse operation. And you still have access to the
raw value if you want to round trip.
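A rough sketch of what such an object might look like, assuming Python 3,
where str is already Unicode and __str__() plays the role that __unicode__()
would in Python 2. The class name HeaderValue, its raw attribute and the
from_text() constructor are hypothetical and only illustrate the idea; nothing
like this exists in the email package under these names.

```python
from email.header import Header, decode_header


class HeaderValue:
    """Hypothetical wrapper: keeps the raw header, decodes on str()."""

    def __init__(self, raw):
        # Untouched header value, kept so callers can round trip or log it.
        self.raw = raw

    @classmethod
    def from_text(cls, text, charset="utf-8"):
        """Build the raw (encoded) form from a Unicode string."""
        return cls(Header(text, charset).encode())

    def __str__(self):
        parts = []
        for chunk, charset in decode_header(self.raw):
            if isinstance(chunk, bytes):
                # Raises LookupError for an unknown charset and
                # UnicodeDecodeError for bytes invalid in that charset.
                chunk = chunk.decode(charset or "ascii")
            parts.append(chunk)
        return "".join(parts)


subject = HeaderValue("=?utf-8?b?Sm/Do28=?=")
print(str(subject))   # decoded text
print(subject.raw)    # raw value, still available

try:
    str(HeaderValue("=?x-unknown-charset?b?Sm/Do28=?="))
except LookupError as exc:
    print("cannot decode:", exc)
```

An application that needs to log or report broken headers would catch the
exception and fall back to the raw attribute.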
--
Stuart Bishop <[EMAIL PROTECTED]>
http://www.stuartbishop.net/
