[issue25545] email parsing docs: clarify that only ASCII strings are supported

immerrr again Thu, 25 Apr 2019 07:07:39 -0700

immerrr again <imme...@gmail.com> added the comment:

Hi everyone,


This is my first time using this bug tracker, so apologies in advance if I
manage to break something on the first try.

Not sure if it's the right place to report this, but I have the following repro 
that involves email.message_from_bytes:

In [128]: import email
     ...: msg_bytes = (
     ...:     b'MIME-Version: 1.0\r\n'
     ...:     b'Content-Type: text/plain;\r\n'
     ...:     b' charset=utf-8\r\n'
     ...:     b'Content-Transfer-Encoding: 8bit\r\n'
     ...:     b'Content-Disposition: attachment;\r\n'
     ...:     b' filename="camper_store.csv"\r\n\r\n'
     ...: ) + 'Beyoğlu-İst'.encode('utf8')
     ...: email.message_from_bytes(msg_bytes).get_payload(decode=True)
Out[128]: b'Beyo\xc4\x9flu-\xc4\xb0st'

I have read this report and some earlier ones, where it was clearly explained
that message_from_string has its limitations and that message_from_bytes should
be used for better results. If I'm not mistaken, my repro has everything set up
correctly: CTE=8bit and a body encoded in utf8, which is explicitly indicated
as the content charset. Yet the result is still encoded with
'raw-unicode-escape'.
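
A quick way to check what those bytes actually are is sketched below. It is only a sketch under some assumptions: the variable names (raw, charset, text, modern, text2) are made up for illustration, and the second half assumes that parsing with policy=email.policy.default and calling get_content() is an acceptable alternative for this use case. With decode=True, get_payload only undoes the Content-Transfer-Encoding and returns bytes, so the charset still has to be applied separately to get a str.

```python
import email
from email import policy

# Same message as in the repro above.
msg_bytes = (
    b'MIME-Version: 1.0\r\n'
    b'Content-Type: text/plain;\r\n'
    b' charset=utf-8\r\n'
    b'Content-Transfer-Encoding: 8bit\r\n'
    b'Content-Disposition: attachment;\r\n'
    b' filename="camper_store.csv"\r\n\r\n'
) + 'Beyoğlu-İst'.encode('utf8')

# Default (compat32) policy: get_payload(decode=True) returns bytes.
msg = email.message_from_bytes(msg_bytes)
raw = msg.get_payload(decode=True)

# Apply the declared charset by hand to turn the bytes into text.
charset = msg.get_content_charset()
text = raw.decode(charset)

# Alternative (an assumption about what is wanted here): the modern policy
# yields an EmailMessage whose get_content() decodes text parts to str.
modern = email.message_from_bytes(msg_bytes, policy=policy.default)
text2 = modern.get_content()

print(raw)
print(text)
print(text2)
```

If the bytes in raw compare equal to 'Beyoğlu-İst'.encode('utf8'), then the payload is the original UTF-8 data rather than a 'raw-unicode-escape' encoding, and the escapes in Out[128] are just the repr of a bytes object.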

Is there something wrong with the input or is it a bug?

Thanks!

----------
nosy: +immerrr again

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue25545>
_______________________________________