New submission from Mark Sapiro <[email protected]>:
This is similar to https://bugs.python.org/issue32330 but is the opposite
behavior. In that issue, the message couldn't be flattened as a string but
could be flattened as bytes. Here, the message can be flattened as a string but
can't be flattened as bytes.
The original message was created by an arguably defective email client that
quoted a message containing a UTF-8-encoded RIGHT SINGLE QUOTATION MARK and
then UTF-8 encoded each of the three resulting bytes separately, producing
`â**` instead of `’`. That's not really relevant to the bug; it only shows how
such a message can be generated.
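As a rough illustration of that double encoding, here is a minimal sketch. It assumes the client treated each UTF-8 byte as a Latin-1 character before encoding again; the report does not say which single-byte charset was actually involved.

```python
# Sketch of the double encoding described above.
# Assumption: the client decoded the UTF-8 bytes as Latin-1 (not stated in the report).
original = "\u2019"                     # RIGHT SINGLE QUOTATION MARK
utf8_bytes = original.encode("utf-8")   # the three bytes E2 80 99
mangled = utf8_bytes.decode("latin-1")  # three separate characters, the first is U+00E2
double_encoded = mangled.encode("utf-8")

print(utf8_bytes)
print([hex(ord(ch)) for ch in mangled])
print(double_encoded)
```

The first of the three resulting characters is U+00E2, which is the character named in the `UnicodeEncodeError` in the session.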
The following interactive Python session shows the issue:
```
>>> import email
>>> msg = email.message_from_string("""From [email protected] Sat Jan 18 04:09:40 2020
... From: [email protected]
... To: [email protected]
... Subject: Century Dates for Insurance purposes
... Date: Fri, 17 Jan 2020 20:09:26 -0800
... Message-ID: <[email protected]>
... MIME-Version: 1.0
... Content-Type: text/plain; charset="utf-8"
... Content-Transfer-Encoding: 8bit
...
... Thursday-Monday will cover both days of staging and then storing goods
... post-century. I think thatâ**s the way to go.
...
... """)
>>> msg.as_bytes()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.7/email/message.py", line 178, in as_bytes
    g.flatten(self, unixfrom=unixfrom)
  File "/usr/local/lib/python3.7/email/generator.py", line 116, in flatten
    self._write(msg)
  File "/usr/local/lib/python3.7/email/generator.py", line 181, in _write
    self._dispatch(msg)
  File "/usr/local/lib/python3.7/email/generator.py", line 214, in _dispatch
    meth(msg)
  File "/usr/local/lib/python3.7/email/generator.py", line 432, in _handle_text
    super(BytesGenerator,self)._handle_text(msg)
  File "/usr/local/lib/python3.7/email/generator.py", line 249, in _handle_text
    self._write_lines(payload)
  File "/usr/local/lib/python3.7/email/generator.py", line 155, in _write_lines
    self.write(line)
  File "/usr/local/lib/python3.7/email/generator.py", line 406, in write
    self._fp.write(s.encode('ascii', 'surrogateescape'))
UnicodeEncodeError: 'ascii' codec can't encode character '\xe2' in position 33: ordinal not in range(128)
>>>
```
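The same contrast can be sketched as a standalone script. This is a reduced, hypothetical message, not the original one: the addresses, subject, and body are placeholders, and only the non-ASCII character U+00E2 is carried over. It also parses the same text from bytes for comparison, a step that is not part of the session above. The `flatten_as_bytes` helper is introduced here for illustration and catches the error so the script runs whether or not a given Python version raises it.

```python
import email

# Hypothetical reduced message; addresses and text are placeholders.
TEXT = (
    "From: sender@example.com\n"
    "To: list@example.com\n"
    "Subject: test\n"
    "MIME-Version: 1.0\n"
    'Content-Type: text/plain; charset="utf-8"\n'
    "Content-Transfer-Encoding: 8bit\n"
    "\n"
    "I think that\u00e2s the way to go.\n"
)


def flatten_as_bytes(msg):
    """Return (data, error): the flattened bytes, or the exception raised."""
    try:
        return msg.as_bytes(), None
    except UnicodeEncodeError as exc:
        return None, exc


# Parsed from a str: flattening to a string works.
from_str = email.message_from_string(TEXT)
as_text = from_str.as_string()
str_data, str_error = flatten_as_bytes(from_str)

# Parsed from bytes: non-ASCII bytes become surrogate escapes,
# which the bytes generator can write back out.
from_bytes = email.message_from_bytes(TEXT.encode("utf-8"))
bytes_data, bytes_error = flatten_as_bytes(from_bytes)

print("from str   ->", "error: %r" % str_error if str_error else "ok")
print("from bytes ->", "error: %r" % bytes_error if bytes_error else "ok")
```

On the 3.7 build used in the session, the first line would be expected to report the `UnicodeEncodeError`; the bytes-parsed message should flatten either way.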
----------
components: email
messages: 360249
nosy: barry, msapiro, r.david.murray
priority: normal
severity: normal
status: open
title: Email parser creates a message object that can't be flattened as bytes.
versions: Python 3.5, Python 3.6, Python 3.7
_______________________________________
Python tracker <[email protected]>
<https://bugs.python.org/issue39384>
_______________________________________