New submission from Dieter Maurer <die...@handshake.de>:

In the transcript below, "ms" and "mb" should be equivalent:

>>> from email import message_from_string, message_from_bytes
>>> mt = """\
... Mime-Version: 1.0
... Content-Type: text/plain; charset=UTF-8
... Content-Transfer-Encoding: 8bit
...
... ä
... """
>>> ms = message_from_string(mt)
>>> mb = message_from_bytes(mt.encode("UTF-8"))

But "mb.as_bytes" succeeds while "ms.as_bytes" raises a "UnicodeEncodeError":

>>> mb.as_bytes()
b'Mime-Version: 1.0\nContent-Type: text/plain; charset=UTF-8\nContent-Transfer-Encoding: 8bit\n\n\xc3\xa4\n'
>>> ms.as_bytes()
Traceback (most recent call last):
  ...
  File "/usr/local/lib/python3.9/email/generator.py", line 155, in _write_lines
    self.write(line)
  File "/usr/local/lib/python3.9/email/generator.py", line 406, in write
    self._fp.write(s.encode('ascii', 'surrogateescape'))
UnicodeEncodeError: 'ascii' codec can't encode character '\xe4' in position 0: ordinal not in range(128)

Apparently, the "as_bytes" ignores the "charset" parameter from the "Content-Type" header (it should use "utf-8", not "ascii" for the encoding).

----------
components: email
messages: 373711
nosy: barry, dmaurer, r.david.murray
priority: normal
severity: normal
status: open
title: "email.message.Message.as_bytes": fails to correctly handle "charset"
type: behavior
versions: Python 3.9

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue41307>
_______________________________________
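
The discrepancy reported above can be reproduced as a standalone script, together with one possible workaround. This is only a sketch: the helper name "as_bytes_via_bytes_parser" and its "charset" argument are hypothetical and not part of the email package, and the workaround assumes the caller already knows the charset of the text. The failing call is wrapped in try/except because the report only confirms the error for Python 3.9.

```python
from email import message_from_bytes, message_from_string

# Same message as in the transcript; "\u00e4" is the non-ASCII body character.
RAW_TEXT = (
    "Mime-Version: 1.0\n"
    "Content-Type: text/plain; charset=UTF-8\n"
    "Content-Transfer-Encoding: 8bit\n"
    "\n"
    "\u00e4\n"
)


def as_bytes_via_bytes_parser(text, charset="utf-8"):
    """Hypothetical helper: encode the text first, parse it as bytes,
    then serialize. This mirrors the "mb" path from the report."""
    return message_from_bytes(text.encode(charset)).as_bytes()


roundtrip = as_bytes_via_bytes_parser(RAW_TEXT)
print(roundtrip)

# The "ms" path: parsing the str directly. Reported to raise on Python 3.9;
# the error is caught rather than assumed for every version.
try:
    message_from_string(RAW_TEXT).as_bytes()
except UnicodeEncodeError as exc:
    print("as_bytes() on the str-parsed message failed:", exc)
```

The helper does not read the "charset" parameter from the "Content-Type" header; it simply sidesteps the str-based parse, so it is a workaround for callers and not a fix for the behaviour described in the report.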