New submission from Leslie P. Polzer:

http://hg.python.org/cpython/file/3.3/Lib/smtpd.py#l289
currently decodes incoming bytes as UTF-8.

An SMTP server must not attempt to interpret characters beyond ASCII, however. Originally mail servers were not 8-bit clean, meaning they only guaranteed that the lower 7 bits of each octet would be preserved. Even then, however, they were not expected to choke on any input because of an attempt to decode it into a specific extended charset. Whenever a mail server does not need to interpret data (such as base64-encoded auth information), the data is simply left alone and passed through.

I do not know what led to the current state, but to correct this behavior and make it possible to support the 8BITMIME feature I suggest decoding received bytes as latin1, leaving it to the user to reinterpret the data as UTF-8 or whatever charset they need. Any other simple extended encoding could be used for this, but latin1 is the default in asynchat.

The documentation should also mention charset handling.

I'll be happy to submit a patch for both code and docs.

----------
components: Library (Lib)
messages: 203467
nosy: skypher
priority: normal
severity: normal
status: open
title: smtpd.py should not decode utf-8
type: enhancement
versions: Python 2.6, Python 2.7, Python 3.1, Python 3.2, Python 3.3

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue19662>
_______________________________________
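
To illustrate the difference between the two decoding strategies discussed in the report, here is a minimal sketch. It is not taken from smtpd.py; the sample payloads and variable names are made up for the example. It only shows that decoding as UTF-8 can fail on arbitrary 8-bit input, while latin1 decoding never fails and can be reversed to recover the original bytes.

```python
# Hypothetical payload: an 8-bit byte sequence that is not valid UTF-8.
raw = b"Subject: caf\xe9\r\n"

# Decoding as UTF-8 fails on the lone 0xE9 byte.
try:
    raw.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

# latin1 maps every byte 0x00-0xFF to the code point with the same value,
# so decoding never fails and encoding restores the original bytes.
text = raw.decode("latin1")
roundtrip = text.encode("latin1")

# A consumer that knows the real charset can reinterpret the data later.
utf8_payload = "caf\u00e9".encode("utf-8")
as_received = utf8_payload.decode("latin1")        # what the server would hand over
reinterpreted = as_received.encode("latin1").decode("utf-8")
```

Under these assumptions the server never has to guess a charset: it passes the octets through unchanged, and the choice of encoding is left to the code that consumes the message.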