Dear audience,

coming back to this list, I couldn't believe my eyes at how low the volume is. After rechecking with the archives, I have to accept it: it really is that quiet here, a bit too quiet from my POV. Hmm.
Well, I'm in the course of replacing a special-purpose postfix email filter that dates back to 2004 with a redeveloped Python 3 version right now. Basically all it has been doing (in pseudo code) is:

    msg = email.message_from_file(fp)
    processing(msg)
    write(msg.as_string(True))

for a few hundred million mails over that time. I have now replaced it with:

    msg = email.message_from_binary_file(fp, policy = email.policy.SMTP)
    processing(msg)
    BytesGenerator(pipe).flatten(msg)

Here, processing mostly saves bodies and attachments, depending on pattern matches, and adds some headers.

I was quite astonished to find out that this procedure isn't working that well anymore: the email module appears far more fragile in its current state. This is a bit disappointing, as the docs convey that some effort was put into reliability and robustness. Given the much improved unicode handling of Python 3 itself and the ever-growing experience in handling emails, this is contrary to my expectations, I have to confess.

Minutes after switching to the new code, I stumbled across a traceback in msg.get_all('to') from a header like this:

    To: unlisted-recipients: ;, ""@pop.kundenserver.de (no To-header on input)

Hmm, not nice.

http://bugs.python.org/issue27257

Next, I was surprised to see arbitrary header data appear in the body of some mails in my MUA. I tracked this down to a mangled header that has lost its proper indentation:

    X-Microsoft-Exchange-Diagnostics:
    =?utf-8?B?MTtCTDJQUjAyTUI1MTQ7MjM6bEtRRlNaUHQvVTk5WCttdktlOUVrUGQvVFBH?=
    =?utf-8?B?cDFJemVUeXFzOGNzYnZOYWlwMDZpR0YzbXZyY09WaTBKM2pkeUl4S1VDMkxw?=
    =?utf-8?B?eVRkNWthRW9waUhJTzczTWd5WDZOQ3hMNU1haGFvQTVzVTdRZmxJUnZlblpW?=
    ...

versus:

    X-Microsoft-Exchange-Diagnostics:
        1;BL2PR02MB514;23:lKQFSZPt/U99X+mvKe9EkPd/TPG
        p1IzeTyqs8csbvNaip06iGF3mvrcOVi0J3jdyIxKUC2Lp
        yTd5kaEopiHIO73MgyX6NCxL5MahaoA5sU7QflIRvenZV

Oh, well.
http://bugs.python.org/issue27256

Before I had added some code to work around those occurrences, I stumbled across a traceback in flatten:

http://bugs.python.org/issue27258

All these issues were harvested in less than half an hour.

What really troubles me is the quietness around here in the light of this experience. Don't people use Python (3) yet/anymore for these kinds of tasks? Does anybody care? Am I missing something?

I will do my best to dive into these issues in the next days/weeks, but would appreciate a dialog with somebody who is already involved in the email module code.

Thanks,
Pete
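
P.S. For anyone who wants to reproduce this, here is a minimal, self-contained sketch of the shape of the new filter. It is not the production code: add_marker_header is a hypothetical stand-in for the real processing step, the X-Filter-Checked header and the sample message are made up, and it reads from bytes via email.message_from_bytes instead of a file object so that the input is still at hand afterwards. Returning the input unchanged when parsing or flattening raises is only an assumption about what a reasonable workaround could look like.

```python
import email
import email.policy
import io
from email.generator import BytesGenerator


def add_marker_header(msg):
    # Hypothetical stand-in for the real processing step
    # (saving bodies and attachments, adding headers).
    msg["X-Filter-Checked"] = "yes"


def filter_message(raw, process=add_marker_header):
    """Return the filtered message as bytes, or the input unchanged on failure."""
    try:
        msg = email.message_from_bytes(raw, policy=email.policy.SMTP)
        process(msg)
        out = io.BytesIO()
        # Without an explicit policy, the generator uses the message's policy.
        BytesGenerator(out).flatten(msg)
        return out.getvalue()
    except Exception:
        # Assumption: passing a mail through untouched beats losing it.
        return raw


# Made-up sample message, CRLF line endings as on the wire.
RAW = (
    b"From: a@example.com\r\n"
    b"To: b@example.com\r\n"
    b"Subject: test\r\n"
    b"\r\n"
    b"Hello\r\n"
)

if __name__ == "__main__":
    print(filter_message(RAW).decode("ascii"))
```

The broad except is deliberate in this sketch: for a filter sitting in the delivery path, any exception in the email module would otherwise bounce or drop the mail. A real filter would at least log the traceback before falling back.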