Thanks, David; that's great news. I'll update the book draft accordingly.
For the record, despite the issues, I was able to complete a fairly
full-featured email client GUI with the email package as it currently
is. This includes parsing and generating arbitrary attachments, as
well as encoding on sends and decoding on fetches for both text
payloads and I18N mail headers.

The package is still quite powerful as is. It does take a bit of
digging to figure out how to use its many tools, but the book will
probably help on this front, especially the upcoming edition's more
complete application. In other words, some of my concern may have
been a bit premature.

I hope that in the future we'll either strive for compatibility or
keep the current version around; it's a lot of very useful code. In
fact, I recommend that any new email package be named distinctly, and
that the current package be retained for a number of releases to
come. After all the breakages that 3.X introduced in general, doing
the same to any email-based code seems a bit too much, especially
given that the current package is largely functional as is. To me,
after having just used it extensively, fixing its few issues seems a
better approach than starting from scratch.

As for other issues, the things I found are described below my
signature. I don't know what the utf-8 issue is that you refer to;
I'm able to parse and send with this encoding as is without problems
(both payloads and headers), but I'm probably not using the
interfaces you fixed, and this may be the same as one of the items
listed.

Another thought: it might be useful to use the book's email client as
a sort of test case for the package; it's much more rigorous in the
new edition because it now has to be, given 3.X's Unicode model (it's
about 4,900 lines of code, though not all is email-related). I'd be
happy to donate the code as soon as I find out what the copyright
will be this time around; it will be at O'Reilly's site this Fall in
any event.
Thanks,
--Mark Lutz  (http://learning-python.com, http://rmi.net/~lutz)


Major issues I found...

------------------------------------------------------------------

1) Str required for parsing, but bytes returned from poplib

This is the initial decode from bytes to str of the full mail text.
In retrospect, it's probably not a major issue, since the original
email standards called for ASCII. An 8-bit encoding like Latin-1 is
probably sufficient for most conforming mails. For the book, I try a
set of different encodings, beginning with an optional configuration
module setting, then ascii, latin-1, and utf-8; this is probably
overkill, but a GUI has to be defensive.

----------------------------------------------------------------

2) Binary attachments encoding

This is the binary attachments byte-to-str issue that you've just
fixed. As I mentioned, I worked around this by passing in a custom
encoder that calls the original and runs an extra decode step. Here's
what my fix looked like in the book; your patch may do better, and at
a minimum I will add a note about the 3.1.3 and 3.2 fix for this:

def fix_encode_base64(msgobj):
    from email.encoders import encode_base64
    encode_base64(msgobj)              # what email does normally: leaves bytes
    bytes = msgobj.get_payload()       # bytes fails in email pkg on text gen
    text  = bytes.decode('ascii')      # decode to unicode str so text gen works
    ...plus line splitting logic omitted...
    msgobj.set_payload('\n'.join(lines))

>>> from email.mime.image import MIMEImage
>>> from mailtools.mailSender import fix_encode_base64   # use custom workaround
>>> bytes = open('monkeys.jpg', 'rb').read()
>>> m = MIMEImage(bytes, _encoder=fix_encode_base64)     # convert to ascii str
>>> print(m.as_string()[:500])

-------------------------------------------------------------------

3) Type-dependent text part encoding

There's a str/bytes confusion issue related to Unicode encodings in
text payload generation: some encodings require the payload to be
str, but others expect bytes.
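
To see which MIME body encoding the package selects for a given
charset, something like the following can be run. This is a minimal
sketch, assuming the default charset table in the standard library's
email.charset module; the helper name body_encoding_name is
hypothetical and is not part of the package or of the book's code:

```python
from email.charset import Charset, BASE64, QP

def body_encoding_name(encodingname):
    """Return a readable name for the MIME body encoding chosen for a charset.

    Hypothetical helper for illustration only; it just looks up the
    body_encoding attribute that email.charset.Charset exposes.
    """
    bodyenc = Charset(encodingname).body_encoding
    names = {None: 'none', QP: 'quoted-printable', BASE64: 'base64'}
    return names.get(bodyenc, 'other')

if __name__ == '__main__':
    # Print the body encoding the package associates with each charset
    for name in ('ascii', 'latin-1', 'utf-8'):
        print(name, '=>', body_encoding_name(name))
```

The value reported here is what determined, in the Python 3.1
behavior described in this message, whether a text payload had to be
supplied as str or as bytes.
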
Unfortunately, this means that clients need to know how the package
will react to the encoding that is used, and special-case based upon
that. For example, I needed to pass in str for ASCII and Latin-1 (the
former is unencoded and the latter gets QP MIME treatment), but had
to pass bytes for UTF-8 (which triggers Base64). That's less than
ideal for a client trying to attach arbitrary text parts generically
from filenames.

Here's the obscure workaround I came up with; the bodytext is str
when fetched from an edit window, but may also be loaded from an
attachment file. This may or may not have been reported, and it's
entirely possible that there's a better solution that I've missed.

def fix_text_required(encodingname):
    """
    4E: workaround for str/bytes combination errors in email package;
    MIMEText requires different types for different Unicode encodings
    in Python 3.1, due to the different ways it MIME-encodes some types
    of text; see Chapter 13; the only other alternative is using
    generic Message and repeating much code;
    """
    from email.charset import Charset, BASE64, QP

    charset = Charset(encodingname)   # how email knows what to do for encoding
    bodyenc = charset.body_encoding   # utf8, others require bytes input data
    return bodyenc in (None, QP)      # ascii, latin1, others require str

# on mail sends...
# email needs either str xor bytes specifically;

if fix_text_required(bodytextEncoding):
    if not isinstance(bodytext, str):
        bodytext = bodytext.decode(bodytextEncoding)
else:
    if not isinstance(bodytext, bytes):
        bodytext = bodytext.encode(bodytextEncoding)

# later
msg.set_payload(bodytext, charset=bodytextEncoding)
...or...
msg = MIMEText(bodytext, _charset=bodytextEncoding)
mainmsg.attach(msg)

# attachments
# build sub-Message of appropriate kind
maintype, subtype = contype.split('/', 1)
if maintype == 'text':                       # 4E: text needs encoding
    if fix_text_required(fileencode):        # requires str or bytes
        data = open(filename, 'r', encoding=fileencode)
    else:
        data = open(filename, 'rb')
    msg = MIMEText(data.read(), _subtype=subtype, _charset=fileencode)
    data.close()

-------------------------------------------------------------------

There are some additional cases that now require decoding per mail
headers today due to the str/bytes split, but these are just a normal
artifact of supporting Unicode character sets in general, and seem
like issues for package clients to resolve (e.g., the bytes returned
for decoded payloads in 3.X didn't play well with existing str-based
text processing code written for 2.X).

-------------------------------------------------------------------


-----Original Message-----
>From: "R. David Murray" <rdmur...@bitdance.com>
>Sent: Jun 4, 2010 12:39 PM
>To: l...@rmi.net
>Cc: email-sig@python.org
>Subject: email package status in 3.X
>
>On Mon May 10 20:02:46 CEST 2010 Mark Lutz wrote:
>> I'm probably going to have to go ahead and finish the book
>> with the email package as it is now, and include a lot of
>> caveats about the problems that a new version may fix in the
>> future. I can also post updated example code if/when possible.
>>
>> I realize everybody on this list probably knows this already,
>> but email in 3.X not only doesn't support the Unicode/bytes
>> dichotomy, it was also broken by it. Beyond the pre-parse
>> decode issue, its mail text generation really only works for
>> all-text mails. Generating text of an email with any sort of
>> binary part doesn't work at all now, because the base64 text
>> is still bytes, and the Generator expects str.
>> I've coded a
>> custom encoder to pass to MIMEImage that works around this
>> by decoding to ASCII, but it's not a great story to have to
>> tell the tens of thousands of readers of this book, many of
>> whom will be evaluating 3.X in general.
>
>This bug should now be fixed in both the py3k branch and the 3.1
>maint branch. This means the fix will be in 3.1.3, as well as 3.2a1.
>Hopefully that will be in time for your book, since 3.2a1 is due June
>27th and I'm guessing the 3.1.3 release will be some time not too far
>off that time frame as well. FYI I also fixed a related bug that made
>using utf-8 as a charset problematic. Unfortunately I suspect there
>may be some other charset issues waiting to be discovered.
>
>If you have come across any other bugs that don't already have
>issues in the tracker please file bug reports. Anything that
>can be fixed in the current package I will endeavor to fix
>before the next release. Feel free also to indicate bugs which
>should be given priority.
>
>--
>R. David Murray                                      www.bitdance.com

_______________________________________________
Email-SIG mailing list
Email-SIG@python.org
Your options: http://mail.python.org/mailman/options/email-sig/archive%40mail-archive.com