On Oct 8, 2009, at 6:39 PM, Glenn Linderman wrote:

1) wire format. Either what came in, in the parser case, or what would be generated.
2) internal headers from the MIME part
3) decoded BLOB. This means that quopri and base64 are decoded, no more and no less. This is bytes. No headers, only payload. For Content-Transfer-Encoding: binary, this is mostly a noop.
4) text/* parts should also be obtainable as str()/unicode(), payload only. This is where charset decoding is done.
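
For concreteness, here is a rough sketch of the four views listed above, written against the present-day Python 3 stdlib email package (an assumption; that API postdates this thread). The sample message and the variable names are made up for illustration.

```python
from email import message_from_bytes

# Hypothetical sample part: UTF-8 text, base64 transfer encoding.
WIRE = (
    b"Content-Type: text/plain; charset=utf-8\r\n"
    b"Content-Transfer-Encoding: base64\r\n"
    b"\r\n"
    b"Y2Fmw6kK\r\n"
)

part = message_from_bytes(WIRE)

# 1) Wire format: what would be generated for this part.
wire = part.as_bytes()

# 2) The part's own MIME headers, as a list of (name, value) pairs.
headers = part.items()

# 3) Decoded BLOB: base64/quopri undone, nothing else. Still bytes.
blob = part.get_payload(decode=True)

# 4) Text view: charset decoding applied on top of the BLOB.
text = blob.decode(part.get_content_charset() or "us-ascii")
```

Only step 4 turns bytes into str; steps 1 and 3 never look at the charset parameter.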

I think your talk in the next paragraph about hooks and other object types being produced is a generalization of 4, not 3, and generally no additional decoding needs to be done, just conversion to the right object type (or file, or file-like object).

I mostly agree with that. I've always called #4 the "decoded payload" and #3 I've usually called the "raw payload". Maybe we can bikeshed on better terms to help inform us about the API's method/attribute names.

Which brings up another point: right now Message objects have a single .get_payload() method that takes a flag to indicate whether it should be the decoded or raw payload. That's wrong. These should be different interfaces.
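
To illustrate the flag problem, a small sketch using the stdlib Message API as it exists in Python 3. The two accessor functions, payload_bytes() and payload_text(), are hypothetical names invented here, not anything the email package provides.

```python
from email import message_from_string

# Hypothetical sample part: Latin-1 text, quoted-printable transfer encoding.
part = message_from_string(
    "Content-Type: text/plain; charset=iso-8859-1\n"
    "Content-Transfer-Encoding: quoted-printable\n"
    "\n"
    "na=EFve\n"
)

# One method, one flag, two different return types.
as_stored = part.get_payload()                   # str, still quoted-printable
transfer_decoded = part.get_payload(decode=True)  # bytes, transfer encoding undone


# Hypothetical split into two single-purpose accessors.
def payload_bytes(part):
    """Payload with the Content-Transfer-Encoding undone; always bytes."""
    return part.get_payload(decode=True)


def payload_text(part):
    """Payload of a text/* part as str, with charset decoding applied."""
    if part.get_content_maintype() != "text":
        raise TypeError("payload_text() only applies to text/* parts")
    charset = part.get_content_charset() or "us-ascii"
    return payload_bytes(part).decode(charset)
```

With separate accessors, each one has a single return type, and asking for text on a non-text part fails loudly instead of returning something unexpected.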

The problem is that if the bytes came off the wire, the parser currently can only attach the most basic MIME base class. It doesn't know that an image/png should create a MIMEImagePNG instance there. This is different from hacking the model directly because the application can instantiate the right class. So the parser either has to have a hookable way for an application to go from content-type to class, or the generic MIME base class needs to be hookable in its .decode() method.

So either the email package can stop at 3, and 4 only for text/* parts, or it could learn more types (registered types, with well-defined corresponding objects could be potentially built-in to the email package), and/or it could become hookable for application types. Of course, for disposition to files, storing the BLOB in a file of the right name is adequate... to avoid the file, I agree that converting to a useful object type is handy. But maybe file-like objects would suffice, for most of the types.

My own preference here is that email does support #4 with a registration system to handle returning concrete payload objects based on the Content-Type.
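
One possible shape for such a registration system, as a minimal sketch only. The registry, the register() decorator, and decoded_payload() are all hypothetical names; the only real API used is the stdlib Message object's get_payload(), get_content_type(), get_content_maintype(), and get_content_charset().

```python
import json
from email import message_from_string

# Hypothetical registry: Content-Type -> callable(blob, part) -> object.
_CONVERTERS = {}


def register(content_type):
    """Decorator registering a converter for a type such as 'text/*'."""
    def decorator(func):
        _CONVERTERS[content_type.lower()] = func
        return func
    return decorator


def decoded_payload(part):
    """Return a concrete object for a non-multipart part, else the BLOB."""
    blob = part.get_payload(decode=True)
    exact = part.get_content_type()              # e.g. 'application/json'
    family = part.get_content_maintype() + "/*"  # e.g. 'application/*'
    converter = _CONVERTERS.get(exact) or _CONVERTERS.get(family)
    return converter(blob, part) if converter else blob


@register("text/*")
def _to_str(blob, part):
    return blob.decode(part.get_content_charset() or "us-ascii")


@register("application/json")
def _to_json(blob, part):
    return json.loads(blob.decode("utf-8"))


# Usage with hypothetical sample parts.
json_part = message_from_string('Content-Type: application/json\n\n{"ok": true}\n')
png_part = message_from_string("Content-Type: image/png\n\nabc")
json_obj = decoded_payload(json_part)   # a dict
fallback = decoded_payload(png_part)    # no converter registered: bytes
```

Falling back to the BLOB when nothing is registered keeps #3 available for every type, while applications opt in to richer objects per Content-Type.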

I also think that the email package probably should not implement "store-payloads-on-disk" by default, although it may provide some example implementations for simple applications (much the same way there's wsgiref for simple applications). Still, that's different from, say, storing attachments in a file named by the Content-Disposition header's filename parameter. The latter is firmly in the domain of the application.
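
As an example of what that application-side code might look like, here is a sketch that writes a part's decoded BLOB under the filename from its Content-Disposition header. save_attachment() is a hypothetical helper; get_filename() and get_payload() are the stdlib Message methods. The sanitizing shown is deliberately minimal and only handles the current platform's path separator.

```python
import os
import tempfile
from email import message_from_string


def save_attachment(part, directory):
    """Application policy: store the decoded BLOB under the sender's filename."""
    # basename() drops directory components a hostile sender may include.
    name = os.path.basename(part.get_filename() or "")
    if name in ("", ".", ".."):
        raise ValueError("part has no usable filename")
    path = os.path.join(directory, name)
    with open(path, "wb") as fp:
        fp.write(part.get_payload(decode=True))
    return path


# Hypothetical sample attachment whose filename tries to escape the directory.
part = message_from_string(
    "Content-Type: application/octet-stream\n"
    'Content-Disposition: attachment; filename="../notes.bin"\n'
    "Content-Transfer-Encoding: base64\n"
    "\n"
    "aGVsbG8=\n"
)

target_dir = tempfile.mkdtemp()
saved_path = save_attachment(part, target_dir)
```

Choices such as overwrite behavior, name collisions, and stricter filename validation are exactly the kind of policy that belongs to the application rather than the package.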

-Barry
