On Oct 10, 2009, at 9:59 AM, Stephen J. Turnbull wrote:

> Both.  I *believe* (but it needs to be checked) that in a correctly
> formed multipart MIME object (message or part), any internal structure
> is context-free within the MIME boundaries.  If that is so, then
> individual parts of the object can be stored in raw form and parsed
> lazily.

I too /think/ that's correct. There are some MIME content-types that cause parts to be related (e.g. multipart/alternative and multipart/related), but those are all operating at a higher level.

In practice it probably makes sense to parse all the headers right away. Content-Type has the most bearing on parsing the rest of the stuff, so by that time you already need to parse parameters to e.g. get the boundary. Early on I claimed that headers were so manageable in practice that we could implement an ordered-dictionary with duplicates as a simple list, with linear searching and nobody would notice. I think nobody has noticed ;).
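
To make the "simple list" idea concrete, here is a minimal sketch of an ordered header store that allows duplicates and looks names up by linear search. The class name HeaderList and its methods are hypothetical, chosen only for illustration; this is not the email package's actual code.

```python
class HeaderList:
    """Ordered header store with duplicates, backed by a plain list.

    Hypothetical illustration only; not the email package's implementation.
    """

    def __init__(self):
        # (name, value) pairs, kept in the order they were added.
        self._headers = []

    def add(self, name, value):
        self._headers.append((name, value))

    def get(self, name, default=None):
        # Linear scan; the first match wins. Header names compare
        # case-insensitively.
        wanted = name.lower()
        for key, value in self._headers:
            if key.lower() == wanted:
                return value
        return default

    def get_all(self, name):
        # Every value for a repeated header, in original order.
        wanted = name.lower()
        return [value for key, value in self._headers if key.lower() == wanted]

    def items(self):
        # Copy of all pairs, preserving order and duplicates.
        return list(self._headers)
```

With the handful of headers a typical message carries, the scan in get() touches only a few entries per lookup, which is presumably why nobody notices the lack of a hash table.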

Lazy parsing of the body does make sense. You only need to parse enough to find end boundaries, or recurse into parsing an embedded part. This is how the parser currently works anyway.
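
As a rough illustration of that approach, the sketch below parses headers eagerly, keeps the body as raw text, and only splits on the boundary when the subparts are first requested. It leans on the standard library's email.parser.HeaderParser for the header step; the LazyPart class and its attributes are hypothetical names, and this is not how the email package's parser is actually written.

```python
from email.parser import HeaderParser


class LazyPart:
    """Headers parsed up front; body split into subparts only on demand.

    Hypothetical sketch, not the email package's parser.
    """

    def __init__(self, raw):
        # HeaderParser stops after the header block and leaves the
        # rest of the text as an unparsed string payload.
        self.headers = HeaderParser().parsestr(raw)
        self.raw_body = self.headers.get_payload()
        self._children = None  # filled in on first access to .parts

    @property
    def parts(self):
        if self._children is None:
            self._children = [LazyPart(chunk) for chunk in self._split()]
        return self._children

    def _split(self):
        boundary = self.headers.get_boundary()
        if self.headers.get_content_maintype() != "multipart" or boundary is None:
            return []
        delimiter = "--" + boundary
        chunks, current = [], None  # current is None while in the preamble
        for line in self.raw_body.splitlines(keepends=True):
            marker = line.rstrip()
            if marker in (delimiter, delimiter + "--"):
                if current is not None:
                    chunks.append("".join(current))
                if marker == delimiter:
                    current = []      # start collecting the next part
                else:
                    current = None    # closing delimiter: ignore the epilogue
                    break
            elif current is not None:
                current.append(line)
        if current is not None:
            # Tolerate a missing closing delimiter.
            chunks.append("".join(current))
        return chunks
```

Each chunk stays an unparsed string until someone asks for its own parts, so nested multiparts are only recursed into when needed. One simplification: the line break just before a delimiter is left on the preceding chunk rather than being treated as part of the delimiter.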

-Barry

_______________________________________________
Email-SIG mailing list
Email-SIG@python.org