On Jan 25, 2010, at 03:10 PM, R. David Murray wrote:

>After setting it aside for a bit, I had what I think is a little epiphany:
>our need is to deal with messages (and parts of messages) that could be
>in either bytes form or text form. The things we need to do with them
>are similar regardless of their form, and so we have been talking about a
>"dual API": one method for bytes and a parallel method for text.
>
>What if we recognize that we have two different data types, bytes messages
>and text messages? Then the "dual API" becomes a more uniform, almost
>single, API, but with two possible underlying data types.

I really like this, especially because it kind of mirrors the
transformations between bytes and strings. I have one suggestion that might
clean up the API and make some other things possible or easier.

>In the context specifically of the proposed new Header object, I propose
>that we have a StringHeader and a BytesHeader, and an API that looks
>something like this:
>
>StringHeader
>
>    properties:
>        raw_header (None unless from_full_header was used)
>        raw_name
>        raw_value
>        name
>        value
>
>    __init__(name, value)
>    from_full_header(header)
>    serialize(max_line_len=78,
>              newline='\n',
>              use_raw_data_if_possible=False)
>    encode(charset='utf-8')
>
>BytesHeader would be exactly the same, with the exception of the signature
>for serialize and the fact that it has a 'decode' method rather than an
>'encode' method. Serialize would be different only in the fact that
>it would have an additional keyword parameter, must_be_7bit=True.

The one thing that I think is unwieldy is the signatures of the serialize()
and deserialize() methods. I've been thinking about "policy" objects that
can be used to control formatting, and I think that perhaps substituting an
API like this might work:

    serialize(policy=None)
    deserialize(policy=None)

The idea is that the policy object would describe how and when to fold
header lines and what EOL characters to use, as well as choices such as
whether to use raw data if possible, and must_be_7bit. A first-order
improvement is that it would be much easier to pass the policy object up and
down the call stack than a slew of independent parameters.

Further, it might be interesting to allow policy objects in the generator,
which would control default formatting options, and on Message objects in
the hierarchy, which would control formatting for that Message and all the
ones below it in the tree (unless overridden by a policy object on a
sub-message). Maybe headers themselves also support policy objects.

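To make the policy idea a bit more concrete, here is a rough sketch. Every name in it (Policy, DEFAULT_POLICY, the field names, and the toy StringHeader and Message classes) is hypothetical and only mirrors the parameters discussed above; it is not an existing API, and the folding is deliberately naive.

```python
import textwrap
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Policy:
    """Hypothetical bundle of formatting options (an assumption, not a real API)."""
    max_line_len: int = 78
    newline: str = "\n"
    use_raw_data_if_possible: bool = False  # carried along, unused in this sketch
    must_be_7bit: bool = True               # carried along, unused in this sketch


DEFAULT_POLICY = Policy()


class StringHeader:
    """Toy header: just a name and a value."""

    def __init__(self, name: str, value: str):
        self.name = name
        self.value = value

    def serialize(self, policy: Optional[Policy] = None) -> str:
        policy = policy or DEFAULT_POLICY
        # Naive folding at whitespace; continuation lines start with a space.
        lines = textwrap.wrap(
            f"{self.name}: {self.value}",
            width=policy.max_line_len,
            subsequent_indent=" ",
            break_long_words=False,
            break_on_hyphens=False,
        )
        return policy.newline.join(lines) + policy.newline


class Message:
    """Toy message tree: headers plus sub-messages, with an optional policy."""

    def __init__(self, headers=(), parts=(), policy: Optional[Policy] = None):
        self.headers = list(headers)
        self.parts = list(parts)
        self.policy = policy

    def serialize(self, policy: Optional[Policy] = None) -> str:
        # A policy set on this message overrides the one inherited from above,
        # and is in turn passed down to everything below it in the tree.
        effective = self.policy or policy or DEFAULT_POLICY
        out = "".join(h.serialize(effective) for h in self.headers)
        out += effective.newline  # blank line ends the header block
        for part in self.parts:
            out += part.serialize(effective)
        return out


# Same header, two destinations: only the policy object changes.
subject = StringHeader("Subject", "policy objects")
smtp_policy = Policy(newline="\r\n")
print(repr(subject.serialize()))
print(repr(subject.serialize(smtp_policy)))
```

The point of the sketch is only that one object travels down the call stack in place of several keyword arguments, and that a sub-message can swap in its own policy without its parent knowing.
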
I think this could be interesting for supporting output of the same message
tree to different destinations. E.g. if the message is being output
directly to an SMTP server, you'd stick a policy object on there that had
the RFC 5321 required EOL, but you'd have a different policy object for
output to a web server.

>(Encoding or decoding a Message would cause the Message to recursively
>encode or decode its subparts. This means you are making a complete
>new copy of the Message in memory. If you don't want to do that you
>can walk the Message and convert it piece by piece (we could provide a
>generator that does this).)

It sounds like there's overlap between the encoding/decoding API and the
serialize/deserialize API. Are you thinking along those lines? Differences
in signature could be papered over with the policy objects.

>Subclasses of these classes for structured headers would have additional
>methods that would return either specialized object types (datetimes,
>address objects) or bytes/strings, and these may or may not exist in
>both Bytes and String forms (that depends on the use cases, I think).

Is it crackful to think about the policy object also containing a MIME type
registry for conversion to the specialized object types?

>So, those are my thoughts, and I'm sure I haven't thought of all the
>corner cases. The biggest question is, does it seem like this general
>scheme is worth pursuing?

Definitely! I think it's a great idea.

-Barry
_______________________________________________
Email-SIG mailing list
Email-SIG@python.org
Your options: http://mail.python.org/mailman/options/email-sig/archive%40mail-archive.com