On approximately 10/10/2009 5:47 PM, came the following characters from the keyboard of Stephen J. Turnbull:
Glenn Linderman writes:
 > On approximately 10/10/2009 8:40 AM, came the following characters from
 > the keyboard of Stephen J. Turnbull:

 > > So why are we discussing this?  We don't even know what our mainline
 > > APIs are going to look like, why are we discussing forcibly operating
 > > on broken input?
 > Use case generation. If the only way to access header values is to
 > successfully, fully, decode them, then some uses may be rendered
 > impossible, or at least difficult, even by choice of APIs.

Since invertibility is a requirement, "successfully fully decoding" a
header field is not a prerequisite to accessing it.
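
As a rough illustration of that point, here is a minimal sketch using the
standard library's email package as it exists today with its default
(compat32) policy. This is not the API under discussion in this thread;
the sample message and its unterminated encoded word are made up for the
example.

```python
import email
from email.header import decode_header

# A made-up message whose Subject carries an encoded word that is
# missing its closing "?=", so it cannot be decoded as RFC 2047 text.
raw = (b"Subject: =?utf-8?q?caf=C3=A9\n"
       b"From: a@example.com\n"
       b"\n"
       b"body\n")

msg = email.message_from_bytes(raw)

# The field is still accessible, as the undecoded text.
subject = msg["Subject"]
print(repr(subject))

# decode_header() finds no complete encoded word and hands the text back.
print(decode_header(subject))

# Serializing again is expected to give back the original bytes.
print(msg.as_bytes() == raw)
```

The field is reachable even though no decoder ever succeeded on it, which
is all that invertibility needs here.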

The question of "what should we do about broken mail" at this point
has three components:

(1) To what level do we (ie, the email module) promise to parse
    conforming wire format into useful objects?

(2) For nonconforming input, when is it OK to raise an error and
    return to the calling client rather than handle it ourselves?

(3) What is the API for accessing and/or mutating unparsed data, and
    requesting a reparse?

I don't think we should go any farther than that.

I agree with your three components; but I think the answer to (3) requires discussion/speculation of what clients might want to do when faced with errors, otherwise the API won't likely help them much, without reimplementing email package logic. It is easy to design "sufficient", but unhelpful, APIs. So I've been willing to discuss such things. Maybe at too much length, and maybe with insufficient clarity that that is what I'm discussing, for which I apologize. But I don't think that not discussing it helps to answer (3).

 > > "Re" is a Latin abbreviation; there is no appropriate translation. ;-)
 > Nonetheless, I have seen both Re: and Fwd: translated to other languages
 > (besides Latin or geek) :)

Sure.  This is an aspect of question (1): is this the responsibility
of the email module?

I don't think the old RFCs even discuss the use of Re: and Fwd:, nor whether they should be collapsed or translated, or even used at all. Just checked: RFC 822 only had an example that showed Re:, but RFC 2822 does discuss it a bit, and suggests not adding duplicate Re:. Fwd: is not mentioned at all in those two RFCs. So no, adding and collapsing Re:/Fwd: is not the responsibility of the email package. But making it easy to do so might be, as it is a common client operation. Lots of email style guides discuss it.
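
For concreteness, here is a minimal sketch of the kind of client-side collapsing I mean, in plain Python. The function name reply_subject and the prefix list are assumptions made for illustration, not anything the email package provides; the list is deliberately incomplete ("aw" and "sv" are German and Scandinavian forms I have seen), and Fwd: is left out.

```python
import re

# Illustrative, incomplete list of reply prefixes (an assumption for
# this sketch): "re" plus two translated forms seen in the wild.
REPLY_PREFIXES = ("re", "aw", "sv")

# One or more leading "<prefix>:" groups, case-insensitive.
_LEADING = re.compile(
    r"^\s*(?:(?:%s)\s*:\s*)+" % "|".join(REPLY_PREFIXES),
    re.IGNORECASE,
)


def reply_subject(subject: str) -> str:
    """Return the subject with exactly one leading "Re: "."""
    return "Re: " + _LEADING.sub("", subject).strip()


print(reply_subject("Re: AW: re: lunch"))
print(reply_subject("Regarding lunch"))
```

A word that merely starts with "Re" is left alone, because the pattern requires the colon. Note that this only works on a subject that has already been decoded to text, which is where the package has to help.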

 > > Maybe they are, but the email module doesn't know or care about what
 > > they do.  Let's stick within what the email module is supposed to
 > > handle
 > Yep, this is just use case exploration.

But since by definition this is broken input, discussing what
applications are going to want to do with it is inappropriate, IMO.
We don't care if the app is going to prefix, suffix, or crucifix it.
We need to specify

(a) what object will hold the raw data we couldn't handle
(b) how a calling client can retrieve the raw data
(c) how the client can replace (or more generally mutate) that data
(d) how the client can request a reparse from us if it attempted to
    repair the breakage at a low level rather than parse it
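
One possible shape for (a) through (d), as a minimal sketch only. The
class RawField and its methods replace_raw and reparse are hypothetical
names invented for this example, not a proposed or existing email module
API, and the "parser" is a deliberately trivial stand-in.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class RawField:
    """Hypothetical holder for one header field as seen on the wire."""
    raw: bytes                       # (a) the raw data, kept verbatim
    name: Optional[str] = None       # set only after a successful parse
    value: Optional[str] = None
    defects: list = field(default_factory=list)

    def __post_init__(self):
        self.reparse()

    def replace_raw(self, new_raw: bytes) -> None:
        # (c) the client replaces the raw data; parsed state is stale
        self.raw = new_raw
        self.name = self.value = None
        self.defects = ["not reparsed"]

    def reparse(self) -> bool:
        # (d) the client requests a reparse after a low-level repair
        self.name = self.value = None
        self.defects = []
        try:
            text = self.raw.decode("ascii")
        except UnicodeDecodeError:
            self.defects.append("non-ASCII bytes in header field")
            return False
        name, sep, value = text.partition(":")
        if not sep:
            self.defects.append("missing colon")
            return False
        self.name, self.value = name.strip(), value.strip()
        return True


broken = RawField(b"Subject: caf\xe9")
first_defects = list(broken.defects)
data = broken.raw                                # (b) retrieve raw data
broken.replace_raw(data.replace(b"\xe9", b"e"))  # repair in raw Python
print(first_defects, broken.reparse(), broken.value)
```

The repair itself is ordinary bytes manipulation done by the client; the
object only stores, hands out, accepts, and reparses.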

Manipulations of text or bytes are in principle not the responsibility
of the email module IMO; that will be done *by* the client *using* raw
Python, not methods provided by email.  I don't see how discussion of
*what* manipulations can be done with one hand up our nose is anything
but useless bikeshedding.

If we decide that the email module can usefully provide sufficiently
general facilities that would be convenient and hard to implement by
general client programmers (eg, the Mailman Developers collective
wisdom about foreign equivalents for "re" and "fwd" is surely greater
than that of the average American programmer), we will do it by
calling low-level methods to get and put the data, and raw Python to
manipulate it as text or bytes.

Except it may be perfectly valid input using a standard that post-dates the application. Doing something reasonable with it is appropriate. The email RFCs go to great lengths to make new features work reasonably in old clients that have limited understanding, with fallback interpretations for unknown MIME subtypes and even MIME types, and ensuring that some type of reasonable interpretation might be done. The RFCs define ways that new MIME types and subtypes might be defined, and new charsets; it seems reasonable to attempt to accommodate the possibility that such may actually be defined in the future.
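
A minimal sketch of that fallback idea, following my reading of the RFC 2046 defaults (unknown text subtypes treated as text/plain, unknown multipart subtypes as multipart/mixed, anything else unrecognized as application/octet-stream). The function name fallback_content_type and the client-supplied set of known types are assumptions for illustration, and the RFC's caveat about text parts in an unknown charset is ignored here.

```python
def fallback_content_type(maintype: str, subtype: str, known: set) -> str:
    """Map a content type onto one the client can handle.

    `known` is the set of "type/subtype" strings (lowercase) that the
    client actually understands.
    """
    full = f"{maintype}/{subtype}".lower()
    if full in known:
        return full
    maintype = maintype.lower()
    if maintype == "text":
        return "text/plain"
    if maintype == "multipart":
        return "multipart/mixed"
    # Unknown top-level types and other unknown subtypes.
    return "application/octet-stream"


known_types = {"text/plain", "text/html", "multipart/mixed", "image/png"}
print(fallback_content_type("text", "x-future", known_types))
print(fallback_content_type("hologram", "3d", known_types))
```

An old client running this against a type standardized after it shipped still does something reasonable, rather than declaring the message broken.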

If we don't discuss some of the possibilities, we'll never learn enough to "decide that the email module can usefully provide sufficiently general facilities that would be convenient and hard to implement by general client programmers" :)

To me, "hard" would mean that they would have to rewrite portions of logic that already exists in the email package, and then tweak it slightly to compensate for not-quite-perfect data, or maybe I should switch to saying "not-quite-perfect-or-possibly-later-standardized data" :)

--
Glenn -- http://nevcal.com/
===========================
A protocol is complete when there is nothing left to remove.
-- Stuart Cheshire, Apple Computer, regarding Zero Configuration Networking

_______________________________________________
Email-SIG mailing list
Email-SIG@python.org