Re: [Python-Dev] Completing the email6 API changes.

2013-09-03 Thread Stephen J. Turnbull
R. David Murray writes:

 > I meant "a text/plain root part *inside* a multipart/alternative", which
 > is what you said, I just didn't understand it at first :)  Although I
 > wonder how many GUI MUAs do the fallback to multipart/mixed with just a
 > normal text/plain root part, too.  I would expect a text-only MUA would,
 > since it has no other way to display a multipart/related...but a
 > graphical MUA might just assume that there will always be an html part
 > in a multipart/related.

It's not really a problem with text vs. GUI, or an assumption of HTML.
There are plenty of formats that have such links, and some which don't
have links, but rather assigned roles such as "Mac files" (with data
fork and resource fork) and digital signatures (though that turned out
to be worth designing a new multipart subtype).

The problem is that "multipart/related" says "pass all the part
entities to the handler appropriate to the root part entity, which
will process the links found in the root part entity".  If you
implement that in the natural way, you just pass the text/plain part
to the text/plain handler, which won't find any links for the simple
reason that it has no protocol for representing them.

This means that the kind of multipart/related handler I envision needs
to implement linking itself, rather than delegate it to the root
part handler.  This requires checking the type of the root part:

    # not intended to look like Email API
    def handle_multipart_related(part_list, root_part):
        if root_part.content_type in ['text/plain']:
            # just display the parts in order
            handle_multipart_mixed(part_list)
        else:
            # cid -> entities in internal representation
            entity_map = extract_entity_map(part_list)
            root_part.content_type.handle(root_part, entity_map)
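
For anyone who wants to experiment, here is a rough runnable variant of the same idea on top of the standard library's email package. The names find_root, handle_related and LINKLESS_ROOT_TYPES are made up for this sketch, and returning a tuple instead of actually displaying anything is a simplification. It assumes the root is the part named by the "start" parameter, or else the first part.

```python
from email.message import EmailMessage

# Root types that have no way to express links to sibling parts.
# Assumption for this sketch: only text/plain, as in the pseudocode above.
LINKLESS_ROOT_TYPES = {"text/plain"}


def find_root(related):
    """Return the part named by the 'start' parameter, else the first part."""
    parts = list(related.iter_parts())
    start = related.get_param("start")
    if start is not None:
        for part in parts:
            if part["Content-ID"] == start:
                return part
    return parts[0]


def handle_related(related):
    """Decide how a multipart/related entity should be presented."""
    parts = list(related.iter_parts())
    root = find_root(related)
    if root.get_content_type() in LINKLESS_ROOT_TYPES:
        # Just present the parts in order, as for multipart/mixed.
        return ("mixed", [p.get_content_type() for p in parts])
    # Content-ID -> part, so a root handler could resolve cid: links.
    cid_map = {str(p["Content-ID"]): p for p in parts if p["Content-ID"]}
    return ("related", sorted(cid_map))
```

A real MUA would hand cid_map to whatever renders the root part; the point is only that the dispatch needs nothing beyond the root part's content type.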
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Completing the email6 API changes.

2013-09-03 Thread R. David Murray
On Tue, 03 Sep 2013 10:01:42 -0400, "R. David Murray" wrote:
> On Tue, 03 Sep 2013 10:56:36 +0900, "Stephen J. Turnbull" wrote:
> > R. David Murray writes:
> >  > I can understand the structure Glenn found in Apple Mail:
> >  > a series of text/plain parts interspersed with image/jpg, with all parts
> >  > after the first being marked 'Content-Disposition: inline'.  Any MUA
> >  > that can display text and images *ought* to handle that correctly and
> >  > produce the expected result.  But that isn't what your structure above
> >  > would produce.  If you did:
> >  > 
> >  >     multipart/related
> >  >         multipart/alternative
> >  >             text/html
> >  >             text/plain
> >  >         image/png
> >  >         text/plain
> >  >         image/png
> >  >         text/plain
> >  > 
> >  > and only referred to the png parts in the text/html part and marked all
> >  > the parts as 'inline' (even though that is irrelevant in the text/html
> >  > related case), an MUA that *knew* about this technique *could* display it
> >  > "correctly", but an MUA that is just following the standards most
> >  > likely won't.
> > 
> > OK, I see that now.  It requires non-MIME information about the
> > treatment of the root entity by the implementation.  On the other
> > hand, it shouldn't *hurt*.  RFC 2387 explicitly specifies that at
> > least some parts of a contained multipart/related part should be able
> > to refer to entities related via the containing multipart/related.
> > Since it does not mention *any* restrictions on contained root
> > entities, I take it that it implicitly specifies that any contained
> > multipart may make such references.  But I suspect it's not
> > implemented by most MUAs.  I'll have to test.
> 
> OK, I see what you are driving at now.  Whether or not it works is
> dependent on whether or not typical MUAs handle a multipart/related with
> a text/plain root part by treating it as if it were a multipart/mixed

I meant "a text/plain root part *inside* a multipart/alternative", which
is what you said, I just didn't understand it at first :)  Although I
wonder how many GUI MUAs do the fallback to multipart/mixed with just a
normal text/plain root part, too.  I would expect a text-only MUA would,
since it has no other way to display a multipart/related...but a
graphical MUA might just assume that there will always be an html part
in a multipart/related.

> with inline or attachment sub-parts.  So yes, whether or not we should
> support and/or document this technique very much depends on whether or
> not typical MUAs do so.  I will, needless to say, be very interested in
> the results of your research :)
> 
> --David


Re: [Python-Dev] Completing the email6 API changes.

2013-09-03 Thread R. David Murray
On Tue, 03 Sep 2013 10:56:36 +0900, "Stephen J. Turnbull" wrote:
> R. David Murray writes:
>  > I can understand the structure Glenn found in Apple Mail:
>  > a series of text/plain parts interspersed with image/jpg, with all parts
>  > after the first being marked 'Content-Disposition: inline'.  Any MUA
>  > that can display text and images *ought* to handle that correctly and
>  > produce the expected result.  But that isn't what your structure above
>  > would produce.  If you did:
>  > 
>  >     multipart/related
>  >         multipart/alternative
>  >             text/html
>  >             text/plain
>  >         image/png
>  >         text/plain
>  >         image/png
>  >         text/plain
>  > 
>  > and only referred to the png parts in the text/html part and marked all
>  > the parts as 'inline' (even though that is irrelevant in the text/html
>  > related case), an MUA that *knew* about this technique *could* display it
>  > "correctly", but an MUA that is just following the standards most
>  > likely won't.
> 
> OK, I see that now.  It requires non-MIME information about the
> treatment of the root entity by the implementation.  On the other
> hand, it shouldn't *hurt*.  RFC 2387 explicitly specifies that at
> least some parts of a contained multipart/related part should be able
> to refer to entities related via the containing multipart/related.
> Since it does not mention *any* restrictions on contained root
> entities, I take it that it implicitly specifies that any contained
> multipart may make such references.  But I suspect it's not
> implemented by most MUAs.  I'll have to test.

OK, I see what you are driving at now.  Whether or not it works is
dependent on whether or not typical MUAs handle a multipart/related with
a text/plain root part by treating it as if it were a multipart/mixed
with inline or attachment sub-parts.  So yes, whether or not we should
support and/or document this technique very much depends on whether or
not typical MUAs do so.  I will, needless to say, be very interested in
the results of your research :)

--David


Re: [Python-Dev] Completing the email6 API changes.

2013-09-02 Thread Stephen J. Turnbull
R. David Murray writes:

 > I'm still not understanding how the text/plain part *refers* to the
 > related parts.

Like this: "Check out this picture of my dog!"  Or this: "The terms of
the contract are found in the attached PDF.  Please print it and sign
it, then return it by carrier pigeon (attached)."  With this structure

    multipart/alternative
        text/plain
        multipart/related
            text/html
            application/pdf
            application/rfc6214-transport

the rendering of the text/plain part will not show evidence of the PDF
at all (eg, a view/download button), at least in some of the MUAs I've
tested.  And it *should* not, in an RFC-conforming MUA.
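
For concreteness, a message with this shape can be built with the standard library's email.message.EmailMessage roughly as follows. This is only a sketch: the subject, body strings, PDF bytes and filename are placeholders, the outline() helper is made up here, and the carrier pigeon part is left out.

```python
from email.message import EmailMessage

msg = EmailMessage()
msg["Subject"] = "Contract"
# text/plain alternative: it can mention the PDF only in words.
msg.set_content("The terms of the contract are found in the attached PDF.")
# Adding an HTML alternative turns the message into multipart/alternative.
msg.add_alternative("<p>The terms are in the attached PDF.</p>", subtype="html")
# Relating the PDF to the HTML alternative only turns that alternative
# into multipart/related with text/html as its root.
html_part = msg.get_payload()[1]
html_part.add_related(b"%PDF placeholder", "application", "pdf",
                      filename="contract.pdf")


def outline(part, depth=0):
    """Return the MIME structure as a list of indented content types."""
    lines = ["    " * depth + part.get_content_type()]
    for sub in part.iter_parts():
        lines.extend(outline(sub, depth + 1))
    return lines


structure = outline(msg)
```

Nothing in the text/plain branch points at the PDF, which is why a conforming MUA rendering that alternative has no reason to show it.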

 > I can understand the structure Glenn found in Apple Mail:
 > a series of text/plain parts interspersed with image/jpg, with all parts
 > after the first being marked 'Content-Disposition: inline'.  Any MUA
 > that can display text and images *ought* to handle that correctly and
 > produce the expected result.  But that isn't what your structure above
 > would produce.  If you did:
 > 
 >     multipart/related
 >         multipart/alternative
 >             text/html
 >             text/plain
 >         image/png
 >         text/plain
 >         image/png
 >         text/plain
 > 
 > and only referred to the png parts in the text/html part and marked all
 > the parts as 'inline' (even though that is irrelevant in the text/html
 > related case), an MUA that *knew* about this technique *could* display it
 > "correctly", but an MUA that is just following the standards most
 > likely won't.

OK, I see that now.  It requires non-MIME information about the
treatment of the root entity by the implementation.  On the other
hand, it shouldn't *hurt*.  RFC 2387 explicitly specifies that at
least some parts of a contained multipart/related part should be able
to refer to entities related via the containing multipart/related.
Since it does not mention *any* restrictions on contained root
entities, I take it that it implicitly specifies that any contained
multipart may make such references.  But I suspect it's not
implemented by most MUAs.  I'll have to test.

 > I don't see any way short of duplicating the image parts to make it
 > likely that a typical MUA would display images for both a text/plain
 > sequence and a text/html related part.  On the other hand, my experience
 > with MUAs is actually quite limited :)
 > 
 > Unless there is some standard for referring to related parts in a
 > text/plain part?

No, the whole point is that we MUA implementers *know* that there is
no machine-parsable way to refer to the related parts in text/plain,
and therefore the only way to communicate even the *presence* of the
attachment in

    multipart/related
        text/plain
        image/jpeg; name="dog-photo.jpg"

to the receiving user is to make an exception in the implementation
and treat it as multipart/mixed.[1]

It *does* make sense, i.e., doesn't require any information not
already available to the implementation.
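
A small sketch of what "already available" means in practice, using email.message.EmailMessage from the standard library. The image bytes are placeholders and the treat_as_mixed rule is my own illustration of the exception described above, not anything the library does for you.

```python
from email.message import EmailMessage

msg = EmailMessage()
msg.set_content("Check out this picture of my dog!")
# Placeholder bytes stand in for real JPEG data.
msg.add_related(b"placeholder image bytes", "image", "jpeg",
                filename="dog-photo.jpg")

root = next(msg.iter_parts())
# On a multipart/related, iter_attachments() yields every part except the root.
others = [(p.get_content_type(), p.get_filename())
          for p in msg.iter_attachments()]

# Hypothetical decision rule: a text/plain root cannot link to its siblings,
# so present them the way multipart/mixed attachments would be presented.
treat_as_mixed = root.get_content_type() == "text/plain" and bool(others)
```

The root part's type and the list of remaining parts are both there in the parsed message, so the fallback needs no guessing.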

I wonder if use of external bodies could avoid the duplication in
current implementations.  Probably too fragile, though.

Footnotes: 
[1]  This is conformant to the RFC, as the mechanism of "relation" is
explicitly application-dependent.



Re: [Python-Dev] Completing the email6 API changes.

2013-09-02 Thread R. David Murray
On Mon, 02 Sep 2013 15:52:59 -0700, Glenn Linderman wrote:
> MUAs tend to be able to display what they produce themselves, but I have 
> situations where they don't handle what other MUAs produce.
> 
> One nice thing about this email6 toolkit might be the ability to 
> produce, more easily than before, a variety of MIME combinations to 
> exercise and test a variety of MUAs. While perhaps most of them have 
> been tested with some obviously standard MIME combinations, I suspect 
> most of them will produce strange results with combinations that are out 
> of the ordinary.

Yeah, RFC compliance and other types of testing is something I want this
package to be good for.  The API under discussion here, though, is
oriented toward people using the library for easily generating emails
from their application and/or easily accessing the information from
emails sent to their application.

--David


Re: [Python-Dev] Completing the email6 API changes.

2013-09-02 Thread Glenn Linderman

On 9/2/2013 2:40 PM, R. David Murray wrote:
> I'm still not understanding how the text/plain part *refers* to the
> related parts.

I don't think the text/plain part can refer to the related parts, but, 
like you, am willing to be educated if there is a way; but while the 
text/html may be able to if things like cid: URIs can reach up a level 
in a given MUA, the text/plain would be left with the additional parts 
being attachments, methinks. This is less interesting than the technique 
Apple Mail uses, but more interesting than not even seeing the attached 
pictures.


MUAs tend to be able to display what they produce themselves, but I have 
situations where they don't handle what other MUAs produce.


One nice thing about this email6 toolkit might be the ability to 
produce, more easily than before, a variety of MIME combinations to 
exercise and test a variety of MUAs. While perhaps most of them have 
been tested with some obviously standard MIME combinations, I suspect 
most of them will produce strange results with combinations that are out 
of the ordinary.




Re: [Python-Dev] Completing the email6 API changes.

2013-09-02 Thread R. David Murray
On Mon, 02 Sep 2013 16:06:53 +0900, "Stephen J. Turnbull" wrote:
> > Glenn writes:
> > > Steve writes:
> 
> >> OTOH, if the message is structured
> >>
> >>  multipart/related
> >>      multipart/alternative
> >>          text/plain
> >>          text/html
> >>      image/png
> >>      image/png
> >>
> >> the receiver can infer that the images are related to both text/*
> >> parts and DTRT for each.
> 
> >With the images being treated as attachments. Or is there a syntax to 
> >allow the text/html to embed the images and the text/plain to see them 
> >as attachments?
> 
> I believe the above is that syntax.  But the standard doesn't say
> anything about this.  The standard for multipart/alternative is RFC
> 2046, which doesn't know about multipart/related.  RFC 2387 doesn't
> update RFC 2046, so it doesn't say anything about
> multipart/alternative within multipart/related, either.
> 
> >I think the text/html wants to refer to things within its containing
> >multipart/related, but am not sure if that allows the intervening
> >multipart/alternative.
> 
> I don't see why not.  But it would depend on the implementations,
> which we'll have to test before recommending the structure I
> (theoretically :-) prefer.

I'm still not understanding how the text/plain part *refers* to the
related parts.  I can understand the structure Glenn found in Apple Mail:
a series of text/plain parts interspersed with image/jpg, with all parts
after the first being marked 'Content-Disposition: inline'.  Any MUA
that can display text and images *ought* to handle that correctly and
produce the expected result.  But that isn't what your structure above
would produce.  If you did:

    multipart/related
        multipart/alternative
            text/html
            text/plain
        image/png
        text/plain
        image/png
        text/plain

and only referred to the png parts in the text/html part and marked all
the parts as 'inline' (even though that is irrelevant in the text/html
related case), an MUA that *knew* about this technique *could* display it
"correctly", but an MUA that is just following the standards most
likely won't.

I don't see any way short of duplicating the image parts to make it
likely that a typical MUA would display images for both a text/plain
sequence and a text/html related part.  On the other hand, my experience
with MUAs is actually quite limited :)

Unless there is some standard for referring to related parts in a
text/plain part?  I'm not aware of any, but you have much more experience
with this stuff than I do.  (Even text/enriched (RFC 1896) doesn't seem
to have one, though of course there could be "extensions" that
define both that and the font support you used as an example.)

--David


Re: [Python-Dev] Completing the email6 API changes.

2013-09-02 Thread Stephen J. Turnbull
> Glenn writes:
> > Steve writes:

>> OTOH, if the message is structured
>>
>>  multipart/related
>>      multipart/alternative
>>          text/plain
>>          text/html
>>      image/png
>>      image/png
>>
>> the receiver can infer that the images are related to both text/*
>> parts and DTRT for each.

>With the images being treated as attachments. Or is there a syntax to 
>allow the text/html to embed the images and the text/plain to see them 
>as attachments?

I believe the above is that syntax.  But the standard doesn't say
anything about this.  The standard for multipart/alternative is RFC
2046, which doesn't know about multipart/related.  RFC 2387 doesn't
update RFC 2046, so it doesn't say anything about
multipart/alternative within multipart/related, either.

>I think the text/html wants to refer to things within its containing
>multipart/related, but am not sure if that allows the intervening
>multipart/alternative.

I don't see why not.  But it would depend on the implementations,
which we'll have to test before recommending the structure I
(theoretically :-) prefer.


Re: [Python-Dev] Completing the email6 API changes.

2013-09-01 Thread Glenn Linderman

On 9/1/2013 8:03 PM, Stephen J. Turnbull wrote:
> This is getting off-topic IMO; we should probably take this thread to
> email-sig.


Probably, but you didn't :)


> Glenn Linderman writes:
>
>   > I recall being surprised when first seeing messages generated by
>   > Apple Mail software, that are multipart/related, having a sequence
>   > of intermixed text/plain and image/jpeg parts. This is apparently
>   > how Apple Mail generates messages that have inline pictures,
>   > without resorting to use of HTML mail.
>
> (Are you sure you mean "text/plain" above?  I've not seen this form of
> message.  And you mention only "text/html" below.)


Yes, I'm sure it was text/plain. I may be able to access the archived
discussion from a non-Python mailing list about it, to verify, if that
becomes important.  But now that you mention multipart/mixed, I'm not
sure if it was multipart/related or multipart/mixed for the grouping
MIME part. Perhaps someone with Apple Mail could produce one... probably
by composing a message as text/plain, and dragging in a picture or two.


The other references to text/html were in error.


> This practice (like my suggestion) is based on the conjecture that
> MUAs that implement multipart/related will treat it as multipart/mixed
> if the "main" subpart isn't known to implement links to external
> entities.
>
>   > Other email clients handle this relatively better or worse,
>   > depending on the expectations of their authors!
>
> Sure.  After all, this is a world in which some MUAs have a long
> history of happily executing virus executables.
>
>   > I did attempt to determine if it was non-standard usage: it is
>   > certainly non-common usage, but I found nothing in the email/MIME
>   > RFCs that precludes such usage.
>
> Clearly RFCs 2046 and 2387 envision a fallback to multipart/mixed, but
> are silent on how to do it for MUAs that implement multipart/related.
> RFC 2387 says:
>
>     MIME User Agents that do recognize Multipart/Related entities but
>     are unable to process the given type should give the user the
>     option of suppressing the entire Multipart/Related body part shall
>     be. [...]  Handling Multipart/Related differs [from handling of
>     existing composite subtypes] in that processing cannot be reduced
>     to handling the individual entities.
>
> I think that the sane policy is that when processing multipart/related
> internally, the MUA should treat the whole as multipart/mixed, unless
> it knows how links are implemented in the "start" part.  But the RFC
> doesn't say that.
>
>   > Several of them treat all the parts after the initial text/html
>   > part as attachments;
>
> They don't implement RFC 2387 (promoted to draft standard in 1998,
> following two others, the earlier being RFC 1872 from 1995).  Too bad
> for their users.


Correct... but the MUA receiving the Apple Mail message I was talking 
about being a text-mostly MUA, it is probably a reasonable method of 
handling them.



> But what I'm worried about is a different issue,
> which is how to ensure that multipart/alternative messages present all
> relevant content entities in both presentations.  For example, the
> following hypothetical structure is efficient:
>
>     multipart/alternative
>         text/plain
>         multipart/related
>             text/html
>             application/x-opentype-font
>
> because the text/plain can't use the font.  But this
>
>     multipart/alternative
>         text/plain
>         multipart/related
>             text/html
>             image/png
>             image/png
>
> often costs the text/plain receiver a view of the images, and I don't
> see any way to distinguish the two cases.  (The images might be
> character glyphs, for example, effectively a "poor man's font".)


Yes, that issue is handled by some text MUAs by showing the image/png (or
anything in such a position) as attachments. Again, being text-mostly,
that might be a reasonable way of handling them. Perhaps the standard
says they should be ignored when displaying the text/plain alternative.



> OTOH, if the message is structured
>
>     multipart/related
>         multipart/alternative
>             text/plain
>             text/html
>         image/png
>         image/png
>
> the receiver can infer that the images are related to both text/*
> parts and DTRT for each.

With the images being treated as attachments. Or is there a syntax to 
allow the text/html to embed the images and the text/plain to see them 
as attachments?  I think the text/html wants to refer to things within 
its containing multipart/related, but am not sure if that allows the 
intervening multipart/alternative.


Re: [Python-Dev] Completing the email6 API changes.

2013-09-01 Thread Stephen J. Turnbull
This is getting off-topic IMO; we should probably take this thread to
email-sig.

Glenn Linderman writes:

 > I recall being surprised when first seeing messages generated by
 > Apple Mail software, that are multipart/related, having a sequence
 > of intermixed text/plain and image/jpeg parts. This is apparently
 > how Apple Mail generates messages that have inline pictures,
 > without resorting to use of HTML mail.

(Are you sure you mean "text/plain" above?  I've not seen this form of
message.  And you mention only "text/html" below.)

This practice (like my suggestion) is based on the conjecture that
MUAs that implement multipart/related will treat it as multipart/mixed
if the "main" subpart isn't known to implement links to external
entities.

 > Other email clients handle this relatively better or worse,
 > depending on the expectations of their authors!

Sure.  After all, this is a world in which some MUAs have a long
history of happily executing virus executables.

 > I did attempt to determine if it was non-standard usage: it is
 > certainly non-common usage, but I found nothing in the email/MIME
 > RFCs that precludes such usage.

Clearly RFCs 2046 and 2387 envision a fallback to multipart/mixed, but
are silent on how to do it for MUAs that implement multipart/related.
RFC 2387 says:

    MIME User Agents that do recognize Multipart/Related entities but
    are unable to process the given type should give the user the
    option of suppressing the entire Multipart/Related body part shall
    be. [...]  Handling Multipart/Related differs [from handling of
    existing composite subtypes] in that processing cannot be reduced
    to handling the individual entities.

I think that the sane policy is that when processing multipart/related
internally, the MUA should treat the whole as multipart/mixed, unless
it knows how links are implemented in the "start" part.  But the RFC
doesn't say that.

 > Several of them treat all the parts after the initial text/html
 > part as attachments;

They don't implement RFC 2387 (promoted to draft standard in 1998,
following two others, the earlier being RFC 1872 from 1995).  Too bad
for their users.  But what I'm worried about is a different issue,
which is how to ensure that multipart/alternative messages present all
relevant content entities in both presentations.  For example, the
following hypothetical structure is efficient:

    multipart/alternative
        text/plain
        multipart/related
            text/html
            application/x-opentype-font

because the text/plain can't use the font.  But this

    multipart/alternative
        text/plain
        multipart/related
            text/html
            image/png
            image/png

often costs the text/plain receiver a view of the images, and I don't
see any way to distinguish the two cases.  (The images might be
character glyphs, for example, effectively a "poor man's font".)
OTOH, if the message is structured

    multipart/related
        multipart/alternative
            text/plain
            text/html
        image/png
        image/png

the receiver can infer that the images are related to both text/*
parts and DTRT for each.
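
In case anyone wants to generate a test message with this last structure and throw it at various MUAs, a minimal sketch using the classic email.mime classes follows. The body text, the Content-IDs "a" and "b", and the image bytes are placeholders. Whether a given MUA resolves the cid: links through the intervening multipart/alternative is exactly the open question, so this only builds the message.

```python
from email.mime.image import MIMEImage
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText

alternative = MIMEMultipart("alternative")
alternative.attach(MIMEText("See the two pictures.", "plain"))
alternative.attach(MIMEText('<img src="cid:a"><img src="cid:b">', "html"))

related = MIMEMultipart("related")
related.attach(alternative)  # root part: the pair of alternatives
for cid in ("a", "b"):
    # The subtype is given explicitly because the bytes are not a real PNG.
    image = MIMEImage(b"placeholder", _subtype="png")
    image["Content-ID"] = f"<{cid}>"
    related.attach(image)

# walk() visits the parts depth-first, starting with the container itself.
structure = [part.get_content_type() for part in related.walk()]
```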

Steve



Re: [Python-Dev] Completing the email6 API changes.

2013-09-01 Thread Glenn Linderman

On 9/1/2013 3:10 PM, R. David Murray wrote:
> This doesn't work, though, because you could (although you usually
> won't) have more than one 'text/html' part in a single multipart.


I was traveling and your original message is still unread in my queue of 
"things to look at later" :(  I haven't caught up with old stuff yet, 
but am trying to stay current on current stuff...


The quoted issue was mentioned in another message in this thread, though 
in different terms.


I recall being surprised when first seeing messages generated by Apple 
Mail software, that are multipart/related, having a sequence of 
intermixed text/plain and image/jpeg parts. This is apparently how Apple 
Mail generates messages that have inline pictures, without resorting to 
use of HTML mail. Other email clients handle this relatively better or 
worse, depending on the expectations of their authors! Several of them 
treat all the parts after the initial text/html part as attachments; 
some of them display inline attachments if they are text/html or 
image/jpeg and others do not. I can't say for sure if there are other 
ways they are treated; I rather imagine that Apple Mail displays the 
whole message with interspersed pictures quite effectively, without 
annoying the user with attachment "markup", but I'm not an Apple Mail 
user so I couldn't say for sure.


You should, of course, ensure that it is possible to create such a message.

Whether Apple Mail does that with other embedded image/* formats, or 
with other text/* formats, or other non-image, non-text formats, I 
couldn't say. I did attempt to determine if it was non-standard usage: 
it is certainly non-common usage, but I found nothing in the email/MIME 
RFCs that precludes such usage.


Re: [Python-Dev] Completing the email6 API changes.

2013-09-01 Thread R. David Murray
On Sat, 31 Aug 2013 18:57:56 +0900, "Stephen J. Turnbull" wrote:
> R. David Murray writes:
> 
>  > But I would certainly appreciate review from anyone so moved, since I
>  > haven't gotten any yet.
> 
> I'll try to make time for a serious (but obviously partial) review by
> Monday.
> 
> I don't know if this is "serious" bikeshedding, but I have a comment
> or two on the example:
> 
>  > from email.message import MIMEMessage
>  > from email.headerregistry import Address
>  > fullmsg = MIMEMessage()
>  > fullmsg['To'] = Address('Foő Bar', 'f...@example.com')
>  > fullmsg['From'] = "mé¨ "
>  > fullmsg['Subject'] = "j'ai un probléme de python."
> 
> This is very nice!  *I* *love* it.
> 
> But (sorry!) I worry that it's not obvious to "naive" users.  Maybe it
> would be useful to have a Message() factory which has one semantic
> difference from MIMEMessage: it "requires" RFC 822-required headers
> (or optionally RFC 1036 for news).  Eg:
> 
> # This message will be posted and mailed
> # These would conform to the latest Draft Standards
> # and be DKIM-signed
> fullmsg = Message('rfc822', 'rfc1036', 'dmarc')
> 
> I'm not sure how "required" would be implemented (perhaps through a
> .validate() method).  So the signature of the API suggested above is
> Message(*validators, **kw).

Adding new constructor arguments to the existing Message class is
possible.  However, given the new architecture, the more logical way
to do this is to put it in the policy.  So currently the idea would be
for this to be spelled like this:

fullmsg = Message(policy=policy.SMTP+policy.strict)

Then what would happen is that when the message is serialized (be it
via str(), bytes(), by passing it to smtplib's SMTP.sendmail or
SMTP.send_message, or by an explicit call to a Generator), an
error would be raised if the minimum required headers are not
present.
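
As a sketch of the spelling above using names that exist in the standard library (EmailMessage, policy.SMTP, policy.strict): policies can be added together, and the result carries the non-default settings of both. The subject and body are placeholders. The required-header check at serialization time is the idea being proposed here, so this example does not try to demonstrate it.

```python
from email import policy
from email.message import EmailMessage

# SMTP contributes CRLF line endings; strict contributes raise_on_defect.
strict_smtp = policy.SMTP + policy.strict

msg = EmailMessage(policy=strict_smtp)
msg["Subject"] = "Foo"
msg.set_content("body")

# Serialization uses the line separator taken from the combined policy.
wire = msg.as_bytes()
```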

As I said in an earlier message, currently there's no extensibility
mechanism for the validation.  If the parser recognizes a defect, whether
or not an error is raised is controlled by the policy.  But there's
no mechanism for adding new defect checking that the parser doesn't
already know about, or for issues that are not parse-time defects.
(There is currently one non-parsing defect for which there is a custom
control: the maximum number of headers of a given type that are allowed
to be added to a Message object.)
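
To illustrate the parse-time half of that, here is a small sketch using the standard library: the same malformed input is parsed once with policy.default, which records the defect on the message, and once with policy.strict, which raises it. The input text is made up for the example.

```python
from email import errors, message_from_string, policy

# The second line is neither a header nor the blank separator line.
raw = "Subject: hello\nthis is not a header\n\nbody\n"

# Default policy: the defect is recorded on the message object.
lenient = message_from_string(raw, policy=policy.default)
recorded = [type(d).__name__ for d in lenient.defects]

# Strict policy: the same defect is raised as an exception instead.
try:
    message_from_string(raw, policy=policy.strict)
    raised = None
except errors.MessageDefect as exc:
    raised = type(exc).__name__
```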

So we need some way to add additional constraints as well.  Probably a
list of validation functions that take a Message/MIMEPart as the
argument and do a raise if they want to reject the message.
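
Something along these lines, perhaps. Everything here is hypothetical (ValidationError, require_headers, validate and the rfc5322_minimum list are invented for this sketch); only EmailMessage is real. It just shows the "list of functions that raise" shape.

```python
from email.message import EmailMessage


class ValidationError(ValueError):
    """Raised by a validator that rejects a message (hypothetical)."""


def require_headers(*names):
    """Build a validator that insists the named headers are present."""
    def validator(msg):
        missing = [name for name in names if name not in msg]
        if missing:
            raise ValidationError("missing headers: " + ", ".join(missing))
    return validator


def validate(msg, validators):
    """Run each validator in order; the first failure propagates."""
    for check in validators:
        check(msg)


# RFC 5322 makes Date and From mandatory; the list is only an example.
rfc5322_minimum = require_headers("Date", "From")
```

A caller would run validate(msg, [rfc5322_minimum, ...]) before sending, which is the "detect late" style discussed next.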

The tricky bit is that currently raise_on_defect means you get an error
as soon as a (parsing) defect is discovered.  Likewise, if max_count
is being enforced for headers, the error is raised as soon as the
duplicate header is added.
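
A quick sketch of that early-error behaviour with the default policy, under which Subject may appear at most once. The header values are placeholders.

```python
from email.message import EmailMessage

msg = EmailMessage()
msg["Subject"] = "first"
try:
    # The error happens here, at assignment time, not at serialization.
    msg["Subject"] = "second"
    error = None
except ValueError as exc:
    error = str(exc)
```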

Generating errors early when building messages was one of our original
design goals, and *only* detecting problems via validators runs counter
to that unless all the validators are called every time an operation
is performed that modifies a message.  Maybe that would be OK, but it
feels ugly.

For the missing header problem, the custom solution could be to add a
'headers' argument to Message that would allow you to write:

    fullmsg = Message(header=(
        Header('Date', email.utils.localtime()),
        Header('To', Address('Fred', 'a...@xyz.com')),
        Header('From', Address('Sally', 'f...@xyz.com')),
        Header('Subject', 'Foo'),
        ),
        policy=policy.SMTP+policy.strict)

This call could then immediately raise an error if not all of the
required headers are present.  (Header is unfortunately not a good
choice of name here because we already have a 'Header' class that has a
different API).

Aside: I could also imagine adding a 'content' argument that would let
you generate a simple text message via a single call...which means you
could also extend this model to specifying the entire message in a single
call, if you wrote a suitable content manager function for tuples:

    fullmsg = Message(
        policy=policy.SMTP+policy.strict,
        header=(
            Header('Date', datetime.datetime.now()),
            Header('To', Address('Fred', 'a...@xyz.com')),
            Header('From', Address('Sally', 'f...@xyz.com')),
            Header('Subject', 'Foo'),
            ),
        content=(
            (
                'This is the text part',
                (
                    'Here is the html',
                    {'image1': b'image data'},
                    ),
                ),
            b'attachment data',
            ),
        )

But that is probably a little bit crazy...easier to just write a custom
function for your appl

Re: [Python-Dev] Completing the email6 API changes.

2013-09-01 Thread R. David Murray
On Sun, 01 Sep 2013 00:18:59 +0900, "Stephen J. Turnbull" wrote:
> R. David Murray writes:
> 
>  > Full validation is something that is currently a "future
>  > objective".
> 
> I didn't mean it to be anything else. :-)
> 
>  > There's infrastructure to do it, but not all of the necessary knowledge
>  > has been coded in yet.
> 
> Well, I assume you already know that there's no way that can ever
> happen (at least until we abandon messaging entirely): new RFCs will
> continue to be published.  So it needs to be an extensible mechanism,
> a "pipeline" of checks (Barry would say a "chain of rules", I think).

My idea was to encode as many of the currently known rules as we have
the stomach for, and to have a validation flag that you turn on if you
want to check your message against those standards.  But without that
flag the code allows you to set arbitrary parameters and headers.

As you say, an extensible mechanism for the validators is a good idea.
So I take it back that the infrastructure is in place, since extensibility
doesn't exist for that feature yet.

--David
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Completing the email6 API changes.

2013-08-31 Thread Stephen J. Turnbull
R. David Murray writes:

 > Full validation is something that is currently a "future
 > objective".

I didn't mean it to be anything else. :-)

 > There's infrastructure to do it, but not all of the necessary knowledge
 > has been coded in yet.

Well, I assume you already know that there's no way that can ever
happen (at least until we abandon messaging entirely): new RFCs will
continue to be published.  So it needs to be an extensible mechanism,
a "pipeline" of checks (Barry would say a "chain of rules", I think).

Enjoy your trip!


Re: [Python-Dev] Completing the email6 API changes.

2013-08-31 Thread R. David Murray
On Sat, 31 Aug 2013 18:57:56 +0900, "Stephen J. Turnbull" wrote:
> R. David Murray writes:
> 
>  > But I would certainly appreciate review from anyone so moved, since I
>  > haven't gotten any yet.
> 
> I'll try to make time for a serious (but obviously partial) review by
> Monday.

Thanks.

> I don't know if this is "serious" bikeshedding, but I have a comment
> or two on the example:

Yeah, you engaged in some serious bikeshedding there ;)

I like the idea of a top level part that requires the required headers,
and I agree that MIMEPart is better than MIMEMessage for that class.

Full validation is something that is currently a "future objective".
There's infrastructure to do it, but not all of the necessary knowledge
has been coded in yet.

I take your point about the relationship between related and alternative
not being set in stone.  I'll have to think through the consequences
of that, but I think it is just a matter of removing a couple error
checks and updating the documentation.

I'll also have to sit and think through your other ideas (the more
extensive bikeshedding :) before I can comment, and I'm heading out to
take my step-daughter to her freshman year of college, so I won't be
able to do thorough responses until tomorrow.

--David


Re: [Python-Dev] Completing the email6 API changes.

2013-08-31 Thread R. David Murray
On Sat, 31 Aug 2013 20:37:30 +1000, Steven D'Aprano  wrote:
> On 31/08/13 15:21, R. David Murray wrote:
> > If you've read my blog (eg: on planet python), you will be aware that
> > I dedicated August to full time email package development.
> [...]
> 
> 
> The API looks really nice! Thank you for putting this together.

Thanks.

> A question comes to mind though:
> 
> > All input strings are unicode, and the library takes care of doing
> > whatever encoding is required.  When you pull data out of a parsed
> > message, you get unicode, without having to worry about how to decode
> > it yourself.
> 
> How well does your library cope with emails where the encoding is declared 
> wrongly? Or no encoding declared at all?

It copes as best it can :)  The bad bytes are preserved (unless you
modify a part) but are returned as the "unknown character" in a
string context.  You can get the original bytes out by using the
bytes access interface.  (There are probably some places where how
to do that isn't clear in the current API, but basically either
you use BytesGenerator or you drop down to a lower level API.)

An attempt is made to interpret "bad bytes" as utf-8, before giving up
and replacing them with the 'unknown character' character.  I'm not 100%
sure that is a good idea.
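
A small sketch of the body case, using the API as it later shipped in the
standard library.  The sample message is made up for the example: its body
carries a Latin-1 byte but is declared as us-ascii, as in the message under
discussion:

```python
import email
from email import policy

# Made-up message: the body contains the byte 0xE9 although the
# Content-Type header declares us-ascii.
raw = (
    b"From: sally@example.com\n"
    b"Subject: test\n"
    b"MIME-Version: 1.0\n"
    b'Content-Type: text/plain; charset="us-ascii"\n'
    b"Content-Transfer-Encoding: 7bit\n"
    b"\n"
    b"caf\xe9\n"
)

msg = email.message_from_bytes(raw, policy=policy.default)

# String context: the undecodable byte comes back as U+FFFD.
text = msg.get_content()

# Bytes context: serializing the unmodified message keeps the original byte.
round_trip = msg.as_bytes()
```

Here text contains the replacement character in place of the bad byte,
while round_trip still contains the original 0xE9.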

> Conveniently, your email is an example of this. Although it contains 
> non-ASCII characters, it is declared as us-ascii:

Oh, yeah, my MUA is a little quirky and I forgot the step that
would have made that correct.  Wanting to rewrite it is one of
the reasons I embarked on this whole email thing a few years
ago :)

--David


Re: [Python-Dev] Completing the email6 API changes.

2013-08-31 Thread Stephen J. Turnbull
Steven D'Aprano writes:

 > which may explain why Stephen Turnbull's reply contains mojibake.

Nah.  It was already there, I just copied it.  Could be my MUA's
fault, though; I've tweaked it for Japanese, and it doesn't handle odd
combinations well.



Re: [Python-Dev] Completing the email6 API changes.

2013-08-31 Thread Steven D'Aprano

On 31/08/13 15:21, R. David Murray wrote:
> If you've read my blog (eg: on planet python), you will be aware that
> I dedicated August to full time email package development.

[...]

The API looks really nice! Thank you for putting this together.

A question comes to mind though:

> All input strings are unicode, and the library takes care of doing
> whatever encoding is required.  When you pull data out of a parsed
> message, you get unicode, without having to worry about how to decode
> it yourself.

How well does your library cope with emails where the encoding is declared
wrongly? Or no encoding declared at all?

Conveniently, your email is an example of this. Although it contains non-ASCII 
characters, it is declared as us-ascii:

--===1633676851==
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Disposition: inline


which may explain why Stephen Turnbull's reply contains mojibake.



--
Steven