Re: [Email-SIG] continuation_ws in Generator and Header

2012-10-15 Thread Jordan Hayes

> I believe these issues have been dealt with, and that
> we are now RFC 2822 (and 5322, for that matter) compliant.


Great!  Any chance of this making it into a patch for 2.1.15 ...?


> The only thing that continuation_ws is used for now is when
> doing line wrapping on RFC 2047 tokens.  (And it now defaults
> to a space, which may or may not be optimal, but certainly works.)


I thought the issue is that there should be no character at all: 2822 
says to perform folding you simply insert a CRLF *before any whitespace* 
so that unfolding is simply a matter of removing the CRLF.


Maybe an example would help.  Here's a header line:

Subject: This is overkill to fold, but legal

Here's one way to fold it:

Subject: This is overkill
 to fold, but legal

Because the space between overkill and to is valid whitespace, it's 
also valid as a signal that the second line is a continuation of the 
first.  You don't have to insert another space (or as it does presently, 
a tab!), you just have to insert CRLF.  Likewise on the way out, just 
remove the CRLF.
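
For what it's worth, the rule above fits in a few lines of Python. This is only a sketch: fold_before_whitespace and unfold are hypothetical helper names, not part of the email package, and the caller is assumed to already know which whitespace to fold at.

```python
CRLF = "\r\n"

def fold_before_whitespace(header, index):
    """Fold by inserting CRLF in front of the space or tab at `index`."""
    if header[index] not in " \t":
        raise ValueError("can only fold in front of a space or tab")
    # Nothing is added except the CRLF; the existing whitespace
    # becomes the first character of the continuation line.
    return header[:index] + CRLF + header[index:]

def unfold(header):
    """Unfold by removing every CRLF that is followed by a space or tab."""
    return header.replace(CRLF + " ", " ").replace(CRLF + "\t", "\t")

original = "Subject: This is overkill to fold, but legal"
folded = fold_before_whitespace(original, original.index(" to fold"))
print(repr(folded))
print(unfold(folded) == original)
```

Because folding adds nothing but the CRLF, unfolding gives back the original header byte for byte.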


So I think for Mailman, which can modify the Subject: line, what you 
need to do is first unfold the line if it's already folded; apply any 
changes; and then optionally refold, if it's now longer than you'd like 
it to be.
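
Roughly, that unfold / modify / refold sequence could look like the sketch below. It is not Mailman's actual code: add_prefix, refold and the 40-column limit are assumptions made up for illustration (78 would be the usual limit), and the greedy refold only breaks in front of plain spaces.

```python
import re

CRLF = "\r\n"

def unfold(value):
    # Drop each CRLF that precedes a space or tab; the whitespace stays.
    return re.sub(r"\r\n(?=[ \t])", "", value)

def refold(line, max_len=78):
    """Greedily insert CRLF in front of existing spaces so lines fit max_len."""
    out = []
    while len(line) > max_len:
        # Last space at or before the limit; start at 1 so a continuation
        # line's own leading space is never chosen (no empty lines).
        cut = line.rfind(" ", 1, max_len + 1)
        if cut == -1:
            break  # no whitespace to fold at, so leave the long line alone
        out.append(line[:cut])
        line = line[cut:]  # the continuation keeps its leading space
    out.append(line)
    return CRLF.join(out)

def add_prefix(header_line, prefix, max_len=78):
    # 1. unfold, 2. apply the change, 3. refold only if needed
    name, _, value = unfold(header_line).partition(": ")
    if prefix not in value:
        value = prefix + " " + value
    return refold(name + ": " + value, max_len)

incoming = "Subject: This is overkill\r\n to fold, but legal"
result = add_prefix(incoming, "[Email-SIG]", max_len=40)
print(repr(result))
```

The point of unfolding first is that the prefix check and the length check both operate on the logical header, not on whatever line breaks the sender happened to choose.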



> In addition, the new (provisional) email policy in the 3.3 email
> library has both an unfolding and a folding algorithm that are
> supposed to be fully RFC 2822/5322 compliant, including that the
> folding algorithm implements folding according to the RFC's syntax.
> That is, it really knows where the higher level syntactic breaks are
> on a per-header-type basis and folds there preferentially.
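
A minimal sketch of what this looks like from the user's side, assuming Python 3.3 or later and the email.policy module. The 30-column limit is an arbitrary assumption, chosen only so that such a short header needs folding at all; the exact break points are up to the library, so none are shown here.

```python
from email import message_from_string, policy

# Parsing with the new policy hands back the unfolded header value.
raw = "Subject: This is overkill\n to fold, but legal\n\nbody\n"
msg = message_from_string(raw, policy=policy.default)
unfolded = str(msg["Subject"])
print(repr(unfolded))

# Refold with an artificially small limit (assumption: 30 columns).
narrow = policy.default.clone(max_line_length=30)
folded = narrow.fold("Subject", unfolded)
print(repr(folded))
```

Note that policy.default uses "\n" as its line separator; policy.SMTP is the variant that emits CRLF.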


Sounds great.

Thanks,

/jordan 


___
Email-SIG mailing list
Email-SIG@python.org
Your options: 
http://mail.python.org/mailman/options/email-sig/archive%40mail-archive.com


Re: [Email-SIG] continuation_ws in Generator and Header

2012-10-15 Thread Barry Warsaw
On Oct 15, 2012, at 08:34 AM, Jordan Hayes wrote:

>> I believe these issues have been dealt with, and that
>> we are now RFC 2822 (and 5322, for that matter) compliant.
>
> Great!  Any chance of this making it into a patch for 2.1.15 ...?

That's Mailman 2.1.15, which is already out, so you probably mean 2.1.16.  But
Mailman 2.1 pretty much uses whatever is available in Python 2 - we've been
down the road before of providing a separate email package, and I think that's
problematic.

FWIW, I would *dearly* love Mailman 3 to be a Python 3 project, and even
require Python 3.3 so we could take advantage of all the nice email policy
stuff right out of the box.  I can't currently do that because enough of our
dependencies haven't yet been ported (ping me if you want to help with that :).

Cheers,
-Barry


Re: [Email-SIG] continuation_ws in Generator and Header

2012-10-15 Thread R. David Murray
On Mon, 15 Oct 2012 08:34:10 -0700, Jordan Hayes <jmha...@j-o-r-d-a-n.com> wrote:
> > I believe these issues have been dealt with, and that
> > we are now RFC 2822 (and 5322, for that matter) compliant.
>
> Great!  Any chance of this making it into a patch for 2.1.15 ...?
>
> > The only thing that continuation_ws is used for now is when
> > doing line wrapping on RFC 2047 tokens.  (And it now defaults
> > to a space, which may or may not be optimal, but certainly works.)
>
> I thought the issue is that there should be no character at all: 2822
> says to perform folding you simply insert a CRLF *before any whitespace*
> so that unfolding is simply a matter of removing the CRLF.
>
> Maybe an example would help.  Here's a header line:
>
> Subject: This is overkill to fold, but legal
>
> Here's one way to fold it:
>
> Subject: This is overkill
>  to fold, but legal
>
> Because the space between overkill and to is valid whitespace, it's
> also valid as a signal that the second line is a continuation of the
> first.  You don't have to insert another space (or as it does presently,
> a tab!), you just have to insert CRLF.  Likewise on the way out, just
> remove the CRLF.

Right, and the Python 3 email package does exactly that.  I don't remember
exactly which fixes went into which version, but I remember Barry changed
tab to space in 3.1, and I just checked and it looks like I made the fix
that preserves the existing whitespace instead of using continuation_ws
in 3.2 (I rewrote and simplified the old wrapping algorithm).

The place continuation_ws is still used is when you feed a non-ASCII
string that would result in a line longer than the line length to
Header.append.  In that case, RFC 2047 instructs us to break up the
encoded word, inserting whitespace between the pieces, such that no line
containing encoded words is longer than 76 characters (without the CR/LF).
So, this is the one place where we are *required* to insert whitespace,
which we are then required to remove when decoding.  And we still do
this; but as I said, that's the only place left where we actually use
the value of continuation_ws.  Everywhere else we just insert CR/LF as
needed in front of existing whitespace[*].
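
A rough sketch of that remaining case, using only the documented email.header API. The sample text and the 40-column maxlinelen are arbitrary assumptions chosen to force the encoded word to be split; the exact encoded output depends on which encoding the charset picks, so it isn't shown here.

```python
from email.header import Header, decode_header, make_header

# Non-ASCII text that is too long for one 40-column line once encoded.
text = " ".join(["naïve café"] * 6)

h = Header(text, charset="utf-8", maxlinelen=40, continuation_ws=" ")
encoded = h.encode()  # default linesep is "\n"
for line in encoded.splitlines():
    print(repr(line))

# Decoding drops the whitespace that was inserted between encoded words.
decoded = str(make_header(decode_header(encoded)))
print(decoded == text)
```

Each continuation line starts with the continuation_ws value followed by another encoded word, and the round trip through decode_header gives back the original string.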

--David

[*] There's a caveat in the comments about this: because the pre-3.3 code
had no real idea of the syntax of the headers, it may theoretically
choose to insert a CR/LF in front of whitespace that is not actually
legal folding whitespace.  This is, however, very unlikely, since
there is very little (if any?) whitespace that is *not* legal
folding whitespace.


Re: [Email-SIG] continuation_ws in Generator and Header

2012-10-13 Thread Jordan Hayes

Just stumbled upon this today:

http://mail.python.org/pipermail/email-sig/2008-June/000394.html

Mark Sapiro writes:


> I may have the urge to look at this after Mailman 2.1.11 is released.


Any urge now? :-)

Thanks,

/jordan