On Mon, 15 Oct 2012 08:34:10 -0700, "Jordan Hayes" <jmha...@j-o-r-d-a-n.com> wrote:
> > I believe these issues have been dealt with, and that
> > we are now RFC 2822 (and 5322, for that matter) compliant.
>
> Great! Any chance of this making it into a patch for 2.1.15 ...?
>
> > The only thing that continuation_ws is used for now is when
> > doing line wrapping on RFC 2047 tokens. (And it now defaults
> > to a space, which may or may not be optimal, but certainly works.)
>
> I thought the issue is that there should be no character at all: 2822
> says to perform folding you simply insert a CRLF *before any whitespace*
> so that unfolding is simply a matter of removing the CRLF.
>
> Maybe an example would help. Here's a header line:
>
> Subject: This is overkill to fold, but legal
>
> Here's one way to fold it:
>
> Subject: This is overkill
>  to fold, but legal
>
> Because the space between "overkill" and "to" is valid whitespace, it's
> also valid as a signal that the second line is a continuation of the
> first. You don't have to insert another space (or as it does presently,
> a tab!), you just have to insert CRLF. Likewise on the way out, just
> remove the CRLF.

Right, and the Python3 email package does exactly that. I don't remember exactly which fix went into which version, but I remember Barry changed tab to space in 3.1, and I just checked and it looks like I made the fix that preserves the existing whitespace instead of using continuation_ws in 3.2 (I rewrote and simplified the old wrapping algorithm).

The one place continuation_ws is still used is when you feed Header.append a non-ASCII string that would result in a line longer than the maximum line length. In that case, RFC 2047 instructs us to break up the encoded word, inserting whitespace between the pieces, such that no line containing encoded words is longer than 76 characters (not counting the CR/LF). So this is the one place where we are *required* to insert whitespace, which we are then required to remove when decoding. And we still do this; but as I said, that's the only place left where we actually use the value of continuation_ws. Everywhere else we just insert CR/LF as needed in front of existing whitespace[*].

--David

[*] There's a caveat in the comments about this: because the pre-3.3 code had no real idea of the syntax of the headers, it may theoretically choose to insert a CR/LF in front of whitespace that is not actually legal folding whitespace. This is, however, very unlikely, since there is very little (if any?) whitespace that is *not* legal folding whitespace.
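
For anyone who wants to poke at the two cases discussed above, here is a rough sketch. The fold() and unfold() helpers are hypothetical illustrations of the "insert CRLF in front of existing whitespace" rule, not the email package's actual wrapping code, and the 25-character limit is chosen only to force a fold on the short Subject line from the example. The second half uses the real email.header.Header API to show the RFC 2047 case, where continuation_ws is inserted between the pieces of a split encoded word and dropped again on decoding.

```python
import re
from email.header import Header, decode_header, make_header


def fold(line, maxlen=78):
    """Hypothetical helper: fold by inserting CRLF in front of existing whitespace."""
    pieces = []
    rest = line
    while len(rest) > maxlen:
        # Rightmost space or tab that keeps this piece within maxlen.
        # Searching from index 1 avoids emitting an empty piece.
        cut = max(rest.rfind(" ", 1, maxlen + 1), rest.rfind("\t", 1, maxlen + 1))
        if cut == -1:
            break  # no whitespace to fold at; leave the rest as one long line
        pieces.append(rest[:cut])
        rest = rest[cut:]  # the original whitespace opens the continuation line
    pieces.append(rest)
    return "\r\n".join(pieces)


def unfold(folded):
    """Hypothetical helper: unfold by removing each CRLF followed by a space or tab."""
    return re.sub(r"\r\n(?=[ \t])", "", folded)


# Case 1: plain folding. Nothing is added except the CRLF itself.
subject = "Subject: This is overkill to fold, but legal"
folded = fold(subject, maxlen=25)  # tiny limit, only to force a fold

# Case 2: a long non-ASCII value. The encoded word has to be split, and
# continuation_ws is what gets inserted between the pieces.
# (Escapes spell "Grüße aus München"; the sample text is arbitrary.)
text = " ".join(["Gr\u00fc\u00dfe aus M\u00fcnchen"] * 6)
encoded = Header(text, charset="utf-8", maxlinelen=76, continuation_ws=" ").encode()

# Decoding drops the whitespace that separated adjacent encoded words.
decoded = str(make_header(decode_header(encoded)))
```

Printing repr(folded) and encoded shows the difference: the first contains only characters that were already in the header plus CRLF, while the second has a continuation_ws character at the start of each continuation line that was not in the original text.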