On Thu, 20 Jan 2011 12:55:44 -0500, Bob Kline wrote:

> On 1/20/2011 12:23 PM, Carl Banks wrote:
>> On Jan 20, 7:08 am, Bob Kline<bkl...@rksystems.com> wrote:
>>> I just noticed that the following passage in RFC 822:
>>>
>>>     The process of moving from this folded multiple-line
>>>     representation of a header field to its single line
>>>     representation is called "unfolding". Unfolding is
>>>     accomplished by regarding CRLF immediately followed
>>>     by a LWSP-char as equivalent to the LWSP-char.
>>>
>>> is not being honored by the email module. The following two
>>> invocations of message_from_string() should return the same value, but
>>> that's not what happens:
>>>
>>> >>> import email
>>> >>> email.message_from_string("Subject: blah").get('SUBJECT')
>>> 'blah'
>>> >>> email.message_from_string("Subject:\n blah").get('SUBJECT')
>>> ' blah'
>>>
>>> Note the space in front of the second value returned, but missing from
>>> the first. Can someone convince me that this is not a bug?
>> That's correct, according to my reading of RFC 822 (I doubt it's
>> changed so I didn't bother to look up what the latest RFC on that
>> subject is.)
>>
>> The RFC says that in a folded line the whitespace on the following line
>> is considered a part of the line.
>
> Thanks for responding. I think your interpretation of the RFC is the
> same as mine. What I'm saying is that by not returning the same value
> in the two cases above the module is not "regarding CRLF immediately
> followed by a LWSP-char as equivalent to the LWSP-char."
>
That's only a problem if your code cares about the composition of the
whitespace, and caring about that is, IMO, incorrect behaviour. When the
separator between syntactic elements in a header is 'whitespace', it
should not matter what combination of newlines, tabs and spaces makes up
the whitespace element.
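
To illustrate, here is a minimal sketch of comparing header values without
caring how the whitespace is composed. The helper name normalized_header is
made up for this example and is not part of the email module, and collapsing
every whitespace run to a single space is my assumption about what
"equivalent" should mean for an unstructured header such as Subject.

```python
import email


def normalized_header(raw_message, name):
    """Return the named header with each whitespace run collapsed to one space.

    Hypothetical helper for illustration; not part of the email module.
    """
    value = email.message_from_string(raw_message).get(name)
    if value is None:
        return None
    # str.split() with no arguments splits on any run of spaces, tabs,
    # CRs and LFs, and discards leading and trailing whitespace.
    return " ".join(value.split())


# The two invocations from the quoted example, compared after normalizing.
unfolded = normalized_header("Subject: blah", "SUBJECT")
folded = normalized_header("Subject:\n blah", "SUBJECT")
print(unfolded == folded)
```

Compared this way, the folded and unfolded forms give the same value, so it
no longer matters whether the module keeps the leading whitespace from the
continuation line.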
-- 
martin@   | Martin Gregorie
gregorie. | Essex, UK
org       |