I just posted a significant patch on issue 11492. After banging my head on the existing header folding algorithm and watching the if/else cases proliferate and break other things, I decided to try a rewrite. It is 70 lines shorter, yet it passes all the existing tests, the new ones I posted to the bug, and some additional ones.

In the new algorithm I'm changing the interpretation of RFC 2822 that it implements. The old algorithm breaks on the 'splitchars' unconditionally, introducing whitespace if there isn't whitespace there already. This seems wrong to me. When 2822 talks about higher-level syntactic breaks, I believe it means only those breaks where FWS (folding whitespace) is present. So the new algorithm breaks only where there is at least one tab or space, but prefers to break after a splitchar when that splitchar is followed by a tab or space.

We still aren't doing it "right", because we aren't paying attention to the real syntax of structured headers, and we might inadvertently break at whitespace that is not legitimate FWS. Those cases should be pretty darn rare, though, and the old algorithm could make the same mistake.

The patch adjusts a few tests that were checking the old line breaking, which failed to break long lines at their whitespace when those lines also contained splitchars. There is even a comment in one of them that says that it is wrong.

Since this fixes bugs and improves RFC compliance, I plan to apply it to 3.2. (As noted in the issue, 3.1 has a test failure I don't understand. Really I ought to figure it out, and perhaps I will before the time comes that I can actually apply the patch.)

diffstat says the header.py portion of the patch is 107 lines added and 178 deleted, so it is a nontrivial change. Reviews welcome.

--David

_______________________________________________
Email-SIG mailing list
Email-SIG@python.org
Your options: http://mail.python.org/mailman/options/email-sig/archive%40mail-archive.com
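
For anyone who wants a concrete picture of the folding rule described in this message, here is a rough sketch. It is not the code in the patch and does not use header.py at all. The function name `fold`, the `maxlen` parameter, and the default splitchars of ';,' are made up for illustration, and the sketch ignores encoded words and structured header syntax entirely. It only shows the two points of the rule: never introduce whitespace, and among the existing tabs and spaces prefer one that follows a splitchar.

```python
WSP = " \t"  # the only places where this sketch is willing to fold


def fold(value, maxlen=78, splitchars=";,"):
    """Fold `value` at existing whitespace only (illustrative, hypothetical helper).

    A break directly after a splitchar is preferred when it fits in `maxlen`.
    Otherwise the last fitting whitespace is used. No whitespace is ever added.
    """
    lines = []
    rest = value
    while len(rest) > maxlen:
        # Skip leading whitespace so we never emit a whitespace-only line.
        first = len(rest) - len(rest.lstrip(WSP))
        candidates = [i for i in range(first + 1, len(rest)) if rest[i] in WSP]
        fitting = [i for i in candidates if i <= maxlen]
        preferred = [i for i in fitting if rest[i - 1] in splitchars]
        if preferred:
            cut = preferred[-1]      # whitespace that follows a splitchar
        elif fitting:
            cut = fitting[-1]        # plain whitespace that still fits
        elif candidates:
            cut = candidates[0]      # nothing fits: first whitespace past the limit
        else:
            break                    # no whitespace left: keep the long line as is
        lines.append(rest[:cut])
        rest = rest[cut:]            # continuation keeps its leading whitespace
    lines.append(rest)
    return "\n".join(lines)


print(fold("aaa; bbb ccc ddd", maxlen=10))
```

With `maxlen=10`, the first break lands after "aaa;" even though "aaa; bbb" would also fit, because the space after the ';' is preferred. A value with splitchars but no whitespace, such as "x,x,x,...", comes back unchanged, which is the behavior difference from the old algorithm that this message is about. One simplification to be aware of: the sketch always takes the last splitchar break that fits, however early in the line it is.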