Ralf Schlatterbeck <r...@runtux.com> added the comment:

Fine, I see what you mean. This involves a very careful reading of the
RFC, and it could have been a little more verbose ...

Right. Should have been a ')'

> Adding the RFC tests would be great (patches gladly accepted).  Fixes
> for ones we fail would be great, too, but at the very least we can
> mark them as expected failures.  I don't usually like adding tests
> that we expect to fail, but in the case of externally defined tests
> such as the RFC examples I think it is worthwhile, so that we can
> check in a complete test set.

Patch attached (against current tip, 74241:120a79b8bb11). We currently
fail *all* of the tests in the RFC because of the same problem, the
closing ')', so I've marked them as expected failures.

I've split the 5th test (with a newline in the string) into two cases,
one with \r\n for the newline and one with only \n. They fail differently.
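
To see how the two newline variants behave on a given interpreter, a
small probe like the following can help. It is only a sketch: the
names CASES and probe are made up for illustration, and the output is
deliberately not shown because it depends on the Python version.

```python
from email.header import decode_header

# The two variants of the 5th RFC 2047 example used in the patch.
CASES = {
    'crlf': '(=?ISO-8859-1?Q?a?=\r\n    =?ISO-8859-1?Q?b?=)',
    'lf': '(=?ISO-8859-1?Q?a?=\n    =?ISO-8859-1?Q?b?=)',
}


def probe(cases=CASES):
    """Return the decode_header() result for each named header value."""
    return {name: decode_header(value) for name, value in cases.items()}


if __name__ == '__main__':
    for name, parts in probe().items():
        # Print each result so the two variants can be compared by eye.
        print(name, parts)
```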

I plan to look into this a little more. My current plan is to make the
outer regex non-greedy (if possible) and remove the trailing whitespace.
That would involve parsing (and ignoring) additional whitespace
*between* encoded words but not at the boundary to a non-encoded word.
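
The whitespace rule described above can be sketched as follows. This
is not the email package's code and not the eventual fix: the names
ENCODED_WORD, decode_q and split_encoded_words are hypothetical, the
pattern matches one encoded word at a time instead of changing the
existing outer regex, and unencoded text is assumed to be plain ASCII.

```python
import base64
import re

# One encoded word: =?charset?encoding?text?=  (simplified pattern)
ENCODED_WORD = re.compile(r'=\?([^?\s]+)\?([bBqQ])\?([^?\s]*)\?=')


def decode_q(text):
    """Decode Q-encoded text: '_' means space, '=XX' is a hex byte."""
    text = text.replace('_', ' ')
    text = re.sub(r'=([0-9A-Fa-f]{2})',
                  lambda m: chr(int(m.group(1), 16)), text)
    return text.encode('latin-1')


def split_encoded_words(header):
    """Return a list of (bytes, charset-or-None) parts."""
    parts = []
    pos = 0
    prev_encoded = False
    for match in ENCODED_WORD.finditer(header):
        gap = header[pos:match.start()]
        # Whitespace is dropped only when it sits between two encoded
        # words; any other gap is kept as unencoded text.
        if gap and not (prev_encoded and gap.isspace()):
            parts.append((gap.encode('ascii'), None))
        charset, encoding, text = match.groups()
        if encoding.lower() == 'q':
            raw = decode_q(text)
        else:
            raw = base64.b64decode(text)
        parts.append((raw, charset.lower()))
        pos = match.end()
        prev_encoded = True
    rest = header[pos:]
    if rest:
        parts.append((rest.encode('ascii'), None))
    return parts
```

With this sketch, whitespace before the unencoded " b)" in the second
RFC example is kept, while the space, double space or folded newline
between two encoded words in the later examples is discarded.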

Any objections or further information?

Ralf
-- 
Dr. Ralf Schlatterbeck                  Tel:   +43/2243/26465-16
Open Source Consulting                  www:   http://www.runtux.com
Reichergasse 131, A-3411 Weidling       email: off...@runtux.com
osAlliance member                       email: r...@osalliance.com

----------
Added file: http://bugs.python.org/file24130/python.patch

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue1079>
_______________________________________
diff -r 120a79b8bb11 Lib/test/test_email/test_email.py
--- a/Lib/test/test_email/test_email.py Tue Jan 03 06:26:13 2012 +0200
+++ b/Lib/test/test_email/test_email.py Tue Jan 03 16:16:09 2012 +0100
@@ -2056,6 +2056,67 @@
         self.assertEqual(decode_header(s),
                         [(b'andr\xe9=zz', 'iso-8659-1')])
 
+    @unittest.expectedFailure
+    def test_rfc2047_rfc2047_1(self):
+        # 1st testcase at end of rfc2047
+        s = '(=?ISO-8859-1?Q?a?=)'
+        self.assertEqual(decode_header(s),
+            [(b'(', None), (b'a', 'iso-8859-1'), (b')', None)])
+
+    @unittest.expectedFailure
+    def test_rfc2047_rfc2047_2(self):
+        # 2nd testcase at end of rfc2047
+        s = '(=?ISO-8859-1?Q?a?= b)'
+        self.assertEqual(decode_header(s),
+            [(b'(', None), (b'a', 'iso-8859-1'), (b' b)', None)])
+
+    @unittest.expectedFailure
+    def test_rfc2047_rfc2047_3(self):
+        # 3rd testcase at end of rfc2047
+        s = '(=?ISO-8859-1?Q?a?= =?ISO-8859-1?Q?b?=)'
+        self.assertEqual(decode_header(s),
+            [(b'(', None), (b'a', 'iso-8859-1'), (b'b', 'iso-8859-1'),
+             (b')', None)])
+
+    @unittest.expectedFailure
+    def test_rfc2047_rfc2047_4(self):
+        # 4th testcase at end of rfc2047
+        s = '(=?ISO-8859-1?Q?a?=  =?ISO-8859-1?Q?b?=)'
+        self.assertEqual(decode_header(s),
+            [(b'(', None), (b'a', 'iso-8859-1'), (b'b', 'iso-8859-1'),
+             (b')', None)])
+
+    @unittest.expectedFailure
+    def test_rfc2047_rfc2047_5a(self):
+        # 5th testcase at end of rfc2047 newline is \r\n
+        s = '(=?ISO-8859-1?Q?a?=\r\n    =?ISO-8859-1?Q?b?=)'
+        self.assertEqual(decode_header(s),
+            [(b'(', None), (b'a', 'iso-8859-1'), (b'b', 'iso-8859-1'),
+             (b')', None)])
+
+    @unittest.expectedFailure
+    def test_rfc2047_rfc2047_5b(self):
+        # 5th testcase at end of rfc2047 newline is \n
+        s = '(=?ISO-8859-1?Q?a?=\n    =?ISO-8859-1?Q?b?=)'
+        self.assertEqual(decode_header(s),
+            [(b'(', None), (b'a', 'iso-8859-1'), (b'b', 'iso-8859-1'),
+             (b')', None)])
+
+    @unittest.expectedFailure
+    def test_rfc2047_rfc2047_6(self):
+        # 6th testcase at end of rfc2047
+        s = '(=?ISO-8859-1?Q?a_b?=)'
+        self.assertEqual(decode_header(s),
+            [(b'(', None), (b'a b', 'iso-8859-1'), (b')', None)])
+
+    @unittest.expectedFailure
+    def test_rfc2047_rfc2047_7(self):
+        # 7th testcase at end of rfc2047
+        s = '(=?ISO-8859-1?Q?a?= =?ISO-8859-2?Q?_b?=)'
+        self.assertEqual(decode_header(s),
+            [(b'(', None), (b'a', 'iso-8859-1'), (b' b', 'iso-8859-2'),
+             (b')', None)])
+
 
 # Test the MIMEMessage class
 class TestMIMEMessage(TestEmailBase):