Re: Patches needed to fix tmda-cgi tracebacks

Jason R. Mastaler Tue, 10 Oct 2006 15:23:29 -0700

[EMAIL PROTECTED] (Jason R. Mastaler) writes:

> [EMAIL PROTECTED] (Jason R. Mastaler) writes:
>
>> Bernard Johnson <[EMAIL PROTECTED]> writes:
>>
>>> Header.py.encoding.patch
>>> Required in TMDA 0.91 (maybe later versions as well) to properly process
>>> incorrectly encoded headers.  I'm pretty sure this is required at least
>>> up to 1.0.3 because the code has not changed to fix the issue.
>>> http://www.symetrix.com/projects/tmda/Header.py.encoding.patch
>>
>> The next thing I plan to look at is upgrading the Python email
>> package[¹], probably to version 4.0.1, which supposedly has
>> substantial improvements and bug fixes over what we're using now.
>> Once that's done, we can see if this patch is still necessary. 
>
> Okay, the upgrade is now done.  Just glancing at the new header.py, I
> don't see many differences, so I'll bet this patch is still required.
> Bernard, can I get a sample message that exhibits this problem so I
> can reproduce/test?


Bernard sent me a sample message.  email.header.decode_header() is
raising an email.errors.HeaderParseError exception when tmda-cgi tries
to decode the Subject header (around line 500 in PendList.py).  The
raw Subject string in the message looks like this:

Subject: =?iso-8859-1?B?UmU6IEFXOiBCZXN0ZWxsdW5nIEJhZ3RhZ3MgZvxyIEdvbGYgQ2x1YiBL/HNzbmFjaA?=

But it should look like this:

Subject: =?iso-8859-1?B?UmU6IEFXOiBCZXN0ZWxsdW5nIEJhZ3RhZ3MgZvxyIEdvbGYgQ2x1YiBL/HNzbmFjaA==?=

The problem is that the encoded string is missing its two trailing
padding characters (=).  The encoded text is 66 characters long, and
base64 data must be a multiple of four characters, so two pads are
needed.  RFC 3548 says implementations MUST include appropriate pad
characters at the end of encoded data.  So his mailer is encoding
headers incorrectly.
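
To make the padding issue concrete, here is a minimal sketch that
restores the missing pad characters and decodes the result.  It is
written for Python 3 and uses only the standard base64 module; the
helper name pad_base64 is made up for illustration and is not part of
TMDA or the email package.

```python
import base64


def pad_base64(encoded):
    """Append the '=' characters needed to make the length a multiple of 4."""
    # -len % 4 yields 0, 1, 2 or 3: the number of characters missing.
    return encoded + "=" * (-len(encoded) % 4)


# The encoded text from the sample Subject header, without its padding.
raw = "UmU6IEFXOiBCZXN0ZWxsdW5nIEJhZ3RhZ3MgZvxyIEdvbGYgQ2x1YiBL/HNzbmFjaA"

padded = pad_base64(raw)

# The header declares iso-8859-1, so decode the bytes with that charset.
subject = base64.b64decode(padded).decode("iso-8859-1")
print(subject)
```

This only illustrates why the header fails to parse.  It is not a
suggestion to patch the email package the way
Header.py.encoding.patch does.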

Bernard's Header.py.encoding.patch modifies the Python email lib, but
I don't think that is the correct fix.  I think this should be handled
in tmda-cgi.  tmda-cgi should decide how it wants to handle cases like
this - whether it wants to catch the exception and work around it, or
blow up as it does currently.

I looked at the tmda-cgi code briefly, and wrote a try/except clause
in PendList.py that just returns the raw encoded string if the
decoding fails.  The diff is attached.  Line 242 in View.py needs a
corresponding fix, but that code wasn't as clear to me, so I left it
alone.  So Jim, I guess you'll have to decide whether you want to
address this or not, and if so, how.
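
For anyone who wants to try the fallback idea outside tmda-cgi, here
is a rough standalone sketch of the same policy.  It assumes Python 3
and the lowercase module names (email.header, email.errors).  The
function decode_or_raw, its fallback_charset default, and the decoder
parameter are hypothetical and exist only so the failure path can be
exercised; the real code goes through Unicode.TranslateToUTF8 instead.

```python
from email.errors import HeaderParseError
from email.header import decode_header


def decode_or_raw(raw_value, fallback_charset="iso-8859-1",
                  decoder=decode_header):
    """Return raw_value decoded to text, or raw_value itself if parsing fails."""
    try:
        parts = decoder(raw_value)
    except HeaderParseError:
        # Same policy as the patch: return the undecoded string
        # instead of letting the exception propagate.
        return raw_value

    pieces = []
    for data, charset in parts:
        if isinstance(data, bytes):
            try:
                pieces.append(data.decode(charset or fallback_charset, "strict"))
            except (LookupError, UnicodeError):
                # Unknown or wrong charset: fall back and drop bad bytes.
                pieces.append(data.decode(fallback_charset, "ignore"))
        else:
            # Headers without encoded words come back as plain str.
            pieces.append(data)
    return "".join(pieces)


def always_fails(value):
    """Stand-in decoder that simulates an unparseable header."""
    raise HeaderParseError("simulated failure")


print(decode_or_raw("=?iso-8859-1?B?UmU6?="))
print(decode_or_raw("=?bogus?=", decoder=always_fails))
```

Newer versions of the standard library may tolerate missing padding
on their own, which is why the sketch simulates the error with a
stand-in decoder rather than relying on the sample header to raise.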

Index: PendList.py
===================================================================
--- PendList.py	(revision 2093)
+++ PendList.py	(working copy)
@@ -496,16 +496,20 @@
       if not MsgObj.msgobj["subject"]:
         Subject = "None"
       else:
-        # Decode internationalazed headers
+        # Try to decode internationalized headers
         value = ""
-        for decoded in email.Header.decode_header( MsgObj.msgobj["subject"] ):
-          if decoded[1]:
-            try:
-              value += Unicode.TranslateToUTF8(decoded[1], decoded[0], "strict")
-            except UnicodeError:
+        try:
+          for decoded in email.Header.decode_header( MsgObj.msgobj["subject"] ):
+            if decoded[1]:
+              try:
+                value += Unicode.TranslateToUTF8(decoded[1], decoded[0], "strict")
+              except UnicodeError:
+                value += Unicode.TranslateToUTF8(CharSet, decoded[0], "ignore")
+            else:
               value += Unicode.TranslateToUTF8(CharSet, decoded[0], "ignore")
-          else:
-            value += Unicode.TranslateToUTF8(CharSet, decoded[0], "ignore")
+        except email.errors.HeaderParseError:
+          # just return the undecoded string if we can't decode it
+          value = MsgObj.msgobj["subject"]
         Subject = value
         if len(Subject) > int(PVars[("PendingList", "CropSubject")]):
           Subject = \
@@ -519,16 +523,20 @@
       if not MsgObj.msgobj["from"]:
         From = ""
       else:
-        # Decode internationalazed headers
+        # Try to decode internationalized headers
         value = ""
-        for decoded in email.Header.decode_header( MsgObj.msgobj["from"] ):
-          if decoded[1]:
-            try:
-              value += Unicode.TranslateToUTF8(decoded[1], decoded[0], "strict")
-            except UnicodeError:
+        try:
+          for decoded in email.Header.decode_header( MsgObj.msgobj["from"] ):
+            if decoded[1]:
+              try:
+                value += Unicode.TranslateToUTF8(decoded[1], decoded[0], "strict")
+              except UnicodeError:
+                value += Unicode.TranslateToUTF8(CharSet, decoded[0], "ignore")
+            else:
               value += Unicode.TranslateToUTF8(CharSet, decoded[0], "ignore")
-          else:
-            value += Unicode.TranslateToUTF8(CharSet, decoded[0], "ignore")
+        except email.errors.HeaderParseError:
+          # just return the undecoded string if we can't decode it
+          value = MsgObj.msgobj["from"]
         From = value
         Temp = Address.search(From)
         if Temp:

_________________________________________________
tmda-workers mailing list ([email protected])
http://tmda.net/lists/listinfo/tmda-workers
