------------------------------------------------------------
revno: 1188
committer: Mark Sapiro <msap...@value.net>
branch nick: 2.2
timestamp: Thu 2011-10-13 21:09:02 -0700
message:
  The fix for BUG #266220 (sf1181161) has been enhanced so that if there
  is a pathological HTML part such that the Approved: password text isn't
  found, but it is found after stripping out HTML tags, the post is
  rejected with an informative message.
modified:
  Mailman/Handlers/Approve.py
  NEWS
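
The decision described in the message above can be sketched in isolation.
This is a minimal illustration, not the handler itself: scrub_approved and
the local RejectMessage class are hypothetical stand-ins (the real code
raises Errors.RejectMessage and rewrites the part with reset_payload), and
it is written to run on a current Python rather than the Python 2 that
Mailman 2.x targets. The pattern and the tag-stripping expression are the
ones used in the diff.

```python
import re


class RejectMessage(Exception):
    """Hypothetical stand-in for Mailman's Errors.RejectMessage."""


def scrub_approved(text, passwd, name='Approved'):
    """Remove the 'Approved: <password>' text from a decoded text part.

    Returns the cleaned text, or the text unchanged when the pattern is
    absent. Raises RejectMessage when the pattern only shows up after HTML
    tags are stripped, because it cannot then be removed safely.
    """
    # Same pattern as the handler: the header name, optional whitespace
    # (plain, no-break space, or the &nbsp; entity), then the password.
    pattern = name + r':(\xA0|\s|&nbsp;)*' + re.escape(passwd)
    if re.search(pattern, text):
        return re.sub(pattern, '', text)
    # Drop anything that looks like a tag; (?s) lets a tag span lines.
    stripped = re.sub(r'(?s)<.*?>', '', text)
    if re.search(pattern, stripped):
        raise RejectMessage('Approved: password is split by HTML markup')
    return text
```

For example, '<p>Approved: <b>sec</b>ret</p>' with the password 'secret'
does not match directly, but does match once the tags are gone, so the
sketch raises instead of leaving the password in the delivered HTML.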


--
lp:mailman/2.2
https://code.launchpad.net/~mailman-coders/mailman/2.2

=== modified file 'Mailman/Handlers/Approve.py'
--- Mailman/Handlers/Approve.py	2011-04-25 23:51:10 +0000
+++ Mailman/Handlers/Approve.py	2011-10-14 04:09:02 +0000
@@ -39,6 +39,16 @@
 
 NL = '\n'
 
+def _(s):
+    # message is translated when used.
+    return s
+REJECT = _("""Message rejected.
+It appears that this message contains an HTML part with the
+Approved: password line, but due to the way it is coded in the
+HTML it can't be safely removed.
+""")
+del _
+
 
 
 def process(mlist, msg, msgdata):
@@ -100,7 +110,8 @@
             # text part.  We make a pattern from the Approved line and delete
             # it from all text/* parts in which we find it.  It would be
             # better to just iterate forward, but email compatability for pre
-            # Python 2.2 returns a list, not a true iterator.
+            # Python 2.2 returns a list, not a true iterator.  Also, there
+            # are pathological MUAs that put the HTML part first.
             #
             # This will process all the multipart/alternative parts in the
             # message as well as all other text parts.  We shouldn't find the
@@ -111,12 +122,18 @@
             # line of HTML or other fancy text may include additional message
             # text.  This pattern works with HTML.  It may not work with rtf
             # or whatever else is possible.
+            #
+            # If we don't find the pattern in the decoded part, but we do
+            # find it after stripping HTML tags, we don't know how to remove
+            # it, so we just reject the post.
             pattern = name + ':(\xA0|\s|&nbsp;)*' + re.escape(passwd)
             for part in typed_subpart_iterator(msg, 'text'):
                 if part is not None and part.get_payload() is not None:
                     lines = part.get_payload(decode=True)
                     if re.search(pattern, lines):
                         reset_payload(part, re.sub(pattern, '', lines))
+                    elif re.search(pattern, re.sub('(?s)<.*?>', '', lines)):
+                        raise Errors.RejectMessage, REJECT
     if passwd is not missing and mlist.Authenticate((mm_cfg.AuthListPoster,
                                                      mm_cfg.AuthListModerator,
                                                      mm_cfg.AuthListAdmin),

=== modified file 'NEWS'
--- NEWS	2011-10-04 21:52:22 +0000
+++ NEWS	2011-10-14 04:09:02 +0000
@@ -127,6 +127,11 @@
 
   Bug Fixes and other patches
 
+    - The fix for BUG #266220 (sf1181161) has been enhanced so that if there
+      is a pathological HTML part such that the Approved: password text isn't
+      found, but it is found after stripping out HTML tags, the post is
+      rejected with an informative message.
+
     - A bug that would cause reset of any new_member_options bits other than
       the four displayed as checkboxes on the list admin General Options page
       whenever the page was updated or bin/config_list attempted to update
