Hi Mark,

I once used the patch below for Japanese Mailman. Barry rejected this re-generation because it may impose a heavy load (?). The hack should simplify the charset gotcha just below the patched lines. Alternatively, we may have to introduce a new variable in the email.Message.Message class to keep track of whether the payload has been decoded or not.

IMHO, mailing list messages should be plain text without attachments, and those who attach should pay (the load) for it.
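To show what the re-generation does in isolation, here is a minimal sketch. It is not Scrubber code. It assumes the Python 3 email package rather than the 2.x one discussed in this thread, and the helper names regenerate() and is_stable() are made up for illustration. is_stable() only checks whether a message keeps the same payload across a serialize/parse round trip; a message whose in-memory payload disagrees with what gets written out would fail that check, and a regenerated message should pass it.

```python
from email import message_from_string
from email.message import Message

# Sample body taken from the transcript quoted further down in this mail.
TEXT = (
    "How about just a line of stuff with some ==== and a few words.\n"
    "\n"
    "X=91**2 (x is 91 squared)\n"
)


def regenerate(msg):
    """Hypothetical helper: rebuild msg from its own serialized form."""
    return message_from_string(msg.as_string())


def is_stable(msg):
    """Hypothetical check: does a serialize/parse round trip keep the payload?

    Only meaningful for non-multipart messages, where get_payload()
    returns a string rather than a list of sub-messages.
    """
    return regenerate(msg).get_payload() == msg.get_payload()


original = Message()
original["Subject"] = "HTML - all"
# Same call pattern as in Scrubber: payload text plus a charset name.
original.set_payload(TEXT, "iso-8859-1")

rebuilt = regenerate(original)

# Inspect the transfer encoding the rebuilt message carries, whether it
# is stable, and whether decoding gives back the original text.
print(rebuilt["Content-Transfer-Encoding"])
print(is_stable(rebuilt))
print(rebuilt.get_payload(decode=True).decode("iso-8859-1") == TEXT)
```

The cost is also visible here: every call to regenerate() serializes and reparses the whole message, attachments included, which is presumably the load Barry was worried about.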
--- Scrubber.py.orig	Thu Dec  1 10:01:45 2005
+++ Scrubber.py	Thu Dec  1 10:13:17 2005
@@ -28,6 +28,7 @@
 from cStringIO import StringIO
 from types import IntType, StringType
 
+from email import message_from_string
 from email.Utils import parsedate
 from email.Parser import HeaderParser
 from email.Generator import Generator
@@ -313,6 +314,9 @@
 Url : %(url)s
 """), lcset)
         outer = False
+    # Re-generation of message instance from stringfied one.
+    # This should normalize the payloads.
+    msg = message_from_string(msg.as_string())
     # We still have to sanitize multipart messages to flat text because
     # Pipermail can't handle messages with list payloads.  This is a kludge;
     # def (n) clever hack ;).

Mark Sapiro wrote:
> Mark Sapiro wrote:
>
>>I think the fix for the current problem is the following patch -
>>
>>--- mailman-2.1.6/Mailman/Handlers/Scrubber.py
>>+++ mailman-mas/Mailman/Handlers/Scrubber.py
>>@@ -376,9 +376,8 @@
>>     # Now join the text and set the payload
>>     sep = _('-------------- next part --------------\n')
>>     del msg['content-type']
>>-    msg.set_payload(sep.join(text), charset)
>>     del msg['content-transfer-encoding']
>>-    msg.add_header('Content-Transfer-Encoding', '8bit')
>>+    msg.set_payload(sep.join(text), charset)
>>     return msg
>
>
> I still think this is the correct fix, but it turns out there are some
> tricky issues here that I believe come down to an error in the
> set_payload() method.
>
> Under certain circumstances, in particular when charset is 'iso-8859-1',
>
> msg.set_payload(text, charset)
>
> 'apparently' encodes the text as quoted-printable and adds a
>
> Content-Transfer-Encoding: quoted-printable
>
> header to msg.
> I say 'apparently' because if one prints msg or creates
> a Generator instance and writes msg to a file, the message is
> printed/written as a correct, quoted-printable encoded message, but
>
> text = msg._payload
> or
>
> text = msg.get_payload()
>
> gives the original text, not quoted-printable encoded, and
>
> text = msg.get_payload(decode=1)
>
> gives a quoted-printable decoding of the original text which is munged
> if the original text included '=' in some ways.
>
> This is a problem for Mailman because if Scrubber is processing
> individual messages, the 'apparently' quoted-printable message gets
> passed ultimately to SMTPDirect which calls Decorate, and Decorate
> does msg.get_payload(decode=1) when adding the header and/or footer
> and can mung the message in the process.
>
> There is also an issue with archiving when the archiver gets a
> multipart message which is subsequently flattened by Scrubber.
>
> The following is a transcript of a Python interactive session that
> illustrates the above problems with set_payload() and get_payload().
> This session is with Python 2.4.1, but exactly the same behavior
> occurs with 2.3.4 and 2.4.2.
>
> Python 2.4.1 (#1, May 27 2005, 18:02:40)
> [GCC 3.3.3 (cygwin special)] on cygwin
> Type "help", "copyright", "credits" or "license" for more information.
>
> >>> import email
> >>>
> >>> msg = email.message_from_file(open('plain2.eml'))
> >>>
> >>> print msg
>
> From nobody Mon Nov 28 09:18:41 2005
> From: "Mark Sapiro" <[EMAIL PROTECTED]>
> To: [EMAIL PROTECTED]
> Subject: HTML - all
> Date: Sun, 27 Nov 2005 09:02:33 -0800
> MIME-Version: 1.0
> Content-Type: text/plain; charset="iso-8859-1"
>
>
> How about just a line of stuff with some ==== and a few words.
>
> X=91**2 (x is 91 squared)
>
>
> >>> del msg['content-type']
> >>> del msg['content-transfer-encoding']
> >>> msg.set_payload(str(msg.get_payload()), 'iso-8859-1')
> >>>
> >>> print msg
>
> From nobody Mon Nov 28 09:18:41 2005
> From: "Mark Sapiro" <[EMAIL PROTECTED]>
> To: [EMAIL PROTECTED]
> Subject: HTML - all
> Date: Sun, 27 Nov 2005 09:02:33 -0800
> MIME-Version: 1.0
> Content-Type: text/plain; charset="iso-8859-1"
> Content-Transfer-Encoding: quoted-printable
>
>
> How about just a line of stuff with some =3D=3D=3D=3D and a few words.
>
> X=3D91**2 (x is 91 squared)
>
>
> >>> print msg.get_payload()
>
>
> How about just a line of stuff with some ==== and a few words.
>
> X=91**2 (x is 91 squared)
>
>
> >>> print msg.get_payload(decode=1)
>
>
> How about just a line of stuff with some == and a few words.
>
> X`**2 (x is 91 squared)
>

-- 
Tokio Kikuchi, tkikuchi@ is.kochi-u.ac.jp
http://weather.is.kochi-u.ac.jp/
_______________________________________________
Mailman-Developers mailing list
Mailman-Developers@python.org
http://mail.python.org/mailman/listinfo/mailman-developers