On Mon, Dec 12, 2005 at 09:48:24AM -0800, Bob Miller wrote:
> > So the question is, how do I match on the condition of that two-line
> > breakage and, having done so, how do I fix the broken charset before
> > writing the message to the destination folder?
> >
> > Content-type: text/plain; charset="US-ASCII"
> > Content-Transfer-Encoding: quoted-printable
>
> Use the "f" flag to filter your mail through an external program
> that handles multiline regexps.
Ahh, so we will forward some mail that doesn't need fixing along with some
that does, but we'll trust the external program to be smarter. At my mail
volume, this is reasonable. (I get a few hundred messages per day, that's
all, and 60% of that is junked automatically by spam rules.)
> fix_aol_charset.py looks like this.
>
> #!/usr/bin/python
>
> import re, sys
>
> pattern = r'(content-type:\s*text/plain;\scharset=")us-ascii("\n'
> pattern += r'content-transfer-encoding: quoted-printable)'
> pattern = re.compile(pattern, re.IGNORECASE | re.MULTILINE)
> fix = lambda msg: pattern.sub(r'\1windows-1252\2', msg)
>
> sys.stdout.write(fix(sys.stdin.read()))
I am not familiar with r'' in Python, since most of what I do has to work
as far back as 1.5 or 2.0. What's the deal with that? =) I suppose it
would also be possible here to fix the email using some Python MIME
library to make it more robust, but I imagine I don't really have a need
to do that.
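For what it's worth, here is a rough sketch of what the MIME-library
version might look like. It assumes the standard "email" package in a
current Python 3, so it would not run on 1.5 or 2.0. The function name
fix_aol_charset is just something I made up for illustration, and it only
relabels parts that match the same two-header condition as the regex.

```python
import email


def fix_aol_charset(raw_message):
    """Relabel text/plain parts marked US-ASCII + quoted-printable
    as windows-1252. Sketch only; the function name is hypothetical."""
    msg = email.message_from_string(raw_message)
    # walk() visits the message itself and every MIME subpart.
    for part in msg.walk():
        charset = part.get_param('charset')
        encoding = part.get('Content-Transfer-Encoding', '')
        if (part.get_content_type() == 'text/plain'
                and isinstance(charset, str)
                and charset.lower() == 'us-ascii'
                and encoding.strip().lower() == 'quoted-printable'):
            # Rewrites only the charset parameter of Content-Type.
            part.set_param('charset', 'windows-1252')
    return msg.as_string()
```

To use it as a filter, the script would end the same way
fix_aol_charset.py does, writing fix_aol_charset(sys.stdin.read()) to
sys.stdout. One caveat: this re-serialises the whole message, so header
order and formatting may change, and as far as I know a leading mbox
"From " line is not written back by default. The regex version leaves
everything else byte-for-byte alone, which is one more reason to stick
with it.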
> The three condition lines in .procmailrc are an optimization to invoke
> the external program only on messages that *probably* have the charset
> problem. Feel free to give them more forgiving regexps -- I just
> copied the text from your mail.
No, the regex used above is fine, I think, though I might tighten up the
AOL check a little.
--
"We are what we repeatedly do. Excellence, therefore, is not an act,
but a habit."
-- Aristotle