On Thu, Feb 19, 2004 at 03:44:05PM -0800, jdow wrote:
> From: "Jonathan Tai" <[EMAIL PROTECTED]>
> On Thu, 2004-02-19 at 14:36, jdow wrote:
>
> I saw that one. I'm trying to minimize my admin time by automating it
> if possible. It seems like this should be possible. (Push comes to
> shove I toss together a C program to strip the excess headers.)
>
> "If it comes from jdow to jdowspam on the internal net and has a
> forwarded email in it then strip off the forwarding cruft and feed
> it through sa-learn rather than spamc. Then drop it on the floor."
>
> I'm just hoping that someone's been crazy enough to do something like
> this already. In principle it should not be particularly hard to strip
> off a level of mime before farming it out.
>
> {^_^}

The problem is that Outlook forwards only some of the headers, and it
changes most of those. Date: becomes Sent: and the contents of From:
are mangled. So far the only header I've seen survive forwarding intact
is Subject:. As I understand it, you really need all of the original,
unmodified headers to train via sa-learn, since it tokenizes the headers
as well as the body.

I'm using SpamAssassin via MailScanner. I haven't come up with a
totally automated solution yet (and I'm not sure I can, really).
But MailScanner has an option to archive all mail in its original,
pristine state. I have users forwarding false negatives to a mailbox.
From that I'm grepping out the forwarded Subject: lines and piping them
via xargs to a grepmail on the archive. grepmail spits all matching
messages into an mbox, which I then weed through with mutt to delete any
that aren't spam (so far I haven't seen any, but it is definitely
possible). Finally, I feed that mbox to sa-learn.
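
The steps above could be sketched roughly like this in Python. This is
only an illustration, not the actual grep/xargs/grepmail commands: it
assumes both the report mailbox and the archive are plain mbox files
(a MailScanner archive may well be laid out differently), and the
function names and paths are made up for the example.

```python
import mailbox
import re

# A "Subject:" line left inside a forwarded body, optionally quoted with ">".
SUBJECT_LINE = re.compile(r"^[ \t>]*Subject:[ \t]*(.+)$", re.MULTILINE)


def forwarded_subjects(text):
    """Return the set of Subject: values found in a forwarded message body."""
    return {m.group(1).strip() for m in SUBJECT_LINE.finditer(text)}


def body_text(msg):
    """Join the text/plain parts of a message (hypothetical helper)."""
    parts = []
    for part in msg.walk():
        if part.get_content_type() == "text/plain":
            payload = part.get_payload(decode=True)
            if payload is not None:
                # latin-1 never fails to decode; good enough for matching.
                parts.append(payload.decode("latin-1"))
    return "\n".join(parts)


def collect_matches(reports_path, archive_path, out_path):
    """Copy archived messages whose Subject was reported into out_path.

    Returns the number of messages copied. Subjects are compared as
    plain strings, so no regex is built from the subject text.
    """
    reports = mailbox.mbox(reports_path)
    wanted = set()
    for msg in reports:
        wanted |= forwarded_subjects(body_text(msg))
    reports.close()

    archive = mailbox.mbox(archive_path)
    out = mailbox.mbox(out_path)
    copied = 0
    for msg in archive:
        if str(msg["Subject"] or "").strip() in wanted:
            out.add(msg)
            copied += 1
    out.close()  # close() flushes pending writes to disk
    archive.close()
    return copied
```

The output mbox would still need the manual pass in mutt before going
to sa-learn (something like `sa-learn --spam --mbox <file>`), since a
subject match alone doesn't prove a message is spam.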

I actually end up feeding more than the reported number of false
negatives, since often many people received spam with the same subject
but either aren't having their mail spam-scanned yet or didn't bother
to report the false negative.

This actually kind of sucks, but it's the best I've come up with so far.
I've run into some tricky issues with subjects that contain special
regex characters like ( ) ! ? . and so on. I'm not really sure where I'm
going to go from here with this, but maybe it gives you some ideas ...
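
One possible way around the special-character problem is to escape the
subject before using it as a pattern, so every character is matched
literally. A small Python sketch, where the sample subject and the
`subject_pattern` helper are made up for illustration:

```python
import re


def subject_pattern(subject):
    """Build a regex that matches a Subject: header line literally."""
    # re.escape() backslash-escapes characters such as ( ) ? . so they
    # lose their special regex meaning.
    return re.compile(
        r"^Subject:[ \t]*" + re.escape(subject) + r"[ \t]*$",
        re.MULTILINE,
    )


headers = "From: a@example.com\nSubject: Hot deal (50% off)!? Act now...\nDate: x\n"
subject = "Hot deal (50% off)!? Act now..."

# Used raw, the parentheses become a group and the literal "(" is never matched.
raw_hit = re.search(subject, headers)

# Escaped, the same subject matches the header line as written.
escaped_hit = subject_pattern(subject).search(headers)
```

Here `raw_hit` is `None` while `escaped_hit` is a match object. Comparing
subjects as plain strings, with no regex at all, sidesteps the issue
entirely where the tool allows it.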
-Eric Rz.