Re: question on training spamassassin

Jeff Portwine Mon, 27 Feb 2006 11:53:16 -0800

Hmm... I don't quite understand this. At my company, we forward any spam that gets through to [EMAIL PROTECTED] and any ham marked as spam to [EMAIL PROTECTED]. This was set up long ago, before I even started working here, and the spam filter worked really well. Recently our bayes database was broken and I ended up clearing it and retraining it with old spam and ham. Since that time, a lot of spams that were getting through STOPPED getting through after a couple of days of forwarding them to the spam address, and I haven't seen any false spams. So it seems like it does work for us, but you're saying it shouldn't?

----- Original Message -----
From: "Matt Kettler" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Cc: <users@spamassassin.apache.org>
Sent: Monday, February 27, 2006 2:29 PM
Subject: Re: question on training spamassassin

Webmaster wrote:
> A large number of our clients are using POP.
> If I were to ask them to send false negatives to [EMAIL PROTECTED]
> and false positives to [EMAIL PROTECTED] so I can place them in
> a folder and train, does that hinder the training process in
> any way, knowing that the header info is changed by the
> forwarding process?

Yes. Forwards are more or less completely unusable for training purposes.
However, you might be able to get "forward as attachment" to work, if your
mail client supports it.
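
If "forward as attachment" does work in a given client, the original usually arrives as a message/rfc822 part inside the forwarding message. As a rough sketch only (the function name is made up for illustration, and it assumes the client really attaches the untouched original), the attached messages could be pulled back out with Python's standard email package before training:

```python
from email import message_from_bytes
from email.message import Message


def extract_attached_messages(raw: bytes) -> list[bytes]:
    """Return each message/rfc822 attachment found in `raw` as its own message.

    Hypothetical helper, not part of SpamAssassin. The result is re-serialized
    by the email package, so it should be close to the attached original but
    is not guaranteed to be byte-identical.
    """
    found: list[bytes] = []

    def visit(part: Message) -> None:
        if part.get_content_type() == "message/rfc822":
            # The payload of a message/rfc822 part is a one-item list
            # holding the attached message.
            payload = part.get_payload()
            if isinstance(payload, list) and payload:
                found.append(payload[0].as_bytes())
            return  # do not descend into the attached message itself
        if part.is_multipart():
            for child in part.get_payload():
                visit(child)

    visit(message_from_bytes(raw))
    return found
```

Each returned message could then be written to its own file in a spam or ham folder and learned with sa-learn --spam or sa-learn --ham. The wrapper message itself should not be trained on.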


The problem with forwards is twofold.

First, the headers are completely destroyed. This is a major problem for
SpamAssassin's bayes engine, which studies headers.

Second, it is not only the header info that changes. The body gets completely
redone. Mail clients typically add text to the top, and then re-encode the
body text all over.

If the original was base-64 encoded, the forward may not be.

If the original was multipart/alternative with a text/plain and a text/html,
the forward might drop the text/plain, and create a new one based on the
content of the text/html section.

As far as spam tools are concerned, these messages bear little resemblance to
one another.
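
To see the difference on a real pair of messages, something like the following sketch could be used. It only relies on Python's standard email package; both function names are hypothetical, and it says nothing about how SpamAssassin tokenizes mail internally. It simply lists which header names went missing and how the MIME parts and transfer encodings changed.

```python
from email import message_from_bytes


def missing_headers(original: bytes, forward: bytes) -> list[str]:
    """Header names (lowercased) present in the original but not in the forward."""
    orig_names = {name.lower() for name in message_from_bytes(original).keys()}
    fwd_names = {name.lower() for name in message_from_bytes(forward).keys()}
    return sorted(orig_names - fwd_names)


def mime_outline(raw: bytes) -> list[tuple[str, str]]:
    """List (content type, transfer encoding) for every part of a message."""
    outline = []
    for part in message_from_bytes(raw).walk():
        # A part with no Content-Transfer-Encoding header is treated as 7bit,
        # which is the MIME default.
        cte = (part.get("Content-Transfer-Encoding") or "7bit").lower()
        outline.append((part.get_content_type(), cte))
    return outline
```

Running both functions on a saved original and on the same mail after a plain forward should show the kind of changes described above, for example lost Received headers or base64 parts that became plain text.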
