Hmm, I don't quite understand this. At my company, we forward any spam
that gets through to [EMAIL PROTECTED] and any ham marked as spam to
[EMAIL PROTECTED]. This was set up long before I started working here,
and the spam filter worked really well. Recently our bayes database
broke, and I ended up clearing it and retraining it with old spam and
ham. Since then, a lot of spam that had been getting through STOPPED
getting through after a couple of days of forwarding it to the spam
address, and I haven't seen any false positives. So it seems like it
does work for us, but you're saying it shouldn't?
----- Original Message -----
From: "Matt Kettler" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Cc: <users@spamassassin.apache.org>
Sent: Monday, February 27, 2006 2:29 PM
Subject: Re: question on training spamassassin
Webmaster wrote:
A large number of our clients are using POP.
If I were to ask them to send false negatives to [EMAIL PROTECTED]
and false positives to [EMAIL PROTECTED] so I can place them in
a folder and train, does that hinder the training process in
any way, given that the header info is changed by the
forwarding process?
Yes... Forwards are more-or-less completely unusable for training purposes.
However, you might be able to get "forward as attachment" to work, if your
mail client supports it.
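When a client does "forward as attachment", the original usually travels as a
message/rfc822 part inside the forward. As a rough sketch of how that part
might be pulled back out before training, assuming Python's standard email
package (the function name extract_attached_originals is made up for
illustration, and the result is a re-serialized copy, not a guaranteed
byte-for-byte original):

```python
import email


def extract_attached_originals(raw_forward: bytes) -> list:
    """Return the serialized bytes of each message/rfc822 attachment."""
    # The default (compat32) policy keeps parsed headers as they appeared.
    wrapper = email.message_from_bytes(raw_forward)
    originals = []

    def visit(part):
        if part.get_content_type() == "message/rfc822":
            # A message/rfc822 part holds exactly one nested message.
            inner = part.get_payload(0)
            originals.append(inner.as_bytes())
            return  # do not descend into the original itself
        if part.is_multipart():
            for sub in part.get_payload():
                visit(sub)

    visit(wrapper)
    return originals
```

Each returned item could then be written to its own file and fed to
sa-learn --spam or sa-learn --ham.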
The problem with forwards is twofold.
First, the headers are completely destroyed. This is a major problem for
SpamAssassin's bayes engine, which studies headers.
Second, it's not only the header info that changes. The body gets
completely redone. Mail clients typically add text to the top, and then
re-encode the body text all over again.
If the original was base64 encoded, the forward may not be.
If the original was multipart/alternative with a text/plain and a text/html
part, the forward might drop the text/plain, and create a new one based on
the content of the text/html section.
As far as spam tools are concerned, these messages bear little resemblance
to one another.
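One way to see the difference for yourself is to list the MIME parts of an
original and of its forward side by side. The following is only a sketch
using Python's standard email package. The helper name mime_outline is
hypothetical, and a part with no Content-Transfer-Encoding header is
reported as "7bit", which is the MIME default:

```python
import email


def mime_outline(raw_message: bytes) -> list:
    """List (content type, transfer encoding) for each non-container part."""
    msg = email.message_from_bytes(raw_message)
    outline = []
    for part in msg.walk():
        if part.is_multipart():
            continue  # skip containers such as multipart/alternative
        encoding = str(part.get("Content-Transfer-Encoding", "7bit")).lower()
        outline.append((part.get_content_type(), encoding))
    return outline
```

If the outline of a saved original and the outline of its forward differ,
the forward has been restructured in the way described above.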