SpamAssassin's Bayesian filtering sucks.

Ann Barcomb Wed, 04 Apr 2007 18:10:02 +0200 (CEST)

First I was using 'sa-learn' to mark everything in the suspected-spam folder
as spam.  Then I deleted the spam, and ran 'sa-learn' again with --ham to
reset the values for the good messages.  According to the manual pages,
the most recent setting assigned to a message is the one which is used.
I also set my configuration to give the Bayesian results a lot of weight.


I was still getting about 10 false positives a day and about 10 false
negatives.  A friend told me that he'd been using it in a similar way,
and that when he threw away his database, and started over, and stopped
even temporarily marking letters incorrectly, his results improved.

So I tried that.  I threw away the database, and for the last two
months, I've told it nothing but the truth.  It still refuses to accept
that mail from one mailing list (not this one) isn't spam, despite the
fact that it has a very distinctive subject line (the From header changes,
so I cannot whitelist it).  So distinctive, in fact, that I finally wrote
a procmail rule to move it away before SpamAssassin could touch it.
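
For illustration, a rule of that kind might look roughly like the sketch
below. The subject tag "[example-list]" and the folder name are hypothetical
stand-ins, not the real ones, and it assumes SpamAssassin is invoked from
the same .procmailrc further down.

```
# Hypothetical: file anything whose Subject carries the list tag,
# before the SpamAssassin recipe below ever sees it.
:0:
* ^Subject:.*\[example-list\]
example-list

# Everything else is piped through SpamAssassin as usual.
:0fw: spamassassin.lock
| spamassassin
```

Because procmail stops at the first delivering recipe that matches, the
order of the two recipes is what keeps the list mail out of the filter.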

Of course, what I really hate are the people who buy products from
spammers and make the whole cycle of spamming continue.  Someone [*]
should sell them cyalis arsenic.

People.

But that's another list, which is a shame, because I could spend quite
a while bitching about my bank's internet banking.

- A

[*] Not me.
