RE: Improving sa

Bowie Bailey Mon, 28 Nov 2005 09:07:29 -0800

From: Andy Pieters [mailto:[EMAIL PROTECTED]
> 
> Second, it seems that spamassassin vs spam is nothing less than an
> arms-race, with spamassassin perpetually running behind.


Yes, and the same is true of anti-virus programs.

> 
> As more and more rules are added, doesn't it come to a point where
> deciding if a message is spam or ham takes longer and longer or up
> to a point where spamassassin alone can't handle it anymore?

Probably not.  SA is constantly being improved and we are finding
better and more efficient ways to detect spam.  Also, CPU speed is
constantly going up.  So I don't think we'll get to that point anytime
soon.

> Lastly, I am running spamassassin 3.1 out of the box, that is
> installed the rpm and that's it.
> 
> What can I do to increase effectiveness of spamassassin in
> differentiating spam from ham?  Right now, there's about 10% of all
> messages that come in on a day (4.500) that are wrongly marked as
> ham or spam (10% is not a lot, but still 45 messages each day!)

10% is quite a lot.  That rate would be completely unacceptable in
many places.  A properly configured SpamAssassin installation should
have a failure rate closer to 1% (or lower).
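
For a sense of scale, here is the arithmetic as a small Python sketch.
It assumes the "4.500" quoted above means 4,500 messages a day (my
reading, not something stated outright), and the function name is only
illustrative.

```python
def misclassified_per_day(messages_per_day, error_rate):
    """Return how many messages per day are scored wrong at a given error rate."""
    return round(messages_per_day * error_rate)

# Assumed volume: 4,500 messages per day (reading "4.500" as a thousands separator).
print(misclassified_per_day(4500, 0.10))  # 450 at a 10% failure rate
print(misclassified_per_day(4500, 0.01))  # 45 at a 1% failure rate
```

Under that assumption, 45 wrong messages a day would correspond to
roughly 1%, while a true 10% would be about 450.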

I would suggest a few things.

First, make sure that trusted_networks and internal_networks are set
properly.  If they are empty, SA will make a guess at what they should
be.  Even if this seems to be working, it is always better to set them
manually.  These settings affect quite a bit of the internal workings
of SpamAssassin.
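
As a rough way to check this, a Python sketch like the one below could
scan the text of a config file (local.cf, for example) and report
which of the two settings are not set.  The helper name and the sample
address are made up for illustration; it only looks for the setting
names and does not validate the addresses.

```python
def missing_network_settings(config_text):
    """Return the network settings that are never given a value in config_text."""
    required = ("trusted_networks", "internal_networks")
    seen = set()
    for line in config_text.splitlines():
        # Drop comments (everything after '#') and surrounding whitespace.
        stripped = line.split("#", 1)[0].strip()
        if not stripped:
            continue
        parts = stripped.split()
        # Count a setting only if it is followed by at least one value.
        if parts[0] in required and len(parts) > 1:
            seen.add(parts[0])
    return [name for name in required if name not in seen]

# Hypothetical config contents; the address range is a placeholder.
sample = """
# example settings
trusted_networks 192.168.1.0/24
"""
print(missing_network_settings(sample))  # ['internal_networks']
```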

Second, make sure your network tests are working.  For best results,
enable Razor and DCC.  Network tests will frequently catch spam even
when none of the rules do.

Third, install some of the SARE rulesets from www.rulesemporium.com.
I use almost all of their rules both on my home and business systems.
On my business system, I only use the 0-level versions (hits only
spam).  On my home system, I use the 1-level versions as well (hits a
few hams).

Fourth, if you are using Bayes, make sure you train it with all of the
messages that are scored wrong.  A well-trained Bayes database can be
a BIG help, but a mis-trained Bayes db will cause problems.
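
Retraining is normally done with sa-learn, using --spam for spam that
got through and --ham for false positives.  A minimal Python sketch
that builds those command lines is below; the folder names are
placeholders, and the helper only prints the commands rather than
running them.

```python
def sa_learn_command(kind, path):
    """Build an sa-learn command line for retraining Bayes on misclassified mail."""
    if kind not in ("spam", "ham"):
        raise ValueError("kind must be 'spam' or 'ham'")
    return ["sa-learn", "--" + kind, path]

# Placeholder folders: spam that was missed, and ham that was wrongly flagged.
for kind, folder in (("spam", "missed-spam/"), ("ham", "false-positives/")):
    print(" ".join(sa_learn_command(kind, folder)))
```

To actually run a command, the list could be passed to subprocess.run
on a machine where sa-learn is installed.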

I only see two or three spams a day make it through on the accounts I
monitor.  The main false positives I see are mails from this list with
spammy text or mails with program code that are caught by the
chickenpox rules.

Bowie
