From: Andy Pieters [mailto:[EMAIL PROTECTED]
>
> Second, it seems that spamassassin vs spam is nothing less than an
> arms-race, with spamassassin perpetually running behind.

Yes, and the same is true of anti-virus programs.

> As more and more rules are added, doesn't it come to a point where
> deciding if a message is spam or ham takes longer and longer or up
> to a point where spamassassin alone can't handle it anymore?

Probably not. SA is constantly being improved, and we are finding
better and more efficient ways to detect spam. Also, CPU speed is
constantly going up, so I don't think we'll get to that point anytime
soon.

> Lastly, I am running spamassassin 3.1 out of the box, that is
> installed the rpm and that's it.
>
> What can I do to increase effectiveness of spamassassin in
> differentiating spam from ham? Right now, there's about 10% of all
> messages that come in on a day (4.500) that are unjustly marked as
> ham or spam (10% is not a lot, but still 45 messages each day!)

10% is quite a lot. That rate would be completely unacceptable in many
places. A properly configured SpamAssassin installation should have a
failure rate closer to 1% (or lower).

I would suggest a few things.

First, make sure that trusted_networks and internal_networks are set
properly. If they are empty, SA will make a guess at what they should
be. Even if the guess seems to be working, it is always better to set
them manually. These settings affect quite a bit of the internal
workings of SpamAssassin.

Second, make sure your network tests are working. For best results,
enable Razor and DCC. Network tests will frequently catch spam even
when none of the rules do.

Third, install some of the SARE rulesets from www.rulesemporium.com.
I use almost all of their rules on both my home and business systems.
On my business system, I only use the 0-level versions (hits only
spam). On my home system, I use the 1-level versions as well (hits a
few hams).

Fourth, if you are using Bayes, make sure you train it with all of the
messages that are scored wrong. A well-trained Bayes database can be a
BIG help, but a mis-trained Bayes db will cause problems.

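As a rough sketch of the first, second, and fourth suggestions, a local.cf fragment might look something like the one below. The 192.0.2.0/24 range is only a placeholder and has to be replaced with your own relays and networks. I am also assuming that the Razor and DCC clients are already installed on the machine, and that the two loadplugin lines are not already present (uncommented) in a .pre file such as v310.pre, which is where they would normally live on a 3.1 install.

```
# Placeholder addresses: substitute your own mail relays and networks.

# Hosts whose Received headers SA may trust not to be forged
# (your own relays, plus any forwarders you trust).
trusted_networks 192.0.2.0/24

# The subset of trusted hosts that act as MX or internal relays
# for your domain.
internal_networks 192.0.2.0/24

# Network tests. The plugins must be loaded before their options are
# recognised; skip these two lines if a .pre file already loads them.
loadplugin Mail::SpamAssassin::Plugin::Razor2
loadplugin Mail::SpamAssassin::Plugin::DCC
use_razor2 1
use_dcc 1

# Bayes. This only helps if the database is trained on the messages
# that were scored wrong.
use_bayes 1
```

After editing, running `spamassassin --lint` should report any syntax problems, and misclassified mail can be fed back to Bayes with `sa-learn --spam` and `sa-learn --ham`.
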
I only see two or three spams a day make it through on the accounts I
monitor. The main false positives I see are mails from this list with
spammy text or mails with program code that are caught by the
chickenpox rules.

Bowie