On 07/24, Stephan wrote:
> I have been setting up a home mail server recently and it seems that I
> cannot get all spam trapped correctly. Example is below for instance:

"All spam"? You may have unrealistic expectations. Although I certainly encourage you to try to do better than what anybody else has managed. Seriously, that's the only way we get better at this. For example, in the ideal case where the email you get exactly matches the email that spamassassin was trained on, in the STATISTICS-set3.txt.gz (network and bayes tests enabled) file included with spamassassin it says: # False positives: 8 0.04% # False negatives: 691 1.57% 1.57% of spam missed. > http://pastebin.com/EBER8iuP > So my question is, what should I do basically to increase the accuracy of > this detection ? Should I change my thresholds ? Manually create a > blacklist ? Add some custum rulesets (I recently added Khopesh's one) It might be useful to tell us exactly what scores you're getting for each test you're hitting, by using "spamassassin -t". Do not lower your threshold below 5. All scores are generated assuming a threshold of 5 with a target of 1 in 2,500 false positives. Lowering your threshold will increase your false positives. Sought is the only other rule set I'd recommend: http://wiki.apache.org/spamassassin/SoughtRules Do you have Pyzor and Razor installed? You could increase the score of BAYES_99 if you trust it. You should check the scores on all your non-spam that hits BAYES_99 and see how much of them would become flagged as spam if you increase that score. I wouldn't recommend that without disabling auto-training bayes ("bayes_auto_learn 0") because that can go wrong (auto-training spam as non-spam and reverse). And keep in mind, if you only have, say, 100 non-spams to base your score change on, you risk increasing your false positives from ~1 in 2,500 to ~1 in 101 or worse. If this is a repeated problem, it might be useful to try coming up with your own custom rule or two. And if they help, please share with this mailing list. 
http://wiki.apache.org/spamassassin/WritingRules

Another possibility is to participate in the nightly mass checks -
submitting your rule hit stats (not emails) to the process which
calculates spamassassin scores:
http://wiki.apache.org/spamassassin/NightlyMassCheck
We always need more of that to increase everybody's accuracy, and of
course it'll increase your accuracy more than those who don't
participate.

I've started a combination IP white + blacklist, which you're welcome to
contribute to: http://www.chaosreigns.com/iprep/
I'm kind of excited about it, but it needs more contributors to really
be useful for non-contributors.

-- 
"Let's just say that if complete and utter chaos was lightning, then
he'd be the sort to stand on a hilltop in a thunderstorm wearing wet
copper armour and shouting 'All gods are bastards'."
 - The Color of Magic
http://www.ChaosReigns.com