On 06/07/11 09:17, Lars Jørgensen wrote:
I think many people run with tag at 5.0 and discard at 10.0

I should have mentioned that we are running amavisd-new. I thought that was the 
de facto way of integrating SpamAssassin into a mail gateway, but reading this 
list reveals that most people probably don't do that. Makes me wonder if I am 
doing the wrong thing?

Amavisd-new has further threshold settings, and these are the ones I put 
in as of today (after reading other people's tips here, thank you everybody):

$sa_tag_level_deflt  = -10;  # add spam info headers if at, or above that level
$sa_tag2_level_deflt = 5.2;  # add 'spam detected' headers at that level
$sa_kill_level_deflt = 6.2;  # triggers spam evasive actions (e.g. blocks mail)
$sa_dsn_cutoff_level = 7.4;  # spam level beyond which a DSN is not sent

Do the above scores make sense?



Yes, those make perfect sense to other amavisd-new users. I currently tag at 5.0 (SA's default required_score) and quarantine at 6.0. I also set the DSN cut-off level to the same value as the quarantine level, as I don't want to send DSNs.
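
For reference, those thresholds would look roughly like this in amavisd.conf. This is only a sketch of the three values mentioned above, using the same variable names as the quoted settings; it is not a complete configuration, and $sa_tag_level_deflt is left out because I have not stated a value for it here.

```perl
# Sketch only: thresholds as described in the paragraph above.
$sa_tag2_level_deflt = 5.0;  # add 'spam detected' headers at SA's default threshold
$sa_kill_level_deflt = 6.0;  # spam evasive action (here: quarantine) at this level
$sa_dsn_cutoff_level = 6.0;  # same as the kill level, so DSNs are not sent for that mail
```

What actually happens to mail at the kill level (quarantine, discard, bounce) depends on other amavisd-new settings such as $final_spam_destiny and $spam_quarantine_to, which are not shown and should be checked against your own setup.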

If you are finding that spam is getting through untagged at the default SA threshold of 5.0, then I would write some additional rules to target the spam that is getting through rather than lowering the threshold below the SA default. This list can help you with that if you provide examples.

Additionally, I have very carefully hand-trained Bayes with only confirmed spam/ham and tweaked the scores to better reflect the faith I have in my Bayes data. I find many cases where Bayes alone will identify spam, and I have scored BAYES_99 accordingly.

The main "problem" I see with SA is that I reject all the easy spam (>90%) at the SMTP level, so SA only really gets to see the more difficult and less obvious stuff. If SA saw all spam then the detection rates out of the box would be extremely high, but with only the more difficult samples to chew on, the measured detection rates inevitably drop and look artificially low. As a purely illustrative example, if 90% of spam is rejected at SMTP and SA catches 90% of the remainder, then 99% of all spam is stopped even though SA's own rate appears to be only 90%. As a result it can appear that a lot of spam is getting through when in reality the overall percentage is still really small. That last 1% is just hard to catch without increasing the risk of false positives.

