Re: Tag Level for spam

Matt Kettler Tue, 15 May 2007 17:52:41 -0700

Martin Hochreiter wrote:
> Hi!
>
> Is there something like a recommended tag level when to treat a mail
> as spam?
>
> (I actually use 1.7 as tag level for amavis/spamassassin)
>


5.0 is the "recommended" default. At this level, SA is tuned to treat false
positives (nonspam tagged as spam) as roughly 100 times worse than false
negatives (spam that isn't tagged).

Lowering the threshold will reduce the false negatives, thus catching
more spam, but will also increase your false positive rate.

If you look at the STATISTICS*.txt files, you can see what kind of
effects lowering the threshold should have on these numbers.

For example, set3 (bayes and network tests enabled) on SA 3.2:

http://svn.apache.org/repos/asf/spamassassin/branches/3.2/rules/STATISTICS-set3.txt

Shows these numbers for 5.0:

# SUMMARY for threshold 5.0:
# Correctly non-spam:  67508  99.94%
# Correctly spam:     117303  98.51%
# False positives:        42  0.06%
# False negatives:      1780  1.49%

But these for 2.0:

# SUMMARY for threshold 2.0:
# Correctly non-spam:  66745  98.81%
# Correctly spam:     118903  99.85%
# False positives:       805  1.19%
# False negatives:       180  0.15%


Note that at 2.0, the number of missed spams has gone down by a factor
of almost 10, from 1780 to 180. However, the number of false positives
has increased by a factor of more than 19, from 42 to 805.
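
If you want to check the arithmetic yourself, here is a minimal Python sketch
using only the counts quoted above. The function names and the FP_WEIGHT
constant are my own illustration, not anything that ships with SA; FP_WEIGHT
simply assumes the "roughly 100 times worse" weighting as a rough cost
measure.

```python
# Counts copied from the two STATISTICS-set3.txt summaries quoted above.
summaries = {
    5.0: {"ham_ok": 67508, "spam_ok": 117303, "fp": 42, "fn": 1780},
    2.0: {"ham_ok": 66745, "spam_ok": 118903, "fp": 805, "fn": 180},
}

# Assumption: one false positive costs about as much as 100 false negatives.
FP_WEIGHT = 100


def rates(s):
    """Return (false positive %, false negative %) for one summary."""
    ham_total = s["ham_ok"] + s["fp"]      # all nonspam messages
    spam_total = s["spam_ok"] + s["fn"]    # all spam messages
    return 100 * s["fp"] / ham_total, 100 * s["fn"] / spam_total


def weighted_cost(s, fp_weight=FP_WEIGHT):
    """Illustrative cost: false positives weighted against false negatives."""
    return fp_weight * s["fp"] + s["fn"]


for threshold, s in summaries.items():
    fp_pct, fn_pct = rates(s)
    print(f"threshold {threshold}: FP {fp_pct:.2f}%  FN {fn_pct:.2f}%  "
          f"weighted cost {weighted_cost(s)}")

# How much each error count changes when going from 5.0 down to 2.0.
fn_factor = summaries[5.0]["fn"] / summaries[2.0]["fn"]
fp_factor = summaries[2.0]["fp"] / summaries[5.0]["fp"]
print(f"missed spam down {fn_factor:.1f}x, false positives up {fp_factor:.1f}x")
```

Under that assumed weighting, the weighted cost at 2.0 comes out well above
the cost at 5.0, even though far less spam gets through.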

Your exact results might be a little better, or rarely a little worse,
depending on your use of whitelists, how aggressively you train bayes,
what add-on rules you have, etc. However, these results should be
typical for a "stock" config with no use of manual whitelists, no AWL,
and relatively light bayes training.
