--[ UxBoD ]-- wrote:
Hi,

I just had this message get through :-

<snip>
and it only scored 5.6.   These are the rules it hit :-

 1.23 ADVANCE_FEE_2
 0.00 BAYES_50
 0.72 SARE_URGBIZ Contains urgent matter
-0.00 SPF_PASS
 2.08 SUBJ_ALL_CAPS
 1.58 URG_BIZ

Looks like you might want to do some bayes training on that message. All the capitalized text should be an easy target.
I have my SA SPAM score set to trigger on 6 and above.  Do you think that is too
high? Or does anyone know of a ruleset to raise the score on these?

Too high? No. Too high to expect no missed spam? Yes.

Raising your threshold reduces false positives (nonspam tagged as spam), but it also increases your false negatives (spam that's missed). Lowering your score threshold has the opposite effect.
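
To make that concrete for the message in question, here is a minimal sketch that adds up the rule scores from the report and checks the total against two thresholds. The `is_tagged` helper is hypothetical, and the "total reaches the threshold" comparison is my assumption about how the cut-off is applied, not something taken from the SA source.

```python
# Rule scores copied from the report quoted above.
hits = {
    "ADVANCE_FEE_2": 1.23,
    "BAYES_50": 0.00,
    "SARE_URGBIZ": 0.72,
    "SPF_PASS": -0.00,
    "SUBJ_ALL_CAPS": 2.08,
    "URG_BIZ": 1.58,
}

def is_tagged(total, threshold):
    # Assumption: a message is tagged once its total reaches the threshold.
    return total >= threshold

# Round to two decimals to match the precision of the report.
total = round(sum(hits.values()), 2)

for threshold in (5.0, 6.0):
    verdict = "tagged as spam" if is_tagged(total, threshold) else "missed"
    print(f"total {total:.2f} at threshold {threshold}: {verdict}")
```

So the same message would have been tagged at the default threshold and slips through at 6.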

When picking a threshold, you're making a trade-off. Pick one based on what's important to you. Some folks run as high as 8.0, and others as low as 2.0. Both numbers are pretty extreme, but you get the idea.

For reference, in the set3 mass-checks, going from 5.0 to 6.0 more than halved the FPs (down to 45% of what they were at 5.0), but also increased FNs by 78%.

The default 5.0 threshold is already heavily biased towards avoiding FPs at the expense of FNs. The score assigner tries to tune the scores so that at 5.0 there are roughly 100 times more FNs than FPs, while keeping both as low as possible. In practice it's more like 50 times more, but that's what it's aiming for.

To quote STATISTICS-set3.txt from SA 3.2.4:

# SUMMARY for threshold 5.0:
# Correctly non-spam:  67508  99.94%
# Correctly spam:     117303  98.51%
# False positives:        42  0.06%
# False negatives:      1780  1.49%
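
As a rough check on those numbers, the sketch below recomputes the rates and the FN-to-FP ratio from the counts above, then applies the 45% and +78% figures mentioned earlier to project what 6.0 might look like. The projected counts are only my back-of-the-envelope estimate, not values from the statistics file, and all variable names are made up for illustration.

```python
# Counts copied from the threshold-5.0 summary quoted above.
ham_ok, spam_ok = 67508, 117303
false_pos, false_neg = 42, 1780

# Rates are relative to all ham and all spam respectively.
fp_rate = 100.0 * false_pos / (ham_ok + false_pos)
fn_rate = 100.0 * false_neg / (spam_ok + false_neg)
fn_per_fp = false_neg / false_pos

print(f"FP rate at 5.0: {fp_rate:.2f}%")
print(f"FN rate at 5.0: {fn_rate:.2f}%")
print(f"FNs per FP at 5.0: {fn_per_fp:.1f}")

# Estimate only: FPs drop to 45% of their 5.0 level, FNs rise by 78%.
est_fp_at_6 = round(false_pos * 0.45)
est_fn_at_6 = round(false_neg * 1.78)
print(f"estimated at 6.0: about {est_fp_at_6} FPs, {est_fn_at_6} FNs")
```

The ratio at 5.0 works out to a bit over 40 FNs per FP, which is in line with the "more like 50" figure.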



