--[ UxBoD ]-- wrote:
Hi,
I just had this message get through :-
<snip>
and it only scored 5.6. These are the rules it hit :-
1.23 ADVANCE_FEE_2
0.00 BAYES_50
0.72 SARE_URGBIZ Contains urgent matter
-0.00 SPF_PASS
2.08 SUBJ_ALL_CAPS
1.58 URG_BIZ
Looks like you might want to do some bayes training on that message. All
the capitalized text should be an easy target.
I have my SA SPAM score set to trigger on 6 and above. Do you think that is too
high? Or does anyone know of a ruleset to raise the score on these?
Too high? No. Too high to expect there to be no missed spam? Yes.
Raising your threshold reduces false positives (nonspam tagged as spam),
but it also increases your false negatives (spam that's missed).
Lowering your score threshold has the opposite effect.
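
As a rough illustration with the rule hits quoted above, here is a small
Python sketch of how the same message lands on either side of the line
depending on the threshold. The helper name is_spam is made up for this
example, and it assumes a message is tagged when its score reaches the
threshold; the scores are the rounded ones from the report, so treat the
total as approximate.

```python
# Rule hits and (rounded) scores as listed in the report above.
hits = {
    "ADVANCE_FEE_2": 1.23,
    "BAYES_50": 0.00,
    "SARE_URGBIZ": 0.72,
    "SPF_PASS": -0.00,
    "SUBJ_ALL_CAPS": 2.08,
    "URG_BIZ": 1.58,
}

# Total message score, rounded to two decimals like the per-rule scores.
total = round(sum(hits.values()), 2)


def is_spam(score, threshold):
    """Hypothetical helper: tag the message when score reaches the threshold."""
    return score >= threshold


for threshold in (5.0, 6.0):
    verdict = "tagged" if is_spam(total, threshold) else "missed"
    print(f"threshold {threshold}: score {total} -> {verdict}")
```

So this particular message would have been caught at the default 5.0 and
only slips through because of the 6.0 setting.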
When picking a threshold, you're making a trade-off. Pick one based on
what's important to you. Some folks run as high as 8.0, and others as
low as 2.0. Both numbers are pretty extreme, but you get the idea.
For reference, in the set3 mass-checks, going from 5.0 to 6.0 more than
halved the FPs (down to 45% of what they were at 5.0), but also
increased FNs by 78%.
The default 5.0 score is already pretty heavily biased towards avoiding FPs
at the cost of more FNs. The score assigner tries to tune the scores so that
at 5.0 there are roughly 100 times more FNs than FPs, while keeping both as
low as possible. In practice it's more like 50 times more, but that's what
it's trying for.
To quote STATISTICS-set3.txt from SA 3.2.4:
# SUMMARY for threshold 5.0:
# Correctly non-spam:  67508  99.94%
# Correctly spam:     117303  98.51%
# False positives:        42   0.06%
# False negatives:      1780   1.49%
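
For anyone who wants to check the arithmetic, the percentages and the FN:FP
ratio can be recomputed from those counts with a few lines of Python. The
last part is only a back-of-the-envelope projection to a 6.0 threshold,
applying the 45% and +78% figures mentioned earlier to the 5.0 counts; it is
not output from an actual mass-check, and the variable names are just for
this sketch.

```python
# Counts copied from the threshold-5.0 summary above.
ham_ok, spam_ok = 67508, 117303
fp, fn = 42, 1780

fp_rate = 100.0 * fp / (ham_ok + fp)   # percent of non-spam tagged as spam
fn_rate = 100.0 * fn / (spam_ok + fn)  # percent of spam that was missed
ratio = fn / fp                        # how many FNs per FP

print(f"FP rate: {fp_rate:.2f}%")
print(f"FN rate: {fn_rate:.2f}%")
print(f"FN:FP ratio: {ratio:.1f}")

# Rough projection to threshold 6.0 (assumption: FPs drop to 45% of the
# 5.0 count and FNs rise by 78%, per the figures quoted earlier).
fp_6 = fp * 0.45
fn_6 = fn * 1.78
print(f"approx. at 6.0: {fp_6:.0f} FPs, {fn_6:.0f} FNs")
```

The ratio at 5.0 comes out a bit over 40 FNs per FP, which is in the same
ballpark as the "more like 50" above.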