Re: SARE false positives

Loren Wilton 22 Jun 2004 19:26:38 -0000

> Anyway, below header shows valid Japanese mail with ISO-2022-JP encoded
> text that triggered several SARE header rules from 70_sare_genl_subj0.cf:-
>
> SARE_SUB_CASH_CHAR
> SARE_SUB_RAND_LETTRS2
> SARE_SUB_RAND_LETTRS5


> SARE_SUB_RAND_LETTRS2

You need to update.  I moved SARE_SUB_RAND_LETTRS2 from the -0 file to the
-1 file a couple of days ago because it was getting too many ham hits.


> X-Spam-Status: No, hits=-1.7 required=6.8
> tests=AWL,BAYES_00,J_BACKHAIR_31,

We may want to move some of the other rules you mention to -1 from -0 also.

But keep in mind the difference between the -0 and -1 files: -0 is supposed
to hold rules that (to our knowledge, subject to revision) don't hit non-spam.
The -1 file holds rules that we KNOW will occasionally hit non-spam, but that
hit far more spam than ham.

Which is why many of the rules have relatively low scores.  It is quite
reasonable to have rules that will hit the occasional non-spam phrase.  As
long as not too many rules hit, it is not a problem.  As witnessed by the
mail you cited: it got -1.7 points (probably largely from the BAYES_00 hit),
and that is way short of the 6.8 points required to be spam.  Even without
Bayes it doesn't look like it would have been flagged.

The moral is, you should expect to see rule hits on ham.  You just shouldn't
expect to see enough rule hits to trip it over the edge into being spam,
unless it is spam.
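
To make the arithmetic concrete, here is a minimal Python sketch of how the
hits add up against the threshold.  The individual rule scores below are made
up for illustration (they are NOT the real SARE or Bayes scores), and the
"total >= required" comparison is my assumption about how the threshold is
applied.  Only the 6.8 threshold comes from the header you quoted.

```python
# Threshold taken from the quoted X-Spam-Status header.
REQUIRED = 6.8

# Hypothetical scores, chosen only for illustration; check your own
# .cf files for the real values.
EXAMPLE_SCORES = {
    "BAYES_00": -2.5,
    "SARE_SUB_CASH_CHAR": 0.6,
    "SARE_SUB_RAND_LETTRS2": 0.5,
    "SARE_SUB_RAND_LETTRS5": 0.7,
}


def total_score(hits, scores):
    """Sum the scores of every rule that hit the message."""
    return round(sum(scores[name] for name in hits), 2)


def is_spam(hits, scores, required=REQUIRED):
    """Assumed decision rule: spam only if the total reaches the threshold."""
    return total_score(hits, scores) >= required


if __name__ == "__main__":
    hits = [
        "BAYES_00",
        "SARE_SUB_CASH_CHAR",
        "SARE_SUB_RAND_LETTRS2",
        "SARE_SUB_RAND_LETTRS5",
    ]
    print("with Bayes:   ", total_score(hits, EXAMPLE_SCORES),
          is_spam(hits, EXAMPLE_SCORES))
    # Drop the Bayes hit to see what the subject rules contribute alone.
    print("without Bayes:", total_score(hits[1:], EXAMPLE_SCORES),
          is_spam(hits[1:], EXAMPLE_SCORES))
```

With scores in this range, three low-scoring subject rules hitting a ham
message still leave it well under the threshold, with or without the Bayes hit.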

        Loren
