On 10/25, Bowie Bailey wrote:
> On 10/25/2012 10:47 AM, Simon Loewenthal wrote:
> >*  2.0 DEAR_SOMETHING BODY: Contains 'Dear (something)'
> >
> >Does anyone know the rationale behind this, or is our user base simply
> >communicating on a higher level?  :)  I imagine the rationale is sound, but I
> >do not know what it is.
> 
> The rationale is simple.  The masscheck finds that this rule hits
> more spam than ham, so it gets a higher score.

It's slightly more complicated than that.  It's that this score results in
the maximum number of spams flagged as spam without exceeding 1 false
positive in 2,500 non-spams.
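
To make that constraint concrete, here is a toy sketch.  It is not the
real score generator, which optimizes all rule scores together over the
masscheck corpus.  It only tries a few candidate scores for one rule,
holds every other score fixed, and keeps the candidate that flags the
most spam while staying within 1 false positive per 2,500 ham.  The
function names, the message tuples and the candidate list are all made
up for illustration.

```python
THRESHOLD = 5.0          # scores are generated against a threshold of 5
HAM_PER_FALSE_POS = 2500 # allow at most 1 false positive per 2,500 ham


def count_flagged(rule_score, messages):
    """Count messages at or above THRESHOLD.

    Each message is (base_score, hits_rule): base_score is the total from
    all other rules, hits_rule says whether this rule fired.
    """
    return sum(
        1
        for base, hits_rule in messages
        if base + (rule_score if hits_rule else 0.0) >= THRESHOLD
    )


def best_score(candidates, spam, ham):
    """Return (score, spam_caught) for the best allowed candidate, or None."""
    best = None
    for score in candidates:
        false_pos = count_flagged(score, ham)
        # Integer comparison avoids floating-point trouble at the limit.
        if false_pos * HAM_PER_FALSE_POS > len(ham):
            continue
        caught = count_flagged(score, spam)
        if best is None or caught > best[1]:
            best = (score, caught)
    return best


# Hypothetical corpus: 4 spam, 2,500 ham (two of which hit the rule).
spam = [(3.5, True), (4.0, True), (2.0, True), (6.0, False)]
ham = [(0.5, False)] * 2498 + [(3.2, True), (2.5, True)]

print(best_score([0.5, 1.0, 1.5, 2.0, 3.0], spam, ham))
```

In this made-up corpus a score of 3.0 would catch all four spams, but it
pushes two ham over the threshold, so it is rejected.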

A fun example is SUBJ_YOUR_DEBT, which was getting a score of 3.0 while
hitting more non-spam than spam.  I guess it got disabled somehow.


But more importantly, it's because we do not have the rule
hit statistics from your email to include them in optimal score
generation because you're not submitting those stats via masscheck:
https://wiki.apache.org/spamassassin/NightlyMassCheck


RuleQA results for that rule are here:
ruleqa.spamassassin.org/?daterev=20121020&rule=DEAR_SOMETHING

  MSECS    SPAM%     HAM%     S/O    RANK   SCORE  NAME   WHO/AGE
      0   0.6160   0.2324   0.726    0.63    2.00  DEAR_SOMETHING  

It hits 0.6% of spam, and 0.2% of non-spam (ham).
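
For anyone wondering where the S/O column comes from, a quick sketch,
assuming S/O is the spam hit rate divided by the sum of the spam and ham
hit rates.  That assumption reproduces the 0.726 in the table above; the
function name is just for illustration.

```python
def s_over_o(spam_pct, ham_pct):
    """Fraction of a rule's hits that land on spam (assumed definition)."""
    return spam_pct / (spam_pct + ham_pct)


# SPAM% and HAM% for DEAR_SOMETHING from the RuleQA row above.
print(round(s_over_o(0.6160, 0.2324), 3))  # 0.726
```

An S/O of 1.0 would mean the rule only ever hits spam, and 0.5 would mean
it hits spam and ham at the same rate.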


On 10/25, Alexandre Boyer wrote:
> Simon, I had some FPs because of this rule and because my threshold is
> lower than 5.

If you could just append "and I know this is highly discouraged"
any time you say that, I wouldn't need to keep pointing it out so
that other people don't come away thinking it might be a good idea.
Scores are generated with a threshold of 5.  It's often recommended to
use a threshold above 5 for an extra safety measure.  Do you even have a
guess what rate of false positives you're causing with a lower threshold?
I don't.

> I just had a score override to lower it but this rule still hits a lot
> of spam (419 scams essentially).

Yup, nothing wrong with customizing your rules to suit the email you get
better.  At least in the direction of reducing false positives.  

-- 
"I finally figured out the only reason to be alive is to enjoy it."
- Rita Mae Brown
http://www.ChaosReigns.com
