http://bugzilla.spamassassin.org/show_bug.cgi?id=3847





------- Additional Comments From [EMAIL PROTECTED]  2004-09-30 19:41 -------
Subject: Re:  Consider removing RFCI tests from SA 3.0

> The only reason for a "reliability" specifier is to give the perceptron a
> hint when the actual data is in the noise level -- E.g., if there are no
> ham hits in the corpus.

Um.  I don't think so.  That is, I don't think that is the *only* reason.
The other, and perhaps more important, reason is that one might suspect the
corpus results differ from the 'universal' results, and that humans know
that in the universal case some rules are safer than others.

Case in point is the BAYES_99 rule, which is drawing numerous complaints
about its current low score.  Here I don't think the problem can be described
as 'noise level' or 'no ham hits'.  It is more a case of "ok, perceptron -
these scores remain canonical, this one remains no lower than X, and you do
whatever is needed to make the other scores fit, whether you want to or not."
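
To make that concrete, here is a rough sketch (Python, and emphatically *not*
the actual rescorer code - the rule names, bounds, and all numbers are made
up) of a perceptron loop that freezes some scores as canonical and
lower-bounds another while fitting the rest:

import numpy as np

FROZEN = {"RULE_A": 4.0}          # canonical scores the optimizer must not touch
LOWER_BOUNDS = {"BAYES_99": 3.5}  # "remains no lower than X"

def fit_scores(rules, X, y, epochs=100, lr=0.01, threshold=5.0):
    """rules: list of rule names (assumed to include the constrained ones);
    X: 0/1 hit matrix as a NumPy array (messages x rules);
    y: +1 for spam, -1 for ham."""
    scores = np.zeros(len(rules))
    for name, s in FROZEN.items():
        scores[rules.index(name)] = s
    for _ in range(epochs):
        for hits, label in zip(X, y):
            # classic perceptron step on each misclassified message
            if label * (hits @ scores - threshold) <= 0:
                scores += lr * label * hits
        # re-impose the human constraints after every pass
        for name, s in FROZEN.items():
            scores[rules.index(name)] = s
        for name, lo in LOWER_BOUNDS.items():
            i = rules.index(name)
            scores[i] = max(scores[i], lo)
    return scores

The point is simply that the constraints get re-imposed after every pass, so
the optimizer is forced to compensate in the unconstrained scores, whether it
wants to or not.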


> A philosophical question: If RFCI was perfect at hitting otherwise missed
> spam with no FPs except roaringpenguin.com, and the mail used to score the
> perceptron had much less than 1 in 2500 mails from roaringpenguin.com, is
> it correct to let the rule get a very high score?

An interesting question.  :-)

Whatever the philosophical answer to that question may be, I would suggest
that an appropriately designed 'reliability' criterion could be used to bias
or limit (or both) the score, should the philosophical decision be that such
a safeguard is necessary.
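
For instance (my own illustration, not an existing SA feature), a per-rule
reliability in [0, 1] could both bias the learned score toward a
human-assigned prior and cap its magnitude:

def apply_reliability(learned, prior, reliability, cap):
    """Blend the corpus-learned score with a human prior, then limit it.

    reliability near 1.0 trusts the corpus; near 0.0 trusts the prior.
    cap bounds the final magnitude regardless of what the corpus says.
    """
    biased = reliability * learned + (1.0 - reliability) * prior
    return max(-cap, min(cap, biased))

# E.g., a rule whose corpus data looks unrepresentative:
# apply_reliability(learned=4.8, prior=1.5, reliability=0.3, cap=3.0) -> 2.49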




