On Mon, 2009-05-04 at 22:09 +0100, RW wrote:
> There are two separate tests, the autolearn result must be consistent
> with the overall classification, and not inconsistent with the bayes
> scoring.

I stand corrected with egg on my face.

Yes, you are perfectly right. Spent a while digging through the code,
understood the issue and got back here for a follow-up -- just to find
you beat me to it. :)

> From AutoLearnThreshold.pm:
> 
>   my $learner_said_ham_points = -1.0;
>   my $learner_said_spam_points = 1.0;
> 
>   if ($isspam) {
[...]
>   } else {

The relevant part is the !$isspam else clause, though. More precisely:

    if ($learned_points > $learner_said_spam_points) {

This returns without auto-learning if $learned_points > 1.0 (hardcoded),
which is *exactly* what you said. Sorry.
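
To make that ham-side check concrete, here is a minimal standalone sketch. It is not the plugin's actual code: the sub name ham_autolearn_blocked() is made up for illustration, and only the 1.0 threshold and the comparison are taken from the lines quoted above.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of SpamAssassin: models only the
# "scored as ham" branch quoted above.
sub ham_autolearn_blocked {
    my ($learned_points) = @_;

    # Hardcoded in the quoted AutoLearnThreshold.pm snippet.
    my $learner_said_spam_points = 1.0;

    # True means: the learner leans towards spam, so do not auto-learn as ham.
    return $learned_points > $learner_said_spam_points ? 1 : 0;
}

# Made-up values, for illustration only.
for my $points (-0.5, 1.0, 3.5) {
    printf "%5.1f => %s\n", $points,
        ham_autolearn_blocked($points) ? "skip auto-learn" : "check passes";
}
```

Note the strict comparison: a message sitting exactly at 1.0 still passes this particular check.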

FWIW, the hard part was finding out what exactly $learned_points is. The
docs for the Mail::SpamAssassin::PerMsgStatus functions weren't helpful at
all, but the _get_autolearn_points() code shows that $learned_points is
simply the sum of the scores of all hit rules that carry "tflags learn".
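
As a rough illustration of that summation (a sketch only, not the real _get_autolearn_points() code): the data layout, the sub name learned_points() and the scores below are all made up, and the real code reads the hits and tflags from the scan object and the loaded configuration instead.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the rules one message hit. Scores are invented.
my @hits = (
    { name => 'BAYES_99',     score => 3.5, tflags => 'learn' },
    { name => 'HTML_MESSAGE', score => 0.5, tflags => '' },
);

# Sum the scores of all hit rules whose tflags include "learn".
sub learned_points {
    my (@rules) = @_;
    my $sum = 0;
    for my $rule (@rules) {
        $sum += $rule->{score} if $rule->{tflags} =~ /\blearn\b/;
    }
    return $sum;
}

print "learned points: ", learned_points(@hits), "\n";
```

With these invented hits only BAYES_99 counts, so a message scored as ham overall would still be kept out of auto-learning by the > 1.0 check.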

  guenther


-- 
char *t="\10pse\0r\0dtu...@ghno\x4e\xc8\x79\xf4\xab\x51\x8a\x10\xf4\xf4\xc4";
main(){ char h,m=h=*t++,*x=t+2*h,c,i,l=*x,s=0; for (i=0;i<l;i++){ i%8? c<<=1:
(c=*++x); c&128 && (s+=h); if (!(h>>=1)||!t[s+h]){ putchar(t[s]);h=m;s=0; }}}
