Re: [Bug 3821] scores are overoptimized for training set

Daniel Quinlan 1 Oct 2004 05:31:20 -0000

> RH/3 is simply my rule of thumb, because I generally deal with a limited 
> corpus of only 100k emails or so. IMO, if tested via corpora with enough 
> emails for testing, RH/2 wouldn't be unreasonable.


Sure, the perceptron does the same, but much better than humans (which
is why I generally avoid second-guessing scores).  Henry is
experimenting with rule accuracy degradation over time and perhaps the
perceptron can handle this even better in the future.

-- 
Daniel Quinlan                     ApacheCon! 13-17 November (3 SpamAssassin
http://www.pathname.com/~quinlan/  http://www.apachecon.com/  sessions & more)
