http://issues.apache.org/SpamAssassin/show_bug.cgi?id=5376





------- Additional Comments From [EMAIL PROTECTED]  2007-07-03 08:43 -------
(In reply to comment #3)
> I think our current "generate scores once a millennium" method is not really
> functional for several reasons, but mainly that it takes too long between
> updates and the scores are overly tuned for our messages and not necessarily
> what other people see (despite having 42 rsync accounts, we have few (8 (but
> really more like 6.5) at current count) people doing nightly runs, and I'm
> ~50% of the total results).
> 
> Also, the last thing I saw about the perceptron, and why it didn't work for
> 3.2, was that it needed more diverse data to score with.  Perhaps we just need
> better rules?

Well, the GA did pretty well, so we came to the conclusion that the data was OK
-- it's just not the kind of data the perceptron's gradient-descent algorithm
does well with.

> Anyway, coming up with a new scoring algorithm isn't bad, I just don't think
> it's going to make a lot of difference without other changes as well, such as
> more frequent runs, more corpus runs/results from the masses, and/or a way for
> individual locations to easily update their own scores.

Sure.  The benefit of this, however, is that we can define the problem and then
*other people* will do it for us ;)

Michael, good point.



