On 12/30/2009 01:19 AM, Daryl C. W. O'Shea wrote:
Warren:
That would be ideal, but yes, the nightly masscheck is WAY too small. Even our
mcsnapshot was too small and required lots of manual massaging to output
scores that satisfied us.

Whoa, what.  Is there a diff available of the "required lots of manual
massaging"?  I must have missed that, and it doesn't sound normal.  It
often starts (or there's talk of it starting) and then there's usually a
stats smackdown and things get more or less left alone.  Sometimes we fudge
really closely scored things that people think should be linear just so
we don't get a barrage of queries about it on the users' list; other
than that I don't recall "lots of manual massaging".  I'm scared.



https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6155#c124
This attempt at GA scoring came after some manual cleaning of the rescore logs, explicitly excluding a large portion of the spam, etc. Even after that we were not happy with the scores.

https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6155#c146
Another attempt, one that jm was happier with. Subsequent comments have us manually adjusting these scores for various things, including linearizing some rules like HTML_IMAGE_RATIO_* and overriding the scores of rules identified by Adam Katz's script (making them informational).

Warren
