https://issues.apache.org/SpamAssassin/show_bug.cgi?id=5861
--- Comment #17 from Justin Mason <[email protected]> 2009-04-01 01:44:55 PST ---
(In reply to comment #14)
> I don't see how it's relevant, but no. It's from some US uni.
>
> The point is that there probably should be some limit on how many tokens to
> get from a header. If I learn that as spam, all ham mail containing those
> headers will be strongly biased to spam (an uneducated, but logical guess).

I think you're overestimating its effect on the chi-square probability-combining algorithm; there's a good chance those values won't skew the result much, assuming there are stronger tokens found elsewhere in the message.

The only way to get a useful idea of what's really happening is to do a 10-fold cross-validation run:

http://wiki.apache.org/spamassassin/TenFoldCrossValidation
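To make the point about stronger tokens concrete, here is a rough Python sketch of chi-square (Fisher/Robinson-style) probability combining. It is not SpamAssassin's actual code; the function names chi2q and combine and all token probabilities below are made up for illustration, and it assumes every probability lies strictly between 0 and 1.

```python
import math


def chi2q(x2, v):
    """Upper-tail probability of a chi-square variable with v degrees of
    freedom (v must be even), evaluated at x2."""
    m = x2 / 2.0
    term = math.exp(-m)  # may underflow to 0.0 for very long token lists
    total = term
    for i in range(1, v // 2):
        term *= m / i
        total += term
    return min(total, 1.0)


def combine(probs):
    """Combine per-token spam probabilities into one score in [0, 1].

    Assumes each probability is strictly between 0 and 1.
    """
    n = len(probs)
    # Evidence for spam: tests the product of (1 - p).
    s = 1.0 - chi2q(-2.0 * sum(math.log(1.0 - p) for p in probs), 2 * n)
    # Evidence for ham: tests the product of p.
    h = 1.0 - chi2q(-2.0 * sum(math.log(p) for p in probs), 2 * n)
    return (s - h + 1.0) / 2.0


# Hypothetical values: ten strongly hammy body tokens, plus five
# moderately spammy tokens learned from the odd headers.
body_tokens = [0.01] * 10
header_tokens = [0.8] * 5

score_without_headers = combine(body_tokens)
score_with_headers = combine(body_tokens + header_tokens)
print(score_without_headers, score_with_headers)
```

With these invented numbers the header tokens nudge the score upward, but the strong body tokens still keep it on the ham side. Whether that holds for a real corpus is exactly what the cross-validation run would show.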
