https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6828
Kevin A. McGrail <[email protected]> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |[email protected]
   Target Milestone|Undefined                   |3.4.0
            Summary|Adjust default autolearn    |Adjust default autolearn
                   |ham threshold to reduce     |settings to reduce Bayesian
                   |mistraining under default   |mistraining under default
                   |configuration               |configuration

--- Comment #10 from Kevin A. McGrail <[email protected]> ---
Based on the anecdotal complaints, it does seem that lowering the threshold
for learning as ham makes sense, to try to avoid any FNs slipping through and
being learned as ham. I think this is also being extrapolated to a spam
threshold change as well.

Does anyone have suggestions on a testing protocol that might help decide the
defaults?

If I am thinking correctly, if we used masscheck data, the scoring is designed
not to mark spam as ham and ham as spam. So the minimum threshold should be
the spam threshold. This means that 12.0 was chosen more or less arbitrarily,
as an experienced guess at a number inflated for real-world safety.

Going further, my system is configured for 6.0 instead of 5.0, with a lot of
single-fire rules and things that focus on scoring ham. So it is not a good
source of project-wide data concerning auto-learning thresholds. In fact, I'm
wondering a bit whether a default setup can score below zero very often, and
whether we are now going to skew bayes towards only certain classifications
of ham.

And in the end, none of our tweaked system data and configuration is relevant
to this discussion. Looking at the thresholds, we really need a scientific
approach based on the DEFAULT configuration to continue this discussion.

  bayes_auto_learn_threshold_nonspam n.nn   (default: 0.1)
  bayes_auto_learn_threshold_spam n.nn      (default: 12.0)

Finally, I also wonder whether we are missing turning on
bayes_auto_learn_on_error as a default. I think for 3.4.0, turning this
setting on and losing the backwards compatibility makes sense.
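
As one possible starting point for the testing protocol asked about above, here is a minimal Python sketch that counts how many messages in a hand-classified corpus would be auto-learned into the wrong class for a given pair of thresholds. Everything in it is an assumption made for illustration: the `Scored` record, the `autolearn_errors` function, and the sample scores are hypothetical and are not part of SpamAssassin or the masscheck tooling. It assumes a message is learned as ham when its score is below the nonspam threshold and as spam when its score is at or above the spam threshold, and it ignores the extra conditions SpamAssassin applies before auto-learning (for example, which rules count towards the autolearn score).

```python
from typing import Iterable, NamedTuple


class Scored(NamedTuple):
    """One hand-classified message (hypothetical record, for illustration)."""
    score: float   # score the message received under the default ruleset
    is_spam: bool  # ground truth from manual classification


def autolearn_errors(messages: Iterable[Scored],
                     ham_threshold: float,
                     spam_threshold: float) -> dict:
    """Count messages that would be auto-learned, and how many wrongly.

    Assumption: learn as ham if score < ham_threshold, learn as spam if
    score >= spam_threshold, otherwise do not learn at all.
    """
    counts = {
        "ham_learned": 0,
        "spam_learned": 0,
        "spam_learned_as_ham": 0,  # false negatives poisoning the ham side
        "ham_learned_as_spam": 0,  # false positives poisoning the spam side
    }
    for msg in messages:
        if msg.score < ham_threshold:
            counts["ham_learned"] += 1
            if msg.is_spam:
                counts["spam_learned_as_ham"] += 1
        elif msg.score >= spam_threshold:
            counts["spam_learned"] += 1
            if not msg.is_spam:
                counts["ham_learned_as_spam"] += 1
    return counts


# Made-up sample data; a real run would use scores from a default install.
corpus = [
    Scored(-1.2, False),
    Scored(0.05, True),
    Scored(0.05, False),
    Scored(3.0, False),
    Scored(7.5, True),
    Scored(14.0, True),
    Scored(12.0, False),
]

current_defaults = autolearn_errors(corpus, ham_threshold=0.1, spam_threshold=12.0)
stricter_ham = autolearn_errors(corpus, ham_threshold=0.0, spam_threshold=12.0)
print(current_defaults)
print(stricter_ham)
```

Running the same function over a grid of candidate thresholds, on scores produced by an untweaked default configuration, would show the trade-off between mistraining and how much ham still gets learned at all.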

Regards,
KAM

--
You are receiving this mail because:
You are the assignee for the bug.
