http://bugzilla.spamassassin.org/show_bug.cgi?id=3821
------- Additional Comments From [EMAIL PROTECTED] 2004-09-27 14:21 -------

'BTW, this is the "rule reliability tflag" idea again; basically provide a way to hint that this rule is reliable, and this rule should not be considered reliable -- no matter what their hit-rates in mass-checks were.'

Oh, I should point out -- the point in particular here is that you can often get rules that hit 20%:0.001% spam:ham, for a very high S/O. Those would always be given a good high range, and the perceptron would be allowed to score them highly.

However, sometimes a really simple one-word body pattern (e.g. "viagra") may get 1.0%:0.0001% hit-rates. Given that it's a really simple one-word body pattern, *we* know that it has a high chance of FP'ing in the field, even if the ham in our corpora hardly uses it at all -- so a reliability tflag gives us a way to indicate this.

OTOH, at times we know that another, similarly low-frequency rule is very reliable and won't FP, and so can safely get a high score; we just don't have a lot of data that hits it in our corpora.

The current problem is that our scoring code has to be over-paranoid about ranges for low-frequency rules -- just in case it's the first case and not the second -- and so it restricts the reliable ones unfairly.
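
To make the numbers concrete, here is a rough sketch in Python (not the actual scoring code). It assumes S/O is computed as spam% / (spam% + ham%), and it invents a `score_ceiling` helper, a `reliable` hint, and the constants `LOW_FREQ_PCT`, `MAX_RANGE`, `PARANOID_CAP` and `UNRELIABLE_CAP` purely for illustration; none of these names or values come from SpamAssassin itself.

```python
from typing import Optional

# All constants below are made-up illustration values, not SpamAssassin's.
LOW_FREQ_PCT = 5.0     # spam hit-rate below this counts as "low-frequency"
MAX_RANGE = 4.0        # widest score range a rule could be given
PARANOID_CAP = 2.0     # cap for low-frequency rules with no hint
UNRELIABLE_CAP = 1.0   # cap for rules hinted as unreliable


def s_over_o(spam_pct: float, ham_pct: float) -> float:
    """Share of a rule's hits that are spam, from hit-rates in percent."""
    total = spam_pct + ham_pct
    if total == 0:
        raise ValueError("rule never hit anything")
    return spam_pct / total


def score_ceiling(spam_pct: float, ham_pct: float,
                  reliable: Optional[bool] = None) -> float:
    """Hypothetical upper end of the score range handed to the perceptron."""
    base = MAX_RANGE * s_over_o(spam_pct, ham_pct)
    if reliable is True:
        # Hinted reliable: trust the S/O even with little corpus data.
        return base
    if reliable is False:
        # Hinted unreliable: keep the range small whatever the hit-rates say.
        return min(base, UNRELIABLE_CAP)
    if spam_pct < LOW_FREQ_PCT:
        # No hint: be paranoid about any low-frequency rule.
        return min(base, PARANOID_CAP)
    return base


if __name__ == "__main__":
    print(s_over_o(20.0, 0.001), s_over_o(1.0, 0.0001))
    print(score_ceiling(1.0, 0.0001))                  # no hint
    print(score_ceiling(1.0, 0.0001, reliable=False))  # e.g. a one-word body rule
    print(score_ceiling(1.0, 0.0001, reliable=True))   # known-good rare rule
```

Both example rules come out with an S/O above 0.9999, so S/O alone cannot tell the risky one-word pattern from the trustworthy rare rule; in this sketch only the hint changes the range the low-frequency rule gets.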
