http://bugzilla.spamassassin.org/show_bug.cgi?id=4136
------- Additional Comments From [EMAIL PROTECTED] 2005-02-14 14:49 -------
Subject: Re: mass-check --reuse should disable reused rules during mass-check --net

> - we're only going to need this when testing out Bayes, and the time
> consuming piece is the net checks, so only bother reusing net rules.

Yes, the code only reuses network rules (that have been present for some
time and have not changed much; allowing some small changes is probably
still better than using delayed network testing).

> - we want to cause certain rules not to run, but we want the hit to
> show up and the score to be added as if it had been run. the problem
> is that the score will only be added if the rule actually runs, and
> the rule will run if the score is non-zero. at the time, it looked
> like a whole lot of hackery would be needed to get this working.

The total score doesn't matter at all in mass-check or in the
perceptron, actually (it's rounded for Pete's sake). The score is *only*
used for the FP/FN options to hit-frequencies which are semi-broken and
rarely, if ever used. Now, it matters during the perceptron run itself,
but that's all computed internally.

I talked about this with Henry last night (this morning) and we even
came to agreement that --reuse was safe to always run during nightly
mass-checks that aren't using --net. There's no reason to not show
network hits if you have them.

> - the main problem, from my pov, with the reuse approach is that what
> you really want is to know what requests were made and what the answers
> were at a given point in time. reuse attempts to deduce what that
> information is based on rule hit. ie: it's good to know whether or
> not spamcop/surbl/etc hit on a given rule (means it ran and hit),
> but what's more important is to know that it didn't hit but was run.
> If it didn't run at all, we need to run it ala our current method.

Agreed. For the future, why don't we add an X-Spam-Status flag similar
to autolearn? net=yes and net=no -- for now, we'll just have to ask
people to only use --reuse for the sections of the corpus that have been
tagged with network checks on.

------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.
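
To make the net=yes / net=no idea concrete, here is a rough sketch of how
a reuse step might decide whether old network hits can be trusted. It is
written in Python purely for illustration and is not the mass-check code.
The net= field is only the proposal from this comment and does not exist
in any released X-Spam-Status header; the function names and the set of
network rule names passed in are likewise hypothetical.

```python
import re


def parse_spam_status(value):
    """Parse an X-Spam-Status header value into a dict of fields."""
    # Collapse folded header lines, then rejoin test names split after a comma.
    flat = re.sub(r"\s+", " ", value).strip()
    flat = re.sub(r",\s+", ",", flat)
    # The value starts with "Yes" or "No", followed by key=value pairs.
    verdict, _, rest = flat.partition(",")
    fields = {"is_spam": verdict.strip().lower() == "yes"}
    for key, val in re.findall(r"(\w+)=(\S*)", rest):
        fields[key] = val
    tests = fields.get("tests", "")
    fields["tests"] = [] if tests in ("", "none") else tests.split(",")
    return fields


def reusable_net_hits(fields, net_rules):
    """Return the network rules that hit, or None if reuse is not safe.

    Assumes the hypothetical net=yes flag: only when it is present do we
    know that a network rule missing from tests= was run and did not hit.
    """
    if fields.get("net") != "yes":
        return None  # unknown or net=no: the network rules must be run
    return [name for name in fields["tests"] if name in net_rules]
```

With net=yes present, an empty list means "network rules ran and none
hit", which is the case the quoted objection says a bare rule-hit list
cannot distinguish from "never ran".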
