http://bugzilla.spamassassin.org/show_bug.cgi?id=4349





------- Additional Comments From [EMAIL PROTECTED]  2005-06-04 00:36 -------
Theo, since my attention has been drawn back to this, some replies:
> we'd have to detect which rules are set to do this before the run
Yes, perhaps with "tflags rule_name count" to indicate that the rule requires
counting.
However, the syntax of "score rule_name value * factor" or "... ** exponent"
might be sufficient on its own. The trick is to evaluate these before testing
for matches, rather than after.

> this will decrease performance, possibly significantly, even for general use
When used, yes, these can be expensive. How expensive would it be, for general
use (minimal use of these rules, or no hits for these rules), if we used logic
like: 

1) run the test. If it matches, track /where/ it matches (index into the range).
The efficiency question here for the general case is whether storing that index
adds significant cost to the process.

2) Is it an arithmetic rule? If not, record the binary hit and exit.

3) If it is, count the hit and rerun the test from this index point. Increment
the counter for each further hit and exit when there are no more matches.
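
The three steps above can be sketched roughly as follows. This is only an
illustration in Python, not SpamAssassin code: the function name count_hits
and the "arithmetic" flag are made up for the example, and a regex search
stands in for whatever the real rule test is.

```python
import re


def count_hits(pattern, text, arithmetic):
    """Count hits for one rule (hypothetical helper, not SpamAssassin API).

    Binary rules stop at the first match; arithmetic rules keep
    re-running the test from the index where the last match ended.
    """
    regex = re.compile(pattern)
    hits = 0
    pos = 0  # step 1: the index we track between runs
    while pos <= len(text):
        match = regex.search(text, pos)
        if match is None:
            break  # no (more) matches
        hits += 1
        if not arithmetic:
            break  # step 2: binary rule, one hit is all we need
        # step 3: resume after this match; move one character
        # forward on an empty match so the loop always advances
        pos = match.end() if match.end() > match.start() else match.end() + 1
    return hits
```

For a binary rule the only extra work over today's behaviour is the flag check
after the first match, which is the cost the general-case question is about.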

It'd obviously be most expensive when there are many hits, so some method of
limiting the loop should probably apply, perhaps something like:
score   rule_name  0.2*n 
tflags  rule_name  limit:5

That changes the logic to: 

1) run the test. If it matches, track /where/ it matches (index into the 
range). 

2) Is it an arithmetic rule? If not, record the binary hit and exit. If yes,
increment the counter.

3) Is there a limit? (Perhaps a limit should be required?) If so, has the
counter reached the limit? If so, exit. 

4) Limit not reached; loop to step 1 using the current index as a new starting
point.
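
A rough sketch of the limited loop, again in Python for illustration only. The
name evaluate_rule and its parameters are hypothetical, and it assumes the
"0.2*n" score means a per-hit score multiplied by the number of hits, with
"limit:5" capping that number.

```python
import re


def evaluate_rule(pattern, text, per_hit_score, arithmetic, limit=None):
    """Return (hits, score) for one rule (hypothetical, not SpamAssassin API).

    limit=None means no cap; whether a limit should be mandatory
    is left open here, as it is in the proposal.
    """
    regex = re.compile(pattern)
    hits = 0
    pos = 0
    while pos <= len(text):
        # step 1: run the test from the current index
        match = regex.search(text, pos)
        if match is None:
            break
        hits += 1
        # step 2: binary rules are done after the first hit
        if not arithmetic:
            break
        # step 3: stop once the counter reaches the limit
        if limit is not None and hits >= limit:
            break
        # step 4: loop, using the end of this match as the new start
        pos = match.end() if match.end() > match.start() else match.end() + 1
    return hits, per_hit_score * hits
```

With a per-hit score of 0.2 and a limit of 5, a message containing eight
matches is tested at most five times and the rule contributes at most about
1.0 to the total.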




------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.
