Re: more efficent big scoring

Loren Wilton Tue, 22 Jan 2008 17:54:07 -0800

>>> maybe if there was some way to establish a hierachy at startup
>>> which groups rule processing into nodes. some nodes finish
>>> quickly, some have dependencies, some are negative, etc.


Just wanted to point out, this topic came up when our site DNS
cache service started to fail due to excessive DNSBL queries. My
slowdown was due to multiple timeouts and/or delays, probably
related to "answering joe-job rbldns backscatter" -- that's the
reason I was looking for an early exit on scans in progress.

A little splitting of rules into processing-speed groups is already done.
Specifically, the net-based tests, being dependent on external events for
completion, are split out from the other tests and are processed in two
phases. The first phase issues the request for information over the net,
and the second phase then waits for an answer. There is a background
routine that harvests incoming net results while other rules are
processed, so when a net result is required it may already be present and no
delay will be incurred.
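To make the two-phase idea concrete, here is a rough sketch in Python. It is not SpamAssassin code: the function names, the list names and the sleep-based "lookups" are all made up for illustration, and a thread pool stands in for the background harvesting routine.

```python
import time
from concurrent.futures import ThreadPoolExecutor

# All names here are hypothetical; none of this is SpamAssassin code.

def fake_net_lookup(name, delay):
    """Stand-in for a DNSBL query: sleeps instead of touching the network."""
    time.sleep(delay)
    return name + ": not listed"

def run_local_rules(message):
    """Stand-in for the CPU-bound rules that run while lookups are in flight."""
    time.sleep(0.3)
    return ["LOCAL_KEYWORD"] if "free money" in message.lower() else []

def scan(message, lookups):
    """lookups maps a made-up list name to its simulated latency in seconds."""
    with ThreadPoolExecutor(max_workers=4) as pool:
        # Phase 1: issue every net request up front and do not wait.
        pending = {name: pool.submit(fake_net_lookup, name, delay)
                   for name, delay in lookups.items()}

        # Other rules run here; worker threads collect answers meanwhile.
        local_hits = run_local_rules(message)

        # Which answers had already arrived by the time they were needed?
        ready_early = sorted(n for n, f in pending.items() if f.done())

        # Phase 2: wait (with a timeout) only for whatever is still missing.
        net_results = {n: f.result(timeout=2.0) for n, f in pending.items()}
    return local_hits, net_results, ready_early

if __name__ == "__main__":
    print(scan("FREE MONEY inside", {"bl.example": 0.05, "uri.example": 0.05}))
```

With the simulated latencies shorter than the local rule work, both answers are normally already present when phase 2 asks for them, so no extra delay is incurred.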

This is not an area I fully understand, but reading moderately recent
comments on Bugzilla leads me to believe that this is an area where some
improvement is still possible; there are some net tests that (I think) end
up waiting immediately for an answer rather than doing the two-phase
processing. How much that slows down the result for the overall email
probably depends on many factors.

Also note that even issuing the requests and then waiting for the result
only when it is needed doesn't guarantee that the mail will not have to wait
for results. It could be that one of the very first rules processed (due to
priority or meta dependency, for instance) will need a net result, and so
the entire rule process will be forced to wait on it.
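A toy model of that ordering effect, in Python. Everything in it is an assumption made for illustration: the rule names, the costs in milliseconds, and the simplification that every net answer arrives at one fixed latency after the requests are issued at time zero.

```python
def idle_and_total(rules, net_latency_ms):
    """Walk rules in processing order; return (idle_ms, total_ms).

    rules is a list of (name, cpu_cost_ms, needs_net) tuples. These are
    hypothetical values for the sketch, not measurements.
    """
    clock = 0
    idle = 0
    for _name, cpu_cost_ms, needs_net in rules:
        if needs_net and clock < net_latency_ms:
            # The answer has not arrived yet, so the whole scan stalls here.
            idle += net_latency_ms - clock
            clock = net_latency_ms
        clock += cpu_cost_ms
    return idle, clock

local_a = ("LOCAL_A", 300, False)
local_b = ("LOCAL_B", 300, False)
meta_net = ("META_NEEDS_NET", 10, True)

if __name__ == "__main__":
    # Net-dependent rule first: the full latency is spent waiting.
    print(idle_and_total([meta_net, local_a, local_b], 1000))  # (1000, 1610)
    # Net-dependent rule last: local work hides part of the latency.
    print(idle_and_total([local_a, local_b, meta_net], 1000))  # (400, 1010)
```

The same rules take 1610 ms or 1010 ms in this model depending only on where the net-dependent rule falls in the order.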

As far as splitting non-net rules up based on speed, that isn't very
practical. Regex rules should in general be quite fast, and all of them are
going to require the use of the processor full-time anyway. The speed of
the rule will depend on how it is written and the exact content of the email
it is processing. So a rule that is dog slow on one email may be blindingly
fast on most other emails. I don't know that there is any good way to
estimate the speed of a regex simply by looking at it.
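A small Python illustration of how one regex can be fast on one input and slow on another. The pattern is a made-up example, not a real rule, picked because nested quantifiers backtrack heavily in a backtracking engine such as Python's re module. Actual timings will vary by machine.

```python
import re
import time

# Hypothetical rule pattern, not taken from any real rule set.
PATTERN = re.compile(r"^(a+)+$")

def time_match(text):
    """Return (matched, seconds) for one match attempt against PATTERN."""
    start = time.perf_counter()
    matched = PATTERN.match(text) is not None
    return matched, time.perf_counter() - start

if __name__ == "__main__":
    fast = "a" * 20          # matches on the first attempt
    slow = "a" * 20 + "b"    # fails only after trying many ways to split the a's
    for label, text in (("fast", fast), ("slow", slow)):
        matched, seconds = time_match(text)
        print(label, matched, f"{seconds:.6f}s")
```

The two inputs differ by a single trailing character, which is why looking at the pattern alone says little about how long it will take on a given email.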


       Loren
