>>> maybe if there was some way to establish a hierarchy at startup
>>> which groups rule processing into nodes. some nodes finish
>>> quickly, some have dependencies, some are negative, etc.

Just wanted to point out that this topic came up when our site's DNS
cache service started to fail under excessive DNSBL queries. My
slowdown was due to multiple timeouts and/or delays, probably
related to "answering joe-job rbldns backscatter" -- that's the
reason I was looking for an early exit from scans in progress.

Some splitting of rules into processing-speed groups is already done. Specifically, the net-based tests, which depend on external events for completion, are split out from the other tests and processed in two phases. The first phase issues the request for information over the net, and the second phase then waits for an answer. A background routine harvests incoming net results while other rules are processed, so by the time a net result is required it may already be present and no delay is incurred.
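To make the two-phase idea concrete, here is a minimal Python sketch. It is not SpamAssassin's code (which is Perl); the function names, the zone names "bl_a" and "bl_b", and the delays are all made up for illustration, and the "lookup" just sleeps instead of touching the network.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_dnsbl_lookup(zone, delay):
    """Hypothetical stand-in for a DNSBL query: sleeps instead of using the net."""
    time.sleep(delay)
    return zone, "listed"


def run_local_rules(message):
    """Hypothetical stand-in for the local (regex) rules."""
    time.sleep(0.1)  # pretend the local rules take 0.1 s
    return ["LOCAL_RULE_HIT"] if "viagra" in message.lower() else []


def scan(message, zones):
    """zones maps a made-up zone name to its simulated response delay."""
    start = time.monotonic()
    with ThreadPoolExecutor(max_workers=len(zones)) as pool:
        # Phase 1: issue every net request up front, without waiting.
        pending = [pool.submit(fake_dnsbl_lookup, z, d) for z, d in zones.items()]
        # The local rules run while the lookups are still in flight.
        hits = run_local_rules(message)
        # Phase 2: only now wait for (or simply pick up) the net answers.
        for future in pending:
            zone, verdict = future.result()
            hits.append(f"NET_{zone}_{verdict}")
    return hits, time.monotonic() - start


if __name__ == "__main__":
    hits, elapsed = scan("Buy viagra now", {"bl_a": 0.1, "bl_b": 0.15})
    print(hits, round(elapsed, 2))
```

With these made-up numbers the total should land near the slowest lookup rather than near the sum of all the delays, which is the whole point of splitting the net tests into two phases.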

I don't fully understand this area, but moderately recent comments on Bugzilla lead me to believe that some improvement is still possible: there are some net tests that (I think) end up waiting immediately for an answer rather than doing the two-phase processing. How much that slows down the overall result for an email probably depends on many factors.

Also note that even issuing the requests and then waiting for a result only when it is needed doesn't guarantee that the mail will not have to wait for results. It could be that one of the very first rules processed (due to priority or meta dependency, for instance) will need a net result, and so the entire rule processing will be forced to wait on it.

As for splitting non-net rules up based on speed, that isn't very practical. Regex rules should in general be quite fast, and all of them are going to require the use of the processor full-time anyway. The speed of a rule depends on how it is written and on the exact content of the email it is processing, so a rule that is dog slow on one email may be blindingly fast on most other emails. I don't know of any good way to estimate the speed of a regex simply by looking at it.
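A small Python illustration of how much the content matters. The pattern here is a deliberately pathological textbook case, not a real SpamAssassin rule, and Python's regex engine is only standing in for Perl's, so take the timings as indicative at best.

```python
import re
import time

# Textbook backtracking-prone pattern; NOT taken from any real ruleset.
PATTERN = re.compile(r"^(a+)+$")


def time_match(text):
    """Return (matched, seconds taken) for one attempt against PATTERN."""
    start = time.perf_counter()
    matched = PATTERN.match(text) is not None
    return matched, time.perf_counter() - start


# Same regex, two inputs that differ by a single trailing character.
fast_ok, fast_secs = time_match("a" * 18)
slow_ok, slow_secs = time_match("a" * 18 + "b")

print(f"matching input:     matched={fast_ok}, {fast_secs:.6f} s")
print(f"non-matching input: matched={slow_ok}, {slow_secs:.6f} s")
```

The first input matches almost immediately; the second makes the engine try a huge number of ways to split the run of "a"s before it gives up. Nothing about the rule text changed, only the message content.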

       Loren

