On 3/25/2013 11:06 AM, Abhijeet Rastogi wrote:
> How clever would it be to deploy in production?

I've been using it for over 3 years, the original REGEXP version for a
few months, then my PCRE 'version' after that.  AFAIK the ISP crew who
created the original have had it in production for more than 5 years.
My logs show over 100 hosts performing scheduled downloads, including
Barracuda Networks, though I'm sure they're doing something with it
other than Postfix, either simply research, or using the expressions in
their anti-spam (AS) boxen.  I only find about a dozen recognizable
domains pulling
the file.  The rest are smaller shops and individuals.  Regardless, it's
in production at more than a few sites.

> For every mail, checking 1600 regexes doesn't seem efficient to me.

This table is actually pretty efficient.  Noel's analysis is correct.
If the query string ends up matching, on average only up to about a
third of the expressions are executed per table query, thanks to the
conditional blocks.  If the string is a bare IP, or a hostname that is
neither dynamic nor generic, at most 1-3 expressions are executed.
Thus for any string that is not dynamic/generic rDNS, we skip
essentially the entire table.
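
A rough way to picture the early exits and conditional blocks described
above is the Python sketch below.  It is only an illustration: the
patterns, the isp-a.example/isp-b.example domains, the TABLE layout and
the lookup() helper are all made up for this example and are not taken
from fqrdns.pcre, and Python's re module stands in for PCRE.  It counts
how many expressions are evaluated for a given lookup key.

```python
import re

# Hypothetical table, NOT the real fqrdns.pcre contents.
# ("rule", pattern, action)        -> plain table line
# ("block", guard, [nested rules]) -> if /guard/ ... endif
TABLE = [
    ("rule", r"^\d{1,3}(\.\d{1,3}){3}$", "DUNNO"),  # bare IPv4: stop here
    ("rule", r"^[a-z.-]+$", "DUNNO"),               # no digits: stop here
    ("block", r"\.isp-a\.example$", [
        ("rule", r"^dsl-\d+-\d+-\d+-\d+\.", "REJECT generic rDNS"),
        ("rule", r"^cable-\d+\.", "REJECT generic rDNS"),
    ]),
    ("block", r"\.isp-b\.example$", [
        ("rule", r"^host\d+\.", "REJECT generic rDNS"),
        ("rule", r"^pool-\d+-\d+\.", "REJECT generic rDNS"),
    ]),
]


def lookup(key, table=TABLE):
    """Return (action, number of expressions evaluated) for one query."""
    executed = 0

    def walk(entries):
        nonlocal executed
        for kind, pattern, payload in entries:
            executed += 1
            matched = re.search(pattern, key, re.IGNORECASE) is not None
            if not matched:
                continue  # a failed guard skips its whole block
            if kind == "rule":
                return payload  # first match wins
            result = walk(payload)  # guard matched: try the nested rules
            if result is not None:
                return result
        return None

    return walk(table), executed


print(lookup("192.0.2.10"))            # ('DUNNO', 1)
print(lookup("mail.example.org"))      # ('DUNNO', 2)
print(lookup("host42.isp-b.example"))  # ('REJECT generic rDNS', 5)
```

In the last lookup the two isp-a rules cost a single guard evaluation
instead of two.  With dozens of expressions behind each guard, that
saving is what keeps the per-query count low.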

This is beneficial not only because it avoids unneeded expression
processing per query, but also because the smtpd_client_restrictions
entry that uses this table queries it twice per client connection: once
with the client hostname and once with the client IP address.  Without
the first two expressions, which cause bare IPs to skip the table, and
without the conditional blocks, it would indeed be inefficient, and
processing time per query would be substantially higher.

However, the size of that time difference depends on CPU horsepower.
As CPUs become even faster, the execution time difference between an
efficient and a non-optimized version of this table shrinks to the
point that it makes no practical difference.  Note how many daemons are
written in Perl today vs C, and the proliferation of PHP, Python, Java,
etc in server and web apps.  Many people stopped writing C code simply
because CPUs are so fast they mask the horrible inefficiencies of
runtime interpreted code.

> Will it have any significant CPU usage

No.  A few tens to maybe a few hundreds of µs per query.  If you're
unfamiliar with the symbol, µs (microsecond) is 1 millionth of a second.
For comparison, SpamAssassin's (SA's) Bayesian filter often exceeds my
15s time limit, but on average takes 2-5 seconds, which is tens to
hundreds of thousands of times slower than this table.  I run SA
post-queue, so it processes only about 5% of my flow.
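
To put those figures side by side, here is a small Python sketch of the
arithmetic.  The slowdown() helper is made up for this example, and
taking 10 µs and 100 µs as the ends of the per-query range is an
assumption, since the numbers above are only approximate.

```python
def slowdown(slow_seconds, fast_microseconds):
    """How many times longer the slow operation takes than the fast one."""
    return slow_seconds * 1_000_000 / fast_microseconds


# 2-5 s average for the Bayesian filter vs an assumed 10-100 µs per query.
print(slowdown(2, 100))  # 20000.0
print(slowdown(5, 10))   # 500000.0
# Worst case: the 15 s time limit against the fastest assumed query.
print(slowdown(15, 10))  # 1500000.0
```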

>> are you missing http://www.hardwarefreak.com/fqrdns.pcre ? :)

-- 
Stan
