On 3/25/2013 11:06 AM, Abhijeet Rastogi wrote:
> How clever would it be to deploy in production?

I've been using it for over 3 years: the original REGEXP version for a few months, then my PCRE 'version' after that. AFAIK the ISP crew who created the original have had it in production for more than 5 years. My logs show over 100 hosts performing scheduled downloads, including Barracuda Networks, though I'm sure they're doing something with it other than Postfix, either simply research or using the expressions in their AS boxen. I only find about a dozen recognizable domains pulling the file. The rest are smaller shops and individuals. Regardless, it's in production at more than a few sites.

> For every mail, checking 1600 regexes doesn't seem efficient to me.

This table is actually pretty efficient. Noel's analysis is correct. If the query string will end up matching, on average only up to about 1/3rd of the expressions will be executed per table query due to the use of conditional blocks. However, if the string is a bare IP or a non-dynamic/generic hostname, a maximum of 1-3 expressions are executed. Thus for any string that is not dynamic/generic rDNS, we skip the entire table.

Doing this is beneficial not only to avoid unneeded expression processing per query, but also because the smtpd_client_restrictions used with this table query it twice per client connection: once with the hostname string and once with the client IP address. Without the first two expressions that cause bare IPs to skip the table, and without conditional blocks, it would indeed be inefficient, and processing time per query would be substantially higher.

However, that time difference is relative to CPU horsepower. As CPUs become even faster, the execution time difference between an efficient and a non-optimized version of this table becomes compressed to the point it makes no difference at all. Note how many daemons are written in Perl today vs C, and the proliferation of PHP, Python, Java, etc. in server and web apps.
Many people stopped writing C code simply because CPUs are so fast they mask the horrible inefficiencies of runtime interpreted code.

> Will it have any significant CPU usage

No. A few tens to maybe a few hundreds of µs per query. If you're unfamiliar with the symbol, µs (microsecond) is 1 millionth of a second. For comparison, SA's Bayesian filter often exceeds my 15s time limit, but on average takes 2-5 seconds, on the order of 10,000 to 100,000 times slower than this table. I run SA post-queue so it processes only about 5% of my flow.

>> are you missing http://www.hardwarefreak.com/fqrdns.pcre ? :)

--
Stan
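
For anyone who hasn't looked inside the file, here is a rough sketch of the layout described above: an early exit for bare IPs, then conditional blocks so that most expressions are never evaluated for a given query. This is not an excerpt from fqrdns.pcre. The patterns, the example.net/example.org domains, the reject text, and the file path in the comment are all made up for illustration. Only the if/endif, DUNNO, and REJECT syntax is taken from Postfix's pcre_table(5) and access(5).

```
# Hypothetical sketch only, NOT the real fqrdns.pcre.
# Assumed usage in main.cf (path is an assumption):
#   smtpd_client_restrictions = check_client_access pcre:/etc/postfix/fqrdns.pcre

# Bare IPv4 address (the client IP lookup): stop at the first expression.
/^\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}$/ DUNNO

# Conditional block: the inner expressions run only when the guard matches.
# Lines inside the block are not indented, since leading whitespace
# would make them continuation lines.
if /\.example\.net$/
/^dsl-\d+-\d+\.example\.net$/ REJECT Generic rDNS, please relay via your ISP
/^pool-\d+-\d+-\d+-\d+\.example\.net$/ REJECT Generic rDNS, please relay via your ISP
endif

if /\.example\.org$/
/^dynamic-\d+\.example\.org$/ REJECT Generic rDNS, please relay via your ISP
/^cable-\d+-\d+\.example\.org$/ REJECT Generic rDNS, please relay via your ISP
endif
```

With a layout like this, a bare IP costs one expression, and a hostname under neither domain costs three (the IP test plus the two if guards). The inner expressions are only evaluated for hostnames that already matched a guard.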