On Fri, Sep 18, 2009 at 05:17:39PM -0400, Miguel Pilar Vilagran wrote:
> Regardless of how big a cluster...@! This seems to be, you're much, much
> better off using a regex.
Well, I don't agree. It depends on the ability to factor the rules, and of
course on the regex library. Having a lot of patterns in a single ACL is
pretty fast: they're all just checked in a loop, and haproxy knows where the
end is. Also, one thing that can be done is to enumerate them from the most
common to the least common.

Using a regex, you'd need to match a large set of values prefixed with '.*'
(bad) and separated by '|'. I think the regex might help when all values have
a large common prefix, because then they will be evaluated as a tree. But
here it's not the case at all. If someone is interested in doing the test,
I'd really suggest building with libpcre, which is blazingly fast.

Oh, I've just run this ACL in a test config as-is (without regex). Without
the ACL, the test runs at 25900 hits/s. With the ACL, the performance drops
to 25660 hits/s, which means approximately a 1% performance hit. I think that
is pretty much acceptable.

I have tried to add the ACL 10 times (450 patterns) to get a better figure.
Performance drops to 24500 hits/s, or about 5.4%. That's about 2.2 us of
extra time per request for 450 patterns, which translates into 4.9 ns or
16 CPU cycles per pattern (my machine is a C2D at 3.2 GHz). That means it
could test about 200 million patterns per second. I don't think we can reach
that level with a complex regex.

> acl url_block path_end .ad .adprototype .asa .asax .ascx .axd .browser .cd
> .cdx .cer .compiled .config .cs .csproj .dd .exclude .idc .java .jsl .ldb
> .ldd .lddprototype .ldf .licx .master .mdb .mdf .msgx .phps .refresh .rem
> .resources .resx .sd .sdm .sdmDocument .sitemap .skin .soap .svc .vb .vbproj
> .vjsproj .vsdisco .webinfo

Regards,
Willy
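For anyone who wants to re-check the per-pattern figures, the arithmetic can be reproduced with a short script. This is only a sketch built from the numbers quoted in this mail; the constant and function names are made up for illustration and have nothing to do with haproxy's own code.

```python
# Re-derive the per-pattern cost from the benchmark figures quoted in the mail.
# All names below are illustrative assumptions, not haproxy identifiers.

BASELINE_HITS = 25900   # hits/s without the ACL
ONE_ACL_HITS = 25660    # hits/s with the ACL once (45 patterns)
TEN_ACL_HITS = 24500    # hits/s with the ACL repeated 10 times (450 patterns)
PATTERNS = 450          # 45 patterns x 10 copies
CPU_HZ = 3.2e9          # Core 2 Duo at 3.2 GHz


def drop_percent(baseline, measured):
    """Relative throughput loss, in percent."""
    return (baseline - measured) / baseline * 100.0


def extra_time_per_request(baseline, measured):
    """Extra seconds spent per request once the ACL is added."""
    return 1.0 / measured - 1.0 / baseline


extra = extra_time_per_request(BASELINE_HITS, TEN_ACL_HITS)
per_pattern = extra / PATTERNS          # seconds per pattern test
cycles = per_pattern * CPU_HZ           # CPU cycles per pattern test
rate = 1.0 / per_pattern                # pattern tests per second

print("drop with 45 patterns : %.2f%%" % drop_percent(BASELINE_HITS, ONE_ACL_HITS))
print("drop with 450 patterns: %.2f%%" % drop_percent(BASELINE_HITS, TEN_ACL_HITS))
print("extra time per request: %.2f us" % (extra * 1e6))
print("time per pattern      : %.2f ns" % (per_pattern * 1e9))
print("cycles per pattern    : %.1f" % cycles)
print("patterns per second   : %.0f million" % (rate / 1e6))
```

The extra time per request is taken as the difference between the two per-request durations (1/24500 s minus 1/25900 s), which attributes the whole slowdown to the ACL.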

