> On Apr 4, 2016, at 9:21 PM, Devon H. O'Dell <[email protected]> wrote:
> 
> ## PCRE
> 
> There are other things we've done (like optimizing regexes that are
> obviously prefix and suffix matches -- turns out lots of people write
> things like `if (req.http.x-foo ~ "^bar.*$")` that are effectively `if
> (strncmp(req.http.x-foo, "bar", 3) == 0)` because it's easy), but I don't
> see those as being as high priority for upstream; they're largely
> issues for our multi-tenant use case. We have done this already;
> another thing we would like to do is to check regexes for things like
> backtracking and use DFA-based matching where possible. In the flame
> graph screenshot, the obvious VRT functions are PCRE.
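
To make the prefix case concrete, here is a rough sketch of that kind of
rewrite. It is in Python only for brevity, and it is not the code either of us
runs: the names `literal_prefix` and `matches` are made up for illustration,
and it assumes the subject is a header value with no embedded newlines (so
that `.*$` really does mean "anything up to the end").

```python
import re

# Characters that carry special meaning in PCRE-style patterns.
_META = set(".^$*+?()[]{}|\\")

def literal_prefix(pattern):
    """Return the literal if pattern is ^literal, ^literal.* or ^literal.*$.

    Returns None for anything else, so the caller falls back to the regex.
    """
    if not pattern.startswith("^"):
        return None
    body = pattern[1:]
    # Strip a trailing "match the rest" tail, if there is one.
    for tail in (".*$", ".*"):
        if body.endswith(tail):
            body = body[: -len(tail)]
            break
    # Be conservative: any remaining metacharacter means "not a plain prefix".
    if not body or any(c in _META for c in body):
        return None
    return body

def matches(pattern, subject):
    """Use a plain prefix comparison when possible, else the regex engine."""
    prefix = literal_prefix(pattern)
    if prefix is not None:
        return subject.startswith(prefix)
    return re.search(pattern, subject) is not None
```

With this, `matches("^bar.*$", "barbaz")` never touches the regex engine,
while a pattern such as `ba[rz]` still goes through `re.search`.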

You might be interested in this, although it's new as can be (just today tagged 
as v0.1) -- a VMOD to access Google's RE2 regular expression lib:

https://code.uplex.de/uplex-varnish/libvmod-re2

For those not familiar with RE2: it limits the syntax so that patterns describe 
regular languages in the strictly formal sense. Most notably, backreferences 
within a pattern are not allowed. That means the matcher can run as a DFA or 
NFA, there is never any backtracking, and the time required for a match is 
always linear in the length of the string to be matched.
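
To illustrate the idea (this is not how RE2 is actually implemented, just a 
toy sketch in Python): an NFA can be simulated by tracking the set of states 
it could be in after each input character. Each character is looked at once, 
so the work is bounded by string length times number of states. The NFA below 
is hand-built for the pattern `^(a|aa)*b$`, which a naive backtracking matcher 
can take exponential time to reject on a long run of `a` with no `b`. The 
names `nfa_match` and `TRANSITIONS` are invented for this example.

```python
def nfa_match(transitions, start, accepting, text):
    """Simulate an NFA by tracking the set of live states; no backtracking."""
    states = {start}
    for ch in text:
        # Advance every live state on this character in a single pass.
        states = {nxt for s in states for nxt in transitions.get((s, ch), ())}
        if not states:
            return False  # no live state left, the match cannot succeed
    return bool(states & accepting)

# Hand-built NFA for the anchored pattern ^(a|aa)*b$ (an assumption of this
# sketch; a real engine would compile it from the pattern).
#   state 0: start of each loop iteration
#   state 1: seen the first 'a' of an "aa" alternative
#   state 2: accepting, reached after the final 'b'
TRANSITIONS = {
    (0, "a"): {0, 1},
    (1, "a"): {0},
    (0, "b"): {2},
}

def match_example(text):
    """Match text against ^(a|aa)*b$ using the state-set simulation."""
    return nfa_match(TRANSITIONS, 0, {2}, text)
```

Calling `match_example("a" * 10000)` makes one pass over the input and 
returns, since at most two states are live at any point.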

So far this is just a proof of concept, and I haven't done any performance 
testing. From the documentation, I suspect that there are certain kinds of use 
cases for Varnish where RE2 would perform better than PCRE, and many cases 
where it doesn't make much difference (such as the prefix or suffix matches you 
mentioned). But that's all speculation until it's been tested.


Best,
Geoff


