Re: speculation about remaping

Leif Hedstrom Tue, 15 Jan 2019 09:21:50 -0800

> On Jan 15, 2019, at 10:09 AM, Walt Karas <wka...@oath.com.INVALID> wrote:
> 
> When we remap we go through the list of regexes looking for a match.
> What if we took all the regexes, removed subpattern captures, and
> created one giant pattern with alternation and capture of the regex
> for each rule, like:
> 
> (regex1)|(regex2)|...|(regexN)
> 
> We could then just step through the array of subpattern capture
> offsets to find the matching rule.  Then do a second match against
> just the original rule regex to get any captures it contains.  Would
> this work, and would it be faster?


That could potentially be a lot slower, no? Imagine you have a remap today with

        regex_map <regex1> ...
        regex_map <regex2> ...
        regex_map <regex3> ...
        regex_map <regex4> ...


If you know what you are doing, and have some luck, you can order these so
that the most commonly matched regex comes early. Since the first match wins,
you can stop evaluating regexes as soon as one matches. Even in the average
case, you'd evaluate only about half of the regexes, right?
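
To make the first-match-wins behavior concrete, here is a minimal Python sketch. It is not the actual remap code: the rule patterns, the target names, and the first_match helper are all hypothetical, chosen only to show that evaluation stops at the first rule that matches.

```python
import re

# Hypothetical rules in priority order: (compiled regex, target name).
# None of these patterns or targets come from a real remap.config.
RULES = [
    (re.compile(r"^/images/(.*)$"), "img-backend"),
    (re.compile(r"^/api/v(\d+)/(.*)$"), "api-backend"),
    (re.compile(r"^/(.*)$"), "default-backend"),
]


def first_match(path):
    """Try each rule in order and stop at the first one that matches.

    Returns (rule index, target, captures, number of regexes tried),
    or None if no rule matches.
    """
    for tried, (pattern, target) in enumerate(RULES, start=1):
        m = pattern.match(path)
        if m:
            return tried - 1, target, m.groups(), tried
    return None
```

With this ordering, a request for /images/a.png is resolved after trying a single regex, while a path that only the last rule accepts costs three attempts.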

But if you combine all of these into one regex, you essentially have to
evaluate all of the regexes all the time, regardless of order and regardless
of whether something has already matched.
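
For comparison, the combined-pattern proposal could be sketched in Python roughly as follows. This is an illustration under assumptions, not a proposed implementation: the rules are hypothetical, the capture-free variants are written by hand rather than derived automatically, and combined_match is a made-up helper name.

```python
import re

# Hypothetical rules in priority order: (original regex, capture-free variant).
# The capture-free variants are hand-written here; stripping captures
# automatically from arbitrary regexes is not attempted in this sketch.
RULES = [
    (r"^/images/(.*)$", r"^/images/.*$"),
    (r"^/api/v(\d+)/(.*)$", r"^/api/v\d+/.*$"),
    (r"^/(.*)$", r"^/.*$"),
]

# One big alternation with exactly one capture group per rule.
COMBINED = re.compile("|".join(f"({stripped})" for _, stripped in RULES))
ORIGINALS = [re.compile(original) for original, _ in RULES]


def combined_match(path):
    """Find the matching rule via the combined pattern, then re-match
    against that rule's original regex to recover its captures.

    Returns (rule index, captures), or None if no rule matches.
    """
    m = COMBINED.match(path)
    if m is None:
        return None
    # Only the group of the alternative that matched is set (not None).
    index = next(i for i, g in enumerate(m.groups()) if g is not None)
    return index, ORIGINALS[index].match(path).groups()
```

How much work the combined pattern costs per request depends on the regex engine, so this sketch only shows the mechanics of the two-step lookup; whether it is faster or slower than stepping through the rules would have to be measured.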

— Leif