Re: [patch] switch nfa regexp engine to bt engine when too many states are encountered

Bram Moolenaar Thu, 09 Oct 2014 04:12:49 -0700

Christian wrote:

> > > Bram,
> > > here is patch, that will switch from the NFA engine to the BT regular 
> > > expression engine when there are too many states encountered. I have 
> > > tested it on some input on which the NFA engine was slow (issue #164) 
> > > and it worked. I made it so, that when verbose is set, Vim outputs 
> > > whenever it switches the engines. This will however only be done, when 
> > > 're' is set to 0
> > > 
> > > I am however not sure, what a good measure is, when to abort the NFA 
> > > engine and switch to the old engine, but that could easily be switched.
> > 
> > Interesting idea.  I think the limit on the number of states is just a
> > first guess.  Some states are cheap (e.g. matching one specific
> > character), some are expensive (e.g. when doing recursive stuff). At
> > some point this can be tuned.  Please move the decision to one central
> > function.
> 
> Which decision? It is already at one central state (though not in a 
> separate function, but that would just change one if condition by 
> another).


You check for NFA_TOO_MANY_STATES in a few places, that name is
misleading.  Perhaps NFA_TOO_EXPENSIVE?  That might be decided on
something else than the number of states.

> > Some things in the BT engine are also expensive, such as two multis in a
> > row: ".*a*".  There are many alternatives to try then.
> > 
> > I consider this a start to make the choice of engine more clever.
> > We also need some kind of benchmark to be able to tune the choice.
> 
> Updated patch includes a test, that creates a benchmark.out text file 
> which uses issue# 164 provided file and pattern to measure performance 
> of each 're' engine. Please note, it uses a search() call with a timout 
> value of 10,000 milliseconds to avoid having the benchmark run forever. 
> Unfortunately, that parameter does not help at all. We might have 
> another bug here. Here is some sample output for my machine:

The time limit is not implemented for the NFA engine.  Would be useful
to have.

I think the benchmark should be run separately, not as part of the
regular tests.

I hope we can make the benchmarks use specific RE features, not
complicated real-world patterns.  So that we can pinpoint what the
problematic feature is and possibly work on fixing that.


-- 
A man is incomplete until he's married ... and then he's finished!

 /// Bram Moolenaar -- [email protected] -- http://www.Moolenaar.net   \\\
///        sponsor Vim, vote for features -- http://www.Vim.org/sponsor/ \\\
\\\  an exciting new programming language -- http://www.Zimbu.org        ///
 \\\            help me help AIDS victims -- http://ICCF-Holland.org    ///

-- 
-- 
You received this message from the "vim_dev" maillist.
Do not top-post! Type your reply below the text you are replying to.
For more information, visit http://www.vim.org/maillist.php

--- 
You received this message because you are subscribed to the Google Groups 
"vim_dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
For more options, visit https://groups.google.com/d/optout.

Re: [patch] switch nfa regexp engine to bt engine when too many states are encountered

Raspunde prin e-mail lui