Hi Bram!

On Do, 09 Okt 2014, Bram Moolenaar wrote:

> Christian wrote:
> 
> > > > Bram,
> > > > here is patch, that will switch from the NFA engine to the BT regular 
> > > > expression engine when there are too many states encountered. I have 
> > > > tested it on some input on which the NFA engine was slow (issue #164) 
> > > > and it worked. I made it so, that when verbose is set, Vim outputs 
> > > > whenever it switches the engines. This will however only be done, when 
> > > > 're' is set to 0
> > > > 
> > > > I am however not sure, what a good measure is, when to abort the NFA 
> > > > engine and switch to the old engine, but that could easily be switched.
> > > 
> > > Interesting idea.  I think the limit on the number of states is just a
> > > first guess.  Some states are cheap (e.g. matching one specific
> > > character), some are expensive (e.g. when doing recursive stuff). At
> > > some point this can be tuned.  Please move the decision to one central
> > > function.
> > 
> > Which decision? It is already at one central state (though not in a 
> > separate function, but that would just change one if condition by 
> > another).
> 
> You check for NFA_TOO_MANY_STATES in a few places, that name is
> misleading.  Perhaps NFA_TOO_EXPENSIVE?  That might be decided on
> something else than the number of states.

I'll update the patch later.

> > > Some things in the BT engine are also expensive, such as two multis in a
> > > row: ".*a*".  There are many alternatives to try then.
> > > 
> > > I consider this a start to make the choice of engine more clever.
> > > We also need some kind of benchmark to be able to tune the choice.
> > 
> > Updated patch includes a test, that creates a benchmark.out text file 
> > which uses issue# 164 provided file and pattern to measure performance 
> > of each 're' engine. Please note, it uses a search() call with a timout 
> > value of 10,000 milliseconds to avoid having the benchmark run forever. 
> > Unfortunately, that parameter does not help at all. We might have 
> > another bug here. Here is some sample output for my machine:
> 
> The time limit is not implemented for the NFA engine.  Would be useful
> to have.

Oh, I didn't know that. That should be documented.

> I think the benchmark should be run separately, not as part of the
> regular tests.

That was how I made it. You basically do make bench and it would run all 
defined benchmark. Idea was to drop sample files into the directory 
sample add a line in test_zzz_benchmark_re.in with a sample regexp and 
run make bench which would then run that benchmark with the output I 
showed last time.

> I hope we can make the benchmarks use specific RE features, not
> complicated real-world patterns.  So that we can pinpoint what the
> problematic feature is and possibly work on fixing that.

We could do that as well. But if we want to implement something like 
switching the 're' engine automatically, we should also have tests for 
that. And in my patch, one could see that.

Best,
Christian
-- 
Einfingersuchsystem:

Oder System "Columbus": Jede Taste eine Entdeckung. 
Oder "Terroristensystem": Jederzeit mit einem Anschlag zu rechnen.
Oder "Adlersuchsystem": Erst kreisen und dann blitzschnell zustoßen.

-- 
-- 
You received this message from the "vim_dev" maillist.
Do not top-post! Type your reply below the text you are replying to.
For more information, visit http://www.vim.org/maillist.php

--- 
You received this message because you are subscribed to the Google Groups 
"vim_dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
For more options, visit https://groups.google.com/d/optout.

Raspunde prin e-mail lui