Here's another problem with changing behavior in the new engine: we
would have to modify the backtracking engine as well to prevent it
from using ordered alternation.  Since our new engine can't handle
certain things and falls back to the old one, they must behave the
same. For example, we can't have aa\|aab and \(a\)\1\|aab match
different things - the horror!
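
To make the mismatch concrete, here is a minimal Python sketch (not Vim code) of the two alternation policies. The helper names `ordered_match` and `longest_match` are made up for illustration, and each alternative is matched with Python's `re`, so Vim's `aa\|aab` is passed as the list `["aa", "aab"]` and `\(a\)\1` is written `(a)\1`. It only simulates the policy at the top-level alternation and is not how either Vim engine is implemented.

```python
import re


def ordered_match(alternatives, text):
    """Ordered alternation: the first alternative that matches wins."""
    for alt in alternatives:
        m = re.match(alt, text)
        if m:
            return m.group(0)
    return None


def longest_match(alternatives, text):
    """Leftmost-longest: the longest match among all alternatives wins."""
    found = []
    for alt in alternatives:
        m = re.match(alt, text)
        if m:
            found.append(m.group(0))
    return max(found, key=len) if found else None


if __name__ == "__main__":
    text = "aab"
    # Same input, two policies: these disagree on "aa" versus "aab".
    print(ordered_match(["aa", "aab"], text))
    print(longest_match(["aa", "aab"], text))
    # With a backreference, which would force the fallback engine:
    print(ordered_match([r"(a)\1", "aab"], text))
```

If one engine followed the first policy and the other the second, the two patterns in the example above would match different text on the same input, depending only on which engine happened to handle them.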

Where is Russ' code that solves this problem?  I believe at one point
we discussed using a simplified type of "tagged NFA" to get around
this, but I can't find that conversation anywhere.

On Fri, Mar 28, 2008 at 9:53 AM,  <[EMAIL PROTECTED]> wrote:
>
>  If the new engine is supposed to have Posix-compatible behaviour,

It's not, but I think this is a relevant issue.  What about the idea
of having a flag to change regexp behavior from Vim style to POSIX
style (or at least a close approximation)?  Would this be useful?

Let me throw a couple different ideas out:

It's possible someone could rewrite the current engines to behave
differently depending on the desired behavior.  However, I don't know
how much work that would be.  The type of greediness should be easy to
manipulate in the new engine, but I can't speak to changing the old
engine, or to the other things that Vim does differently from POSIX.

Alternatively, the changes we made to Vim code make it fairly easy to
plug in different regexp engines at will - I got part of the way on a
PCRE-driven Vim, although I don't believe I got certain features like
multiline matching working.  It should be doable, though, and it's
possible that a PCRE (or another regexp engine) dependency could be
included at compile time or released as an unofficial patch or
something.  Again, runtime behavior could be specified with a flag.

>
>  To show that this would make a big difference, solid benchmark data
>  would be valuable. Has anyone benchmarked the new engine?

I did a couple informal benchmarks when I was preparing for a short talk I
gave at my school.  The slides are at
<http://vim-soc-regexp.googlecode.com/files/ThursExtra.odp>, and near
the end are some graphs with the data I collected.  Short version: the
pathological cases mirror Russ Cox's results pretty closely. The
non-pathological cases tend to be a bit slower with the new engine
than the old, but we're talking differences of a few nanoseconds in
most cases.  A few cases had more substantial differences (on the
order of hundreds of nanoseconds), which is the reason I believe some
work should go into optimizing before we incorporate the new engine
into a release.
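
For anyone who wants to reproduce the flavor of the pathological case, here is a rough Python sketch. It assumes the well-known pattern from Russ Cox's article, a?^n a^n matched against a^n; the benchmarks in the slides may have used other inputs. It times Python's own backtracking `re` module, not either Vim engine, and the function names are hypothetical.

```python
import re
import time


def pathological(n):
    """Build the pattern a?^n a^n and the text a^n for a given n."""
    pattern = "a?" * n + "a" * n
    text = "a" * n
    return pattern, text


def time_match(n):
    """Return (matched, seconds) for one match attempt of size n."""
    pattern, text = pathological(n)
    compiled = re.compile(pattern)
    start = time.perf_counter()
    matched = compiled.match(text) is not None
    elapsed = time.perf_counter() - start
    return matched, elapsed


if __name__ == "__main__":
    # Keep n small: a backtracking matcher can slow down sharply as n grows.
    for n in (5, 10, 15, 20):
        matched, elapsed = time_match(n)
        print(n, matched, f"{elapsed:.6f}s")
```

The interesting number is how the time grows with n rather than any single measurement; an automaton-based matcher should grow far more slowly on the same inputs.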


Cheers,
Ian

--~--~---------~--~----~------------~-------~--~----~
You received this message from the "vim_dev" maillist.
For more information, visit http://www.vim.org/maillist.php
-~----------~----~----~----~------~----~------~--~---
