Here's another problem with changing behavior in the new engine: we would have to modify the backtracking engine as well to prevent it from using ordered alternation. Since our new engine can't handle certain things and falls back to the old one, they must behave the same. For example, we can't have aa\|aab and \(a\)\1\|aab match different things - the horror!
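To make the mismatch concrete, here is a rough sketch in Python rather than Vim's C. It assumes Python's re module as a stand-in for a backtracking engine with ordered alternation, and it uses Python syntax (aa|aab and (a)\1|aab) instead of Vim's. The helper names ordered_match and longest_match are made up for illustration, and longest_match only approximates leftmost-longest behavior by trying each top-level alternative separately and keeping the longest result.

```python
import re


def ordered_match(pattern, text):
    """Match at the start of text using Python's backtracking engine.

    Alternatives are tried left to right and the first one that
    succeeds wins (ordered alternation).
    """
    m = re.match(pattern, text)
    return m.group(0) if m else None


def longest_match(alternatives, text):
    """Rough approximation of leftmost-longest alternation.

    Each top-level alternative is tried on its own at the start of
    text, and the longest successful match is kept.
    """
    best = None
    for alt in alternatives:
        m = re.match(alt, text)
        if m and (best is None or len(m.group(0)) > len(best)):
            best = m.group(0)
    return best


text = "aab"
print(ordered_match(r"aa|aab", text))         # aa
print(ordered_match(r"(a)\1|aab", text))      # aa
print(longest_match([r"aa", r"aab"], text))   # aab
```

If the new engine picked the longest alternative for the first pattern while the backreference pattern fell back to the backtracking engine, the two would return "aab" and "aa" on the same input, which is exactly the inconsistency described above.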
Where is Russ' code that solves this problem? I believe at one point we discussed using a simplified type of "tagged NFA" to get around this, but I can't find that conversation anywhere.

On Fri, Mar 28, 2008 at 9:53 AM, <[EMAIL PROTECTED]> wrote:
> > If the new engine is supposed to have Posix-compatible behaviour,

It's not, but I think this is a relevant issue. What about the idea of having a flag to change regexp behavior from Vim style to POSIX style (or at least a close approximation)? Would this be useful?

Let me throw out a couple of different ideas:

It's possible someone could rewrite the current engines to behave differently depending on the desired behavior. However, I don't know how much work that would be. The type of greediness should be easy to manipulate in the new engine, but I can't speak to changing the old engine, or to the other things that Vim does differently from POSIX.

Alternatively, the changes we made to the Vim code make it fairly easy to plug in different regexp engines at will - I got part of the way on a PCRE-driven Vim, although I don't believe I got certain features like multiline matching working. It should be doable, though, and it's possible that a PCRE (or another regexp engine) dependency could be included at compile time or released as an unofficial patch or something. Again, runtime behavior could be specified with a flag.

> > To show that this would make a big difference, solid benchmark data
> would be valuable. Has anyone benchmarked the new engine?

I did a couple of informal benchmarks when I was preparing for a short talk I gave at my school. The slides are at <http://vim-soc-regexp.googlecode.com/files/ThursExtra.odp>, and near the end are some graphs with the data I collected.

Short version: the pathological cases mirror Russ Cox's results pretty closely. The non-pathological cases tend to be a bit slower with the new engine than the old, but we're talking differences of a few nanoseconds in most cases.
A few cases had more substantial differences (in the order of hundreds of nanoseconds), which is the reason I believe some work should go into optimizing before we incorporate the new engine into a release.

Cheers,
Ian