On Thu, Mar 27, 2008 at 2:47 PM, Bram Moolenaar wrote:
>  Xiaozhou Liu wrote:
>
>  > During the development of the new regexp, one thing confuses me a lot:
>  > ordered alternation. (e.g. given r.e. 'ab\|abc' and text 'abc', 'ab'
>  > matched, not 'abc')
>  >
>  > I know that 100% compatibility is one of the project goals. So I try
>  > to keep this feature in the new regexp. But the problem is, ordered
>  > alternation is kind of 'side effect' of the original back track
>  > regexp matcher. AFAIK, It is very hard to implement this feature in
>  > the new, truly NFA matcher, if it is not impossible. We can resort
>  > to the original regexp when we see '\|',  but we don't solve the
>  > problem perfectly.
>  >
>  > So does anyone really need this feature to be kept? If so, please do
>  > tell me.
>  > For me, the removal of this 'feature' won't break anything.
>
>  It is close to impossible to check that a change like this doesn't break
>  existing scripts.  And when something breaks, e.g. a syntax file, a
>  normal user is very unlikely to be able to figure out what caused the
>  problem.
>
>  I stick to the opinion that the new regexp engine must work exactly
>  like the existing one.  Most things can be made to work that way.  I
>  also thought that this behavior of an alternate branch could be made to
>  work in a DFA, with some effort.  And otherwise we would have to fall
>  back to the old engine when there is an alternate branch in the regexp.

For what it's worth, I disagree strongly.  This behavior is nothing
but a bug in the existing implementation - a documented bug, but a bug
nonetheless.  In this particular case, I definitely think that we
should strive for compatibility with other regex engines, rather than
backwards compatibility with older vim versions.  And, since this new
regex engine would likely not be introduced until vim 8.0, there is no
better time to break backwards compatibility.  The old guarantee that
the leftmost alternative is matched first in fact makes regexes work
differently than in Perl, and in every POSIX implementation, and makes
a DFA-based regex engine harder to write.  I think we should consider
that guarantee a historical fluke, and any syntax files that rely on
it to have been too cozy with the current engine.  We should rewrite
such syntax files, rather than keep a useful improvement out of vim
for their sake.
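
To make the two behaviors concrete, here is a minimal sketch in Python.
It is not vim's engine: Python's `re` module is used only because it
happens to be a backtracking matcher that also takes the first
alternative that succeeds.  The helper names ordered_match and
longest_match are made up for this illustration, and longest_match
only compares top-level alternatives anchored at the start of the
text, which is a simplification of full POSIX leftmost-longest rules.

```python
import re


def ordered_match(alternatives, text):
    """Ordered alternation: the first alternative that matches wins."""
    # Each alternative is wrapped in a non-capturing group so that
    # joining with '|' cannot change its meaning.
    pattern = "|".join("(?:%s)" % alt for alt in alternatives)
    m = re.match(pattern, text)
    return m.group(0) if m else None


def longest_match(alternatives, text):
    """Longest-match alternation: try every alternative, keep the longest."""
    best = None
    for alt in alternatives:
        m = re.match(alt, text)
        if m and (best is None or len(m.group(0)) > len(best)):
            best = m.group(0)
    return best


# The example from the original message: 'ab\|abc' against 'abc'.
print(ordered_match(["ab", "abc"], "abc"))  # ab
print(longest_match(["ab", "abc"], "abc"))  # abc
```

A syntax file that relies on the first result would see a different
match under the second rule unless its alternatives were reordered or
made mutually exclusive.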

~Matt

--~--~---------~--~----~------------~-------~--~----~
You received this message from the "vim_dev" maillist.
For more information, visit http://www.vim.org/maillist.php
-~----------~----~----~----~------~----~------~--~---