Re: regexp: does anyone really need ordered alternation?

2008-03-31 Thread Ian Young
On Mon, Mar 31, 2008 at 9:04 AM, Antony Scriven wrote: On 31/03/2008, Ian Young wrote: [...] I did a couple of informal benchmarks when I was preparing for a short talk I gave at my school. The slides are at

Re: regexp: does anyone really need ordered alternation?

2008-03-30 Thread Xiaozhou Liu
On Thu, Mar 27, 2008 at 5:54 PM, Antony Scriven wrote: On 26/03/2008, Xiaozhou Liu wrote: Hi Vimmers, During the development of the new regexp, one thing confuses me a lot: ordered alternation. (e.g. given r.e. 'ab\|abc' and text

Re: regexp: does anyone really need ordered alternation?

2008-03-30 Thread Bram Moolenaar
Nikolai Weibull wrote: On Sat, Mar 29, 2008 at 8:14 PM, Bram Moolenaar wrote: Considering the recent OOXML fuss I have lowered my appreciation for standards considerably. Considering that much of what people are complaining about regarding OOXML is things that

Re: regexp: does anyone really need ordered alternation?

2008-03-30 Thread Ian Young
Here's another problem with changing behavior in the new engine: we would have to modify the backtracking engine as well to prevent it from using ordered alternation. Since our new engine can't handle certain things and falls back to the old one, they must behave the same. For example, we can't

Re: regexp: does anyone really need ordered alternation?

2008-03-29 Thread Bram Moolenaar
Matt Wozniski wrote: On Thu, Mar 27, 2008 at 2:47 PM, Bram Moolenaar wrote: Xiaozhou Liu wrote: During the development of the new regexp, one thing confuses me a lot: ordered alternation. (e.g. given r.e. 'ab\|abc' and text 'abc', 'ab' matched, not 'abc') I know that

Re: regexp: does anyone really need ordered alternation?

2008-03-29 Thread Nikolai Weibull
On Sat, Mar 29, 2008 at 8:14 PM, Bram Moolenaar wrote: Considering the recent OOXML fuss I have lowered my appreciation for standards considerably. Considering that much of what people are complaining about regarding OOXML is things that exist in OOXML due to Office's

Re: regexp: does anyone really need ordered alternation?

2008-03-28 Thread Matt Wozniski
On Thu, Mar 27, 2008 at 2:47 PM, Bram Moolenaar wrote: Xiaozhou Liu wrote: During the development of the new regexp, one thing confuses me a lot: ordered alternation. (e.g. given r.e. 'ab\|abc' and text 'abc', 'ab' matched, not 'abc') I know that 100% compatibility is one of the

Re: regexp: does anyone really need ordered alternation?

2008-03-28 Thread Matthew Winn
On Fri, 28 Mar 2008 02:09:19 -0400, Matt Wozniski wrote: For what it's worth, I disagree strongly. This behavior is nothing but a bug in the existing implementation: a documented bug, but a bug nonetheless. In this particular case, I definitely think that we should

Re: regexp: does anyone really need ordered alternation?

2008-03-28 Thread Antony Scriven
On 28/03/2008, [EMAIL PROTECTED] wrote: [...] Jeffrey Friedl discusses this in chapter 4 of his book Mastering Regular Expressions, in the section "NFA, DFA, and POSIX". [...] Jeffrey writes: If efficiency is an issue with a Traditional NFA (and with backtracking, believe

Re: regexp: does anyone really need ordered alternation?

2008-03-28 Thread Mikołaj Machowski
lurker mode off From a user's point of view: does this new way of treating \| bring speed gains? \| is one of the most time-expensive operators in Vim. If we could have an improvement here, I would say go for it. lurker mode on Piotr Żaczek na

Re: regexp: does anyone really need ordered alternation?

2008-03-28 Thread Nico Weber
Interesting selection of languages to try ;-) You may have picked the ones with a common regex code base. Russ Cox's article certainly shows similar performance/behaviour (See section and graph on Performance at http://swtch.com/~rsc/regexp/regexp1.html) I wonder what Tcl and awk do?

Re: regexp: does anyone really need ordered alternation?

2008-03-27 Thread Antony Scriven
On 26/03/2008, Xiaozhou Liu wrote: Hi Vimmers, During the development of the new regexp, one thing confuses me a lot: ordered alternation. (e.g. given r.e. 'ab\|abc' and text 'abc', 'ab' matched, not 'abc') I know that 100% compatibility is one of the project

Re: regexp: does anyone really need ordered alternation?

2008-03-27 Thread Ben Schmidt
Antony Scriven wrote: On 26/03/2008, Xiaozhou Liu wrote: Hi Vimmers, During the development of the new regexp, one thing confuses me a lot: ordered alternation. (e.g. given r.e. 'ab\|abc' and text 'abc', 'ab' matched, not 'abc') I know that 100%

Re: regexp: does anyone really need ordered alternation?

2008-03-27 Thread Antony Scriven
On 27/03/2008, Ben Schmidt wrote: Antony Scriven wrote: [...] [...] I'd prefer the longest match rather than the first alternative (as specified by POSIX) [...] An interesting twist. Can you clarify which behaviour POSIX specifies (your sentence above is

Re: regexp: does anyone really need ordered alternation?

2008-03-27 Thread Nikolai Weibull
On Thu, Mar 27, 2008 at 1:16 PM, Antony Scriven wrote: On 27/03/2008, Ben Schmidt wrote: Antony Scriven wrote: I'd prefer the longest match rather than the first alternative (as specified by POSIX) An interesting twist. Can you clarify which

Re: regexp: does anyone really need ordered alternation?

2008-03-27 Thread Antony Scriven
On 27/03/2008, Nikolai Weibull wrote: On Thu, Mar 27, 2008 at 1:16 PM, Antony Scriven wrote: On 27/03/2008, Ben Schmidt wrote: Antony Scriven wrote: I'd prefer the longest match rather than the first alternative

Re: regexp: does anyone really need ordered alternation?

2008-03-27 Thread Nikolai Weibull
On Thu, Mar 27, 2008 at 5:37 PM, Antony Scriven wrote: On 27/03/2008, Nikolai Weibull wrote: /left-most/ longest. Big difference. I thought the `left-most' part was a given and we were discussing which of the alternatives would be subsequently
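
The "left-most" versus "longest" distinction raised here can be illustrated with a small Python sketch. It is not taken from Vim or from any engine discussed in the thread; the helper name leftmost_longest is hypothetical, and the alternatives are assumed to be plain literal strings. The start position is decided first (leftmost), and only then is the longest alternative at that position chosen.

```python
import re

def leftmost_longest(alternatives, text):
    """Hypothetical helper: find the earliest start position at which any
    alternative matches, then return the longest match at that position."""
    compiled = [re.compile(alt) for alt in alternatives]
    for start in range(len(text) + 1):
        best = None
        for pattern in compiled:
            m = pattern.match(text, start)  # anchored at 'start'
            if m and (best is None or m.end() > best.end()):
                best = m
        if best is not None:
            return best.start(), best.group(0)
    return None

# 'bcd' is the longer alternative, but 'ab' starts further left, so it wins.
print(leftmost_longest(["bcd", "ab"], "abcd"))  # (0, 'ab')

# Both alternatives start at position 1; the longer one is chosen.
print(leftmost_longest(["ab", "abc"], "xabc"))  # (1, 'abc')
```

With literal alternatives this is enough to show that length only breaks ties between matches that begin at the same position.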

Re: regexp: does anyone really need ordered alternation?

2008-03-27 Thread Xiaozhou Liu
On Thu, Mar 27, 2008 at 8:39 AM, Ben Schmidt wrote: I prefer the behaviour which I presume you have in your NFA implementation, of preferring longer matches, just as * is greedy by default, so would actually welcome the change. Yes, that indeed is the behavior of the

Re: regexp: does anyone really need ordered alternation?

2008-03-27 Thread Xiaozhou Liu
On Thu, Mar 27, 2008 at 5:54 PM, Antony Scriven wrote: I thought Russ Cox had solved this in the code on his website, or am I mistaken? Thanks for the pointer, I wasn't aware of this. I'll have a look at the code. So does anyone really need this feature to be kept?

Re: regexp: does anyone really need ordered alternation?

2008-03-27 Thread Bram Moolenaar
Xiaozhou Liu wrote: During the development of the new regexp, one thing confuses me a lot: ordered alternation. (e.g. given r.e. 'ab\|abc' and text 'abc', 'ab' matched, not 'abc') I know that 100% compatibility is one of the project goals. So I try to keep this feature in the new

Re: regexp: does anyone really need ordered alternation?

2008-03-27 Thread Xiaozhou Liu
Hi, Bram On Fri, Mar 28, 2008 at 2:47 AM, Bram Moolenaar wrote: Xiaozhou Liu wrote: During the development of the new regexp, one thing confuses me a lot: ordered alternation. (e.g. given r.e. 'ab\|abc' and text 'abc', 'ab' matched, not 'abc') I know that

regexp: does anyone really need ordered alternation?

2008-03-26 Thread Xiaozhou Liu
Hi Vimmers, During the development of the new regexp, one thing confuses me a lot: ordered alternation. (e.g. given r.e. 'ab\|abc' and text 'abc', 'ab' matched, not 'abc') I know that 100% compatibility is one of the project goals. So I try to keep this feature in the new regexp. But the
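
To make the 'ab\|abc' example concrete, here is a minimal Python sketch contrasting ordered alternation with longest-match alternation. It is not Vim's code: the helper names ordered_match and longest_match are hypothetical, and the alternatives are assumed to be simple top-level branches with nothing following them. Note that Python's re module writes alternation as | where Vim writes \|.

```python
import re

def ordered_match(alternatives, text):
    """Hypothetical helper: return the match of the first alternative,
    in written order, that matches at the start of text."""
    for alt in alternatives:
        m = re.match(alt, text)
        if m:
            return m.group(0)
    return None

def longest_match(alternatives, text):
    """Hypothetical helper: try every alternative at the start of text
    and return the longest match found."""
    best = None
    for alt in alternatives:
        m = re.match(alt, text)
        if m and (best is None or len(m.group(0)) > len(best)):
            best = m.group(0)
    return best

alts = ["ab", "abc"]
print(ordered_match(alts, "abc"))  # ab
print(longest_match(alts, "abc"))  # abc

# Python's own backtracking engine also uses ordered alternation:
print(re.match("ab|abc", "abc").group(0))  # ab
```

Reordering the alternatives to ["abc", "ab"] makes both helpers return abc, which is the usual workaround when ordered alternation is in effect.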

Re: regexp: does anyone really need ordered alternation?

2008-03-26 Thread Ben Schmidt
Xiaozhou Liu wrote: Hi Vimmers, During the development of the new regexp, one thing confuses me a lot: ordered alternation. (e.g. given r.e. 'ab\|abc' and text 'abc', 'ab' matched, not 'abc') I know that 100% compatibility is one of the project goals. So I try to keep this feature in