Re: Pathological regular expression

2009-04-17 Thread franck
g a pathological regular expression, but rather a problem of the RE engine, which is not optimized to use the faster approach when possible. This is a well-known problem, very well explained at: http://swtch.com/~rsc/regexp/regexp1.html Cheers, Franck -- http://mail.python.org/mailman/listinfo/python-list
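A rough way to see why a backtracking engine can take so long on nested quantifiers: the sketch below counts how many ways a run of n identical characters can be divided into groups by a pattern shaped like `(a+)+`. That pattern is a minimal stand-in chosen for illustration, not one of the patterns from this thread, and `count_splits` is a hypothetical helper name. When the overall match fails, a backtracking matcher may end up trying every one of these divisions.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def count_splits(n):
    """Count the ways to cut a run of n characters into non-empty groups.

    Each way corresponds to one path a backtracking matcher could take
    through a pattern shaped like (a+)+ over that run.
    """
    if n == 0:
        return 1  # nothing left to cut: exactly one (empty) way
    # The first group takes k characters, the rest is cut recursively.
    return sum(count_splits(n - k) for k in range(1, n + 1))


for n in (5, 10, 20, 40):
    print(n, count_splits(n))
```

The count doubles with every extra character, which is consistent with a search that finishes instantly on short lines and runs for hours on slightly longer ones.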

Re: Pathological regular expression

2009-04-11 Thread John Machin
On Apr 12, 10:31 am, Steven D'Aprano wrote: > On Sat, 11 Apr 2009 16:46:20 -0700, John Machin wrote: > > On Apr 12, 3:40 am, Steven D'Aprano > cybersource.com.au> wrote: > >> On Sat, 11 Apr 2009 08:40:03 -0700, John Machin wrote: > >> >> To my mind, this is a bug in the RE engine. Is there any re

Re: Pathological regular expression

2009-04-11 Thread Aaron Brady
On Apr 11, 7:31 pm, Steven D'Aprano wrote: _ > My original test has now been running for close to ten hours now, and > still can't be interrupted with ctrl-C. However that's in Python 2.5, > having tried it in Python 2.6.2 they can be interrupted, so I'm satisfied > that this bug of "regex hangs t

Re: Pathological regular expression

2009-04-11 Thread Steven D'Aprano
On Sat, 11 Apr 2009 16:46:20 -0700, John Machin wrote: > On Apr 12, 3:40 am, Steven D'Aprano cybersource.com.au> wrote: >> On Sat, 11 Apr 2009 08:40:03 -0700, John Machin wrote: >> >> To my mind, this is a bug in the RE engine. Is there any reason to >> >> not treat it as a bug? >> >> > IMHO it's

Re: Pathological regular expression

2009-04-11 Thread John Machin
On Apr 12, 9:46 am, John Machin wrote: >             result = _re_comments_nc.sub(r"\1", line) s/_nc// ... that's an artifact of timing it with Non-Capture groups (?:blahblah) on the two internal groups that don't need to be capturing (results identical, and no perceptible effect on running time

Re: Pathological regular expression

2009-04-11 Thread John Machin
On Apr 12, 3:40 am, Steven D'Aprano wrote: > On Sat, 11 Apr 2009 08:40:03 -0700, John Machin wrote: > >> To my mind, this is a bug in the RE engine. Is there any reason to not > >> treat it as a bug? > > > IMHO it's not a bug -- s/hang/takes a long time to compute/ > > > Just look at it: 2 + opera

Re: Pathological regular expression

2009-04-11 Thread MRAB
Steven D'Aprano wrote: On Sat, 11 Apr 2009 08:40:03 -0700, John Machin wrote: To my mind, this is a bug in the RE engine. Is there any reason to not treat it as a bug? IMHO it's not a bug -- s/hang/takes a long time to compute/ Just look at it: 2 + operators and 3 * operators ... It's one of

Re: Pathological regular expression

2009-04-11 Thread Aaron Brady
On Apr 11, 12:40 pm, Steven D'Aprano wrote: > On Sat, 11 Apr 2009 08:40:03 -0700, John Machin wrote: > >> To my mind, this is a bug in the RE engine. Is there any reason to not > >> treat it as a bug? > > > IMHO it's not a bug -- s/hang/takes a long time to compute/ > > > Just look at it: 2 + oper

Re: Pathological regular expression

2009-04-11 Thread Dotan Cohen
> Well, it's been running now for about two and a half hours, that's a > rather long lunch. I'd also like a pony! -- Dotan Cohen http://what-is-what.com http://gibberish.co.il

Re: Pathological regular expression

2009-04-11 Thread Steven D'Aprano
On Sat, 11 Apr 2009 08:40:03 -0700, John Machin wrote: >> To my mind, this is a bug in the RE engine. Is there any reason to not >> treat it as a bug? > > IMHO it's not a bug -- s/hang/takes a long time to compute/ > > Just look at it: 2 + operators and 3 * operators ... It's one of those > "com

Re: Pathological regular expression

2009-04-11 Thread Aaron Brady
On Apr 11, 10:07 am, Steven D'Aprano wrote: > On Thu, 09 Apr 2009 02:56:00 -0700, David Liang wrote: > > Hi all, > > I'm having a weird problem with a regular expression (tested in 2.6 and > > 3.0): > > > Basically, any of these: > > _re_comments = re.compile(r'^(([^\\]+|\\.|"([^"\\]+|\\.)*")*)#.*

Re: Pathological regular expression

2009-04-11 Thread MRAB
Dotan Cohen wrote: IMHO it's not a bug -- s/hang/takes a long time to compute/ That is quite what a hang is, and why the timeout was invented. The real bug is that there is no timeout mechanism. I wouldn't call it a "hang" because it is actually doing work. If it was 'stuck' on a certain pa

Re: Pathological regular expression

2009-04-11 Thread Dotan Cohen
> IMHO it's not a bug -- s/hang/takes a long time to compute/ > That is quite what a hang is, and why the timeout was invented. The real bug is that there is no timeout mechanism. > Just look at it: 2 + operators and 3 * operators ... It's one of those > "come back after lunch" REs. > Some user
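The standard `re` functions take no timeout argument, so one possible workaround is to run the search in a child process and kill it if it overruns. The following is only a sketch under that assumption: `search_with_timeout` and `_CHILD` are hypothetical names, not part of any library, and the helper reports only whether a match was found.

```python
import subprocess
import sys

# Program run by the child interpreter: the pattern and the text arrive
# as command-line arguments, and the child prints 1 or 0.
_CHILD = (
    "import re, sys\n"
    "print(1 if re.search(sys.argv[1], sys.argv[2]) else 0)\n"
)


def search_with_timeout(pattern, text, timeout):
    """Return True or False for a match, or None if the search timed out.

    Hypothetical helper for illustration; not part of the re module.
    """
    try:
        done = subprocess.run(
            [sys.executable, "-c", _CHILD, pattern, text],
            capture_output=True,
            text=True,
            timeout=timeout,
            check=True,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising this exception.
        return None
    return done.stdout.strip() == "1"
```

Starting an interpreter per search is expensive, so this only makes sense where an untrusted or suspect pattern must not be allowed to stall the calling program.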

Re: Pathological regular expression

2009-04-11 Thread John Machin
On Apr 12, 1:07 am, Steven D'Aprano wrote: > On Thu, 09 Apr 2009 02:56:00 -0700, David Liang wrote: > > Hi all, > > I'm having a weird problem with a regular expression (tested in 2.6 and > > 3.0): > > > Basically, any of these: > > _re_comments = re.compile(r'^(([^\\]+|\\.|"([^"\\]+|\\.)*")*)#.*$

Re: Pathological regular expression

2009-04-11 Thread MRAB
Steven D'Aprano wrote: On Thu, 09 Apr 2009 02:56:00 -0700, David Liang wrote: Hi all, I'm having a weird problem with a regular expression (tested in 2.6 and 3.0): Basically, any of these: _re_comments = re.compile(r'^(([^\\]+|\\.|"([^"\\]+|\\.)*")*)#.*$') _re_comments = re.compile(r'^(([^#]+|

Re: Pathological regular expression

2009-04-11 Thread Steven D'Aprano
On Thu, 09 Apr 2009 02:56:00 -0700, David Liang wrote: > Hi all, > I'm having a weird problem with a regular expression (tested in 2.6 and > 3.0): > > Basically, any of these: > _re_comments = re.compile(r'^(([^\\]+|\\.|"([^"\\]+|\\.)*")*)#.*$') > _re_comments = re.compile(r'^(([^#]+|\\.|"([^"\\]

Re: Pathological regular expression

2009-04-09 Thread David Liang
On Apr 9, 2:56 am, David Liang wrote: > Hi all, > I'm having a weird problem with a regular expression (tested in 2.6 > and 3.0): > > Basically, any of these: > _re_comments = re.compile(r'^(([^\\]+|\\.|"([^"\\]+|\\.)*")*)#.*$') > _re_comments = re.compile(r'^(([^#]+|\\.|"([^"\\]+|\\.)*")*)#.*$')

Re: Pathological regular expression

2009-04-09 Thread David Liang
On Apr 9, 2:56 am, David Liang wrote: > Hi all, > I'm having a weird problem with a regular expression (tested in 2.6 > and 3.0): > > Basically, any of these: > _re_comments = re.compile(r'^(([^\\]+|\\.|"([^"\\]+|\\.)*")*)#.*$') > _re_comments = re.compile(r'^(([^#]+|\\.|"([^"\\]+|\\.)*")*)#.*$')

Pathological regular expression

2009-04-09 Thread David Liang
Hi all, I'm having a weird problem with a regular expression (tested in 2.6 and 3.0): Basically, any of these: _re_comments = re.compile(r'^(([^\\]+|\\.|"([^"\\]+|\\.)*")*)#.*$') _re_comments = re.compile(r'^(([^#]+|\\.|"([^"\\]+|\\.)*")*)#.*$') _re_comments = re.compile(r'^(([^"]+|\\.|"([^"\\]+|\
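In the first pattern quoted above, the alternative `[^\\]+` carries its own `+`, sits inside a `*`-quantified group, and can also match `"` and `#`, so the same text can be divided up in many different ways. One possible rewrite, offered as an assumption rather than the fix the thread settled on, is to drop the inner `+` and make every alternative start with a different character. `_re_comments_safe` and `strip_comment` are names made up for this sketch.

```python
import re

# Assumed rewrite, not taken from the thread. The outer group repeats
# one of three mutually exclusive alternatives:
#   [^#"\\]            an ordinary character (not #, " or backslash)
#   \\.                a backslash escape
#   "(?:[^"\\]|\\.)*"  a complete double-quoted string
# The inner groups are non-capturing; only group 1 is used.
_re_comments_safe = re.compile(
    r'^((?:[^#"\\]|\\.|"(?:[^"\\]|\\.)*")*)#.*$'
)


def strip_comment(line):
    """Remove a trailing #-comment, leaving # inside quotes alone."""
    return _re_comments_safe.sub(r"\1", line)


print(repr(strip_comment('x = 1  # note')))
print(repr(strip_comment('s = "a # b"  # tail')))
print(repr(strip_comment('s = "a # b"')))
```

Because no position in the line can be consumed by more than one alternative, a failing match has only one way to divide the text to back out of, rather than an exponential number.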