[issue19055] Regular expressions: * does not match as many repetitions as possible.

2013-09-20 Thread Jason Stumpf

Jason Stumpf added the comment:

I like that clearer description.  "As produce matches" is more correct than
"as possible".

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue19055
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue19055] Regular expressions: * does not match as many repetitions as possible.

2013-09-19 Thread Jason Stumpf

New submission from Jason Stumpf:

>>> re.match('(a|ab)*',('aba')).group(0)
'a'

According to the documentation, the * should match as many repetitions as
possible.  Two repetitions are possible, but it matches only one.

Reversing the order of the operands of | changes the behaviour.

>>> re.match('(ab|a)*',('aba')).group(0)
'aba'
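
As a self-contained script, the two calls above can be reproduced roughly as
follows. This is only a sketch of the report; the variable names are
illustrative and not part of the original submission.

```python
import re

text = 'aba'

# Same pattern as in the report: 'a' is listed before 'ab'.
first = re.match('(a|ab)*', text)

# Operands of | swapped: 'ab' is listed before 'a'.
second = re.match('(ab|a)*', text)

print(repr(first.group(0)), first.span())
print(repr(second.group(0)), second.span())
```

The first call stops after one repetition and the second consumes the whole
string, which is the difference the report is about.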

--
messages: 198116
nosy: Jason.Stumpf
priority: normal
severity: normal
status: open
title: Regular expressions: * does not match as many repetitions as possible.
type: behavior
versions: Python 2.7




[issue19055] Regular expressions: * does not match as many repetitions as possible.

2013-09-19 Thread Jason Stumpf

Changes by Jason Stumpf jstu...@google.com:


--
components: +Regular Expressions
nosy: +ezio.melotti, mrabarnett




[issue19055] Regular expressions: * does not match as many repetitions as possible.

2013-09-19 Thread Jason Stumpf

Jason Stumpf added the comment:

Even taking the documentation for | into account, the documentation for * is
wrong.

>>> re.match('(a|ab)*c',('abac')).group(0)
'abac'

From the doc: "In general, if a string p matches A and another string q
matches B, the string pq will match AB."

Since '(a|ab)*c' matches 'abac', and 'c' matches 'c', that means '(a|ab)*' 
matches 'aba'.  It does so with 2 repetitions.  Thus, in the example from my 
initial post, it was not matching with as many repetitions as possible.

I think what you mean is that * attempts to match again after each match of the 
preceding regular expression.
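
A short check of how the repetitions are distributed in the 'abac' case,
written as a sketch: it relies only on the standard re module, and the
variable names are illustrative. A repeated group reports only the text of its
last repetition.

```python
import re

with_c = re.match('(a|ab)*c', 'abac')
without_c = re.match('(a|ab)*', 'aba')

# Whole match when a 'c' has to follow the repeated group.
print(repr(with_c.group(0)))

# Text and position of the last repetition of the group: the star has
# consumed 'ab' and then 'a' before the final 'c'.
print(repr(with_c.group(1)), with_c.span(1))

# Without the trailing 'c', the same sub-pattern stops after one 'a'.
print(repr(without_c.group(0)), without_c.span())
```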

--




[issue19055] Regular expressions: * does not match as many repetitions as possible.

2013-09-19 Thread Jason Stumpf

Jason Stumpf added the comment:

Sorry, that implication was backwards.  I don't think I can prove from just the 
documentation that '(a|ab)*' can match 'aba' in certain contexts.

If the docs said: "* attempts to match again after each match of the preceding
regular expression", I think it would describe the observed behaviour.
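
To make that wording concrete, here is a deliberately simplified model of it.
This is an assumption-laden sketch, not the re module's actual implementation:
match_star is a hypothetical helper that handles only literal alternatives
inside the group and a literal suffix after the star.

```python
import re


def match_star(alternatives, text, pos=0, suffix=''):
    """Hypothetical model of (alt1|alt2|...)* followed by a literal suffix.

    Returns the end index of the match starting at pos, or None.
    """
    # After each successful repetition, attempt another one first,
    # trying the alternatives in the order they are written.
    for alt in alternatives:
        if alt and text.startswith(alt, pos):
            end = match_star(alternatives, text, pos + len(alt), suffix)
            if end is not None:
                return end
    # No further repetition leads to an overall match, so stop repeating
    # here and try to match whatever follows the star.
    if text.startswith(suffix, pos):
        return pos + len(suffix)
    return None


cases = [
    (['a', 'ab'], 'aba', ''),
    (['ab', 'a'], 'aba', ''),
    (['a', 'ab'], 'abac', 'c'),
]
for alts, text, suffix in cases:
    pattern = '(' + '|'.join(alts) + ')*' + suffix
    # Compare the model's end index with what the re module reports.
    print(pattern, match_star(alts, text, suffix=suffix),
          re.match(pattern, text).end())
```

For the three patterns discussed in this issue, the model and re.match agree
on where the match ends; nothing is claimed here about other patterns.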

--




[issue19055] Regular expressions: * does not match as many repetitions as possible.

2013-09-19 Thread Jason Stumpf

Jason Stumpf added the comment:

I understand what's happening, but that is not what the documentation 
describes.  If the behaviour is correct, the documentation is incorrect.

--
