[issue14460] In re's positive lookbehind assertion repetition works
Changes by Serhiy Storchaka storch...@gmail.com:

--
resolution:  -> not a bug
stage:  -> resolved
status: open -> closed

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue14460
___
___
Python-bugs-list mailing list
Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue14460] In re's positive lookbehind assertion repetition works
Serhiy Storchaka added the comment:

Technically this is not a bug.

--
nosy: +serhiy.storchaka
[issue14460] In re's positive lookbehind assertion repetition works
Matthew Barnett added the comment:

Lookarounds can contain capture groups:

>>> import re
>>> re.search(r'a(?=(.))', 'ab').groups()
('b',)
>>> re.search(r'(?<=(.))b', 'ab').groups()
('a',)

so lookarounds that are optional or can have no repeats might have a use.

I'm not sure whether it's useful to repeat them more than once, but that's another matter.

I'd say that it's not a bug.
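A minimal sketch of the "optional" case mentioned above, assuming a lookbehind of the form (?<=...) with a capture group inside it. The pattern and the sample strings are illustrative only and are not taken from the thread. When the assertion succeeds, the group records what preceded the match; when it is skipped, the group stays unset.

```python
import re

# Hypothetical pattern: an optional lookbehind that captures the
# character before 'b' only when that character is 'a'.
pattern = re.compile(r'(?<=(a))?b')

# 'b' is preceded by 'a': the lookbehind succeeds and group 1 is set.
with_a = pattern.search('ab')

# 'b' is preceded by 'x': the lookbehind is skipped (zero repeats),
# the match still succeeds, and group 1 stays unset.
without_a = pattern.search('xb')

print(with_a.group(), with_a.groups())
print(without_a.group(), without_a.groups())
```

In both cases only the 'b' is part of the match; the group tells the caller which branch was taken.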
[issue14460] In re's positive lookbehind assertion repetition works
py.user added the comment:

>>> m = re.search(r'(?<=(a)){10}bc', 'abc', re.DEBUG)
max_repeat 10 10
  assert -1
    subpattern 1
      literal 97
literal 98
literal 99
>>> m.group()
'bc'
>>> m.groups()
('a',)
>>>

It works like there are 10 letters a before letter b.
[issue14460] In re's positive lookbehind assertion repetition works
Matthew Barnett added the comment:

Lookarounds can capture, but they don't consume. That lookbehind is matching the same part of the string every time.
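The difference between capturing and consuming can be sketched as follows, assuming the lookbehind spelling (?<=(a)) for the reported pattern. The variable names are illustrative. A consuming group makes 'a' part of the match, while the lookbehind only inspects it, whether it is written once or repeated ten times.

```python
import re

text = 'abc'

# Consuming group: 'a' is part of the overall match.
consuming = re.search(r'(a)bc', text)

# Lookbehind: 'a' is captured but not consumed.
lookbehind = re.search(r'(?<=(a))bc', text)

# Repeated lookbehind: the same 'a' is inspected on every repetition.
repeated = re.search(r'(?<=(a)){10}bc', text)

for name, m in (('consuming', consuming),
                ('lookbehind', lookbehind),
                ('repeated', repeated)):
    # span() is the whole match; span(1) is where group 1 captured.
    print(name, m.group(), m.span(), m.span(1))
```

The two lookbehind variants report the same overall span and the same capture position, which is the sense in which the repetition matches the same part of the string every time.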
[issue14460] In re's positive lookbehind assertion repetition works
Tim Peters added the comment:

I would not call this a bug - it's just usually a silly thing to do ;-)

Note, e.g., that p{N} is shorthand for writing p N times.  For example, p{4} is much the same as pppp (but not exactly so in all cases; e.g., if `p` happens to contain a capturing group, the numbering of all capturing groups will differ between those two spellings).

A successful assertion generally matches an empty string (does not advance the position being looked at in the target string).  So, e.g., if we're at some point in the target string where (?<=a) matches, then (?<=a)(?<=a) will also match at the same point, and so will (?<=a)(?<=a)(?<=a) and (?<=a)(?<=a)(?<=a)(?<=a) and so on.  The position in the target string never changes, so each redundant assertion succeeds too.  So (?<=a){N} _should_ match there too.

> It works like there are 10 letters a before letter b.

It's much more like you're asking whether a appears before b, but are rather pointlessly asking the same question 10 times ;-)

--
nosy: +tim.peters
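The equivalence described above can be checked with a short sketch, assuming the lookbehind spelling (?<=a). The helper span_of is hypothetical and exists only for this illustration. One assertion, four spelled-out assertions, and the quantified form should all agree on where they match and where they do not.

```python
import re

single = r'(?<=a)bc'
spelled_out = r'(?<=a)(?<=a)(?<=a)(?<=a)bc'
quantified = r'(?<=a){4}bc'


def span_of(pattern, text):
    """Return the span of the first match, or None if nothing matches."""
    m = re.search(pattern, text)
    return m.span() if m else None


for text in ('abc', 'xbc'):
    # Each row lists the result for the three spellings side by side.
    print(text, [span_of(p, text) for p in (single, spelled_out, quantified)])
```

The three spellings are interchangeable here because none of them contains a capturing group, so the group-numbering caveat does not apply.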
[issue14460] In re's positive lookbehind assertion repetition works
py.user added the comment:

Tim Peters wrote:
> (?<=a)(?<=a)(?<=a)(?<=a)

There are four different points.
If a1 before a2 and a2 before a3 and a3 before a4 and a4 before something.

Otherwise repetition of assertion has no sense. If it has no sense, there should be an exception.
[issue14460] In re's positive lookbehind assertion repetition works
Tim Peters added the comment:

>> (?<=a)(?<=a)(?<=a)(?<=a)
> There are four different points.
> If a1 before a2 and a2 before a3 and a3 before a4 and a4
> before something.

Sorry, that view doesn't make any sense.  A successful lookbehind assertion matches the empty string.  Same as the regexp

()()()()

matches 4 empty strings (and all the _same_ empty string) at any point.

> Otherwise repetition of assertion has no sense.

As I said before, it's usually a silly thing to do.  It does make sense, just not _useful_ sense - it's silly ;-)

> If it has no sense, there should be an exception.

Why?  Code like

i += 0

is usually pointless too, but it's not up to a programming language to force you to code only useful things.

It's easy to write regexps that are pointless.  For example, the regexp

(?=a)b

can never succeed.  Should that raise an exception?  Or should the regexp

(?=a)a

raise an exception because the (?=a) part is redundant?  Etc.
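The three "pointless but legal" patterns named above can be exercised with a small sketch. The sample strings are arbitrary and chosen only for illustration. None of the patterns raises an exception: the first never matches, the second behaves like a plain a, and the third captures the same empty string four times.

```python
import re

samples = ['', 'a', 'b', 'ab', 'ba', 'aab', 'abab']

# (?=a)b requires the next character to be both 'a' and 'b',
# so it cannot match anywhere.
never = [re.search(r'(?=a)b', s) for s in samples]

# (?=a)a matches exactly where a plain 'a' matches.
redundant_same = all(
    [m.span() for m in re.finditer(r'(?=a)a', s)]
    == [m.span() for m in re.finditer(r'a', s)]
    for s in samples
)

# ()()()() matches four empty strings at the same position.
m = re.search(r'()()()()', 'abc')
empty_spans = [m.span(i) for i in range(1, 5)]

print(never)
print(redundant_same)
print(empty_spans)
```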
[issue14460] In re's positive lookbehind assertion repetition works
Tim Peters added the comment:

BTW, note that the idea that "successful lookaround assertions match an empty string" isn't just a figure of speech: it's the literal truth, and - indeed - is key to understanding what happens here.  You can see this by adding some capturing groups around the assertions.  Like so:

m = re.search("((?<=a))((?<=a))((?<=a))((?<=a))b", "xab")

Then

[m.span(i) for i in range(1, 5)]

produces

[(2, 2), (2, 2), (2, 2), (2, 2)]

That is, each assertion matched (the same) empty string immediately preceding b in the target string.

This makes perfect sense - although it may not be useful.  So I think this report should be closed with "so if it bothers you, don't do it" ;-)
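The same observation can be sketched for the quantified spelling, assuming a lookbehind wrapped in a single capturing group that is then repeated with {4}. This variant is not from the thread. The group is overwritten on each repetition, but every repetition captures the same empty string just before b.

```python
import re

# One capturing group around the lookbehind, repeated four times.
m = re.search(r'((?<=a)){4}b', 'xab')

# Span of the last capture made by group 1 (an empty string).
group_span = m.span(1)

# Span of the whole match (just the 'b').
match_span = m.span()

print(group_span, match_span)
```

A group span whose start and end are equal is an empty capture; here it sits at the position where the overall match begins.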
[issue14460] In re's positive lookbehind assertion repetition works
py.user added the comment:

Tim Peters wrote:
> Should that raise an exception?
> i += 0
> (?=a)b
> (?=a)a

These are another cases. The first is very special. The second and third are special too, but with different contents of assertion they can do useful work. While (?<=any contents){N}a never uses the {N} part in any useful manner.

> So I think this report should be closed

I looked into Perl behaviour today, it works like Python. It's not an error there.
[issue14460] In re's positive lookbehind assertion repetition works
Mark Lawrence added the comment:

Can someone comment on this regex problem please, they're just not my cup of tea.

--
nosy: +BreamoreBoy
versions: +Python 2.7, Python 3.4, Python 3.5 -Python 3.2
[issue14460] In re's positive lookbehind assertion repetition works
New submission from py.user port...@yandex.ru:

>>> import re
>>> re.search(r'(?<=a){100,200}bc', 'abc', re.DEBUG)
max_repeat 100 200
  assert -1
    literal 97
literal 98
literal 99
<_sre.SRE_Match object at 0xb7429f38>
>>> re.search(r'(?<=a){100,200}bc', 'abc', re.DEBUG).group()
'bc'
>>>

I expected "nothing to repeat"

--
components: Regular Expressions
messages: 157264
nosy: ezio.melotti, mrabarnett, py.user
priority: normal
severity: normal
status: open
title: In re's positive lookbehind assertion repetition works
type: behavior
versions: Python 3.2
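For contrast, here is a sketch of a pattern that does raise the "nothing to repeat" error the report expected. The helper compiles is hypothetical, and the pattern *a is chosen only for illustration. A quantifier with no preceding item is rejected at compile time, while a quantifier applied to a lookbehind is accepted.

```python
import re


def compiles(pattern):
    """Return True if the pattern compiles, False if re.error is raised."""
    try:
        re.compile(pattern)
    except re.error as exc:
        # Show the compiler's message for the rejected pattern.
        print(repr(pattern), '->', exc)
        return False
    return True


# A quantifier with nothing before it is rejected by the parser.
leading_star_ok = compiles(r'*a')

# A quantified lookbehind, as in the report, is accepted.
repeated_lookbehind_ok = compiles(r'(?<=a){100,200}bc')

print(leading_star_ok, repeated_lookbehind_ok)
```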