Ma Lin <malin...@163.com> added the comment:
For a capture group, the state->mark[] array stores its begin and end:

    begin: state->mark[(group_number-1)*2]
    end:   state->mark[(group_number-1)*2+1]

So state->mark[0] is the begin of the first capture group, and
state->mark[1] is the end of the first capture group.

    re.search(r'(ab|a)*?b', 'ab')

In this case, here is a simplified record of the actions:

    01 MARK 0
    02   "a": first "a" in the pattern [SUCCESS]
    03   BRANCH
    04     "b": first "b" in the pattern [SUCCESS]
    05 MARK 1
    06 "b": second "b" in the pattern [FAIL]
    07 try next (ab|a)*? [FAIL]
    08   MARK 0
    09   "a": first "a" in the pattern [FAIL]
    10 BRANCH: try next branch
    11   "": the second branch [SUCCESS]
    12 MARK 1
    13 "b": second "b" in the pattern [SUCCESS]

The MARK_PUSH(lastmark) macro didn't protect MARK-0 if it was the only
available mark, but the BRANCH op uses this macro to protect capture
groups before trying a branch.

So capture group 1 is [MARK-0 at line 08, MARK-1 at line 12), which is wrong.
The correct capture group 1 should be [MARK-0 at line 01, MARK-1 at line 12).

----------
versions: +Python 3.7, Python 3.8 -Python 3.5

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue35859>
_______________________________________
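
As a rough illustration of the mark indexing described in the comment, here is a minimal Python sketch. It is not the C code from the sre engine: the helper names mark_indices and group_span are hypothetical, and the marks are modelled as a flat list of string offsets purely for illustration.

```python
def mark_indices(group_number):
    """Return (begin_index, end_index) into a flat mark array.

    Hypothetical helper mirroring the formula from the comment:
    begin is (group_number-1)*2, end is (group_number-1)*2+1.
    """
    if group_number < 1:
        raise ValueError("capture groups are numbered from 1")
    begin = (group_number - 1) * 2
    return begin, begin + 1


def group_span(marks, group_number):
    """Look up the (begin, end) pair stored for a capture group.

    `marks` is an illustrative flat list standing in for state->mark[].
    """
    begin, end = mark_indices(group_number)
    return marks[begin], marks[end]


# Group 1 lives in slots 0 and 1, group 2 in slots 2 and 3.
print(mark_indices(1))  # (0, 1)
print(mark_indices(2))  # (2, 3)

# Illustrative marks only: group 1 covering offsets [0, 1).
print(group_span([0, 1], 1))  # (0, 1)
```

Because both marks of a group sit in adjacent slots, any op that saves and restores marks around a branch has to cover slot 0 as well, otherwise a stale begin for group 1 can survive backtracking.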