[issue16203] Proposal: add re.fullmatch() method
Changes by Serhiy Storchaka storch...@gmail.com: -- assignee: -> serhiy.storchaka ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue16203 ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue16203] Proposal: add re.fullmatch() method
Roundup Robot added the comment: New changeset 89dfa2671c83 by Serhiy Storchaka in branch 'default': Issue #16203: Add re.fullmatch() function and regex.fullmatch() method, http://hg.python.org/cpython/rev/89dfa2671c83
[issue16203] Proposal: add re.fullmatch() method
Serhiy Storchaka added the comment: Committed with an additional test, re.fullmatch('a+', 'ab'), which proves that the change for SRE_OP_REPEAT_ONE is needed. Thank you, Matthew, for your contribution.
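For readers following along, here is a minimal sketch of the behaviour that extra test checks. It assumes Python 3.4 or later, where re.fullmatch() is available; the variable names are only illustrative and are not taken from the committed test suite.

```python
import re

# 'a+' matches the prefix 'a' of 'ab', so match() succeeds.
prefix = re.match('a+', 'ab')

# The pattern cannot cover the trailing 'b', so fullmatch() returns None.
whole = re.fullmatch('a+', 'ab')

# When the string consists only of 'a' characters, fullmatch() succeeds.
ok = re.fullmatch('a+', 'aaa')
```

The interesting case is the second one: a single-character repeat has to be told that stopping early is not acceptable when a full match is requested.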
[issue16203] Proposal: add re.fullmatch() method
Changes by Serhiy Storchaka storch...@gmail.com:
resolution: -> fixed
stage: patch review -> committed/rejected
[issue16203] Proposal: add re.fullmatch() method
Changes by Serhiy Storchaka storch...@gmail.com: status: open -> closed
[issue16203] Proposal: add re.fullmatch() method
Serhiy Storchaka added the comment: Matthew, could you please answer my question?
[issue16203] Proposal: add re.fullmatch() method
Matthew Barnett added the comment: I don't know that it's not needed.
[issue16203] Proposal: add re.fullmatch() method
Changes by Serhiy Storchaka storch...@gmail.com: Added file: http://bugs.python.org/file32370/issue16203_mrab_3.patch
[issue16203] Proposal: add re.fullmatch() method
Changes by Serhiy Storchaka storch...@gmail.com: keywords: -easy
[issue16203] Proposal: add re.fullmatch() method
Serhiy Storchaka added the comment: I have updated the patch to the current tip, applied some changes from the review, and added some tests. Matthew, why is the change for SRE_OP_REPEAT_ONE needed? The tests pass without it.
[issue16203] Proposal: add re.fullmatch() method
Georg Brandl added the comment: I updated the patch to the current tip, fixed three issues from the review, and added documentation updates.
nosy: +georg.brandl
Added file: http://bugs.python.org/file32076/issue16203_mrab_2.patch
[issue16203] Proposal: add re.fullmatch() method
Roundup Robot added the comment: New changeset b51218966201 by Georg Brandl in branch 'default': Add re.fullmatch() function and regex.fullmatch() method, which anchor the http://hg.python.org/cpython/rev/b51218966201
nosy: +python-dev
resolution: -> fixed
stage: patch review -> committed/rejected
status: open -> closed
[issue16203] Proposal: add re.fullmatch() method
Georg Brandl added the comment: Sorry, accidental push, already reverted.
resolution: fixed ->
status: closed -> open
[issue16203] Proposal: add re.fullmatch() method
Changes by Nick Coghlan ncogh...@gmail.com: stage: committed/rejected -> patch review
[issue16203] Proposal: add re.fullmatch() method
Antoine Pitrou added the comment: Serhiy, sorry to ping you, but do you think you're gonna look at this?
[issue16203] Proposal: add re.fullmatch() method
Antoine Pitrou added the comment:
> 3 of the tests expect None when using 'fullmatch'; they won't return None when using 'match'.
Sorry, my bad. Like Serhiy, I can't comment on the changes to re internals, but we can trust you on this. The patch needs documentation, though.
[issue16203] Proposal: add re.fullmatch() method
Serhiy Storchaka added the comment: I can't comment right now, but I am going to inspect the re internals thoroughly. This is a new feature, and we have enough time before the 3.4 freeze.
[issue16203] Proposal: add re.fullmatch() method
Serhiy Storchaka added the comment: I have not analyzed the patch deeply; I only left first-impression comments on Rietveld. The documentation needs to be updated.
[issue16203] Proposal: add re.fullmatch() method
Antoine Pitrou added the comment: The patch doesn't seem to include failure cases for fullmatch (i.e. cases where fullmatch returns None where match or search would return a match).
[issue16203] Proposal: add re.fullmatch() method
Matthew Barnett added the comment: 3 of the tests expect None when using 'fullmatch'; they won't return None when using 'match'.
[issue16203] Proposal: add re.fullmatch() method
Matthew Barnett added the comment: I've attached a patch.
Added file: http://bugs.python.org/file28955/issue16203_mrab.patch
[issue16203] Proposal: add re.fullmatch() method
Antoine Pitrou added the comment: Thanks for the patch. While an internal flag may be a reasonable implementation strategy, IMHO a dedicated method still makes sense: it's simply more readable than passing a flag. As for the tests, they should probably exercise the interaction with re.MULTILINE - see MRAB's comment in msg172775.
stage: needs patch -> patch review
[issue16203] Proposal: add re.fullmatch() method
Ezra Berch added the comment: Patch attached. I've taken a slightly different approach than what has been discussed here: rather than define a new fullmatch() function and method, I've defined a new re.FULLMATCH flag for match(). So an example would be re.match('abc', 'abc', re.FULLMATCH). The implementation is basically what has been discussed here, except that it is done when the regular expression is compiled rather than at the user level.
keywords: +patch
nosy: +ezberch
Added file: http://bugs.python.org/file28140/issue16203.patch
[issue16203] Proposal: add re.fullmatch() method
Changes by Serhiy Storchaka storch...@gmail.com: stage: -> needs patch
[issue16203] Proposal: add re.fullmatch() method
Changes by Ezio Melotti ezio.melo...@gmail.com:
components: +Regular Expressions
nosy: +ezio.melotti
[issue16203] Proposal: add re.fullmatch() method
Antoine Pitrou added the comment: FWIW, I prefer fullmatch as well :)
[issue16203] Proposal: add re.fullmatch() method
Matthew Barnett added the comment: I'm about to add this to my regex implementation and, naturally, I want it to have the same name for compatibility. However, I'm not that keen on fullmatch and would prefer matchall instead. What do you think?
[issue16203] Proposal: add re.fullmatch() method
Tim Peters added the comment: I like matchall fine, but I can't channel Guido on names - he sometimes gets those wrong ;-)
[issue16203] Proposal: add re.fullmatch() method
Guido van Rossum added the comment: re.matchall() would appear to be related to re.findall(), which it isn't. The re2 package has a FullMatch method: http://code.google.com/p/re2/wiki/CplusplusAPI
[issue16203] Proposal: add re.fullmatch() method
Matthew Barnett added the comment: re2's FullMatch method contrasts with its PartialMatch method, which re doesn't have!
[issue16203] Proposal: add re.fullmatch() method
Guido van Rossum added the comment: But my other argument stands.
[issue16203] Proposal: add re.fullmatch() method
Matthew Barnett added the comment: OK, in order to avoid bikeshedding, fullmatch it is.
[issue16203] Proposal: add re.fullmatch() method
Matthew Barnett added the comment: Tim, my point is that if the MULTILINE flag happens to be turned on, '$' won't just match at the end of the string (or slice), it'll also match at a newline, so wrapping the pattern in (?:...)$ in that case could give the wrong answer, but wrapping it in (?:...)\Z would give the right answer.
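The difference between the two wrappers can be sketched as follows. This is only an illustration under stated assumptions: wrap_dollar and wrap_z are hypothetical helper names that do not appear anywhere in the patch, and re.fullmatch() requires Python 3.4 or later.

```python
import re

def wrap_dollar(pattern, string, flags=0):
    # Hypothetical helper: emulate a full match by appending '$'.
    return re.match('(?:' + pattern + ')$', string, flags)

def wrap_z(pattern, string, flags=0):
    # Hypothetical helper: emulate a full match by appending r'\Z'.
    return re.match('(?:' + pattern + r')\Z', string, flags)

text = 'abc\ndef'

# With MULTILINE, '$' also matches just before the newline, so this
# wrapper reports a match although only the first line was consumed.
dollar = wrap_dollar('abc', text, re.MULTILINE)

# r'\Z' matches only at the very end of the string, so this returns None.
z = wrap_z('abc', text, re.MULTILINE)

# The built-in function also returns None for this input.
builtin = re.fullmatch('abc', text, re.MULTILINE)
```

So for this input the '$' wrapper gives a false positive, while the r'\Z' wrapper agrees with re.fullmatch().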
[issue16203] Proposal: add re.fullmatch() method
Antoine Pitrou added the comment:
> Tim, my point is that if the MULTILINE flag happens to be turned on, '$' won't just match at the end of the string (or slice), it'll also match at a newline, so wrapping the pattern in (?:...)$ in that case could give the wrong answer, but wrapping it in (?:...)\Z would give the right answer.
This means Tim and Guido are right that a dedicated fullmatch() method is desirable.
[issue16203] Proposal: add re.fullmatch() method
Serhiy Storchaka added the comment: This is definitely not an easy issue.
[issue16203] Proposal: add re.fullmatch() method
Tim Peters added the comment: Serhiy, I expect this is easy to implement _inside_ the regexp engine. The complications come from trying to do it outside the engine. But even there, wrapping the original regexp re in (?:re)\Z is at worst very close. The only insecurity with that I've thought of concerns the doc's warnings about what can appear before an inline re.VERBOSE flag. It probably works fine even if re does begin with (?...x).
[issue16203] Proposal: add re.fullmatch() method
Matthew Barnett added the comment: It certainly appears to ignore the whitespace, even if the (?x) is at the end of the pattern or in the middle of a group. Another point we need to consider is that the user might want to use a pre-compiled pattern.
[issue16203] Proposal: add re.fullmatch() method
Matthew Barnett added the comment: '$' will match at the end of the string or just before the final '\n':

    >>> re.match(r'abc$', 'abc\n')
    <_sre.SRE_Match object at 0x00F15448>

So shouldn't you be using r'\Z' instead?

    >>> re.match(r'abc\Z', 'abc')
    <_sre.SRE_Match object at 0x00F15410>
    >>> re.match(r'abc\Z', 'abc\n')

And what happens if the MULTILINE flag is turned on?

    >>> re.match(r'abc$', 'abc\ndef', flags=re.MULTILINE)
    <_sre.SRE_Match object at 0x00F15448>
    >>> re.match(r'abc\Z', 'abc\ndef', flags=re.MULTILINE)

nosy: +mrabarnett
[issue16203] Proposal: add re.fullmatch() method
Tim Peters added the comment: Matthew, Guido wrote "check that the whole input string matches" (or slice if pos and (possibly also) endpos is/are given). So, yes, \Z is more to the point than $ if people want to continue wasting time trying to implement this as a Python-level function ;-) I don't understand what you're asking about MULTILINE. What's the issue there? Focus on Guido's "whole input string matches", not on his motivational talk about a regex ending in $. $ and/or \Z aren't the point here; "whole input string matches" is the point.
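To illustrate the "or slice" part, here is a small sketch using the fullmatch() method of a compiled pattern as it exists in Python 3.4 and later, with the optional pos and endpos arguments. The sample pattern and text are invented for the example.

```python
import re

pattern = re.compile(r'\d+')
text = 'id=12345;'

# Without bounds the whole string must match, so this returns None.
whole = pattern.fullmatch(text)

# With pos=3 and endpos=8 only the slice text[3:8] == '12345' has to
# match in its entirety.
sliced = pattern.fullmatch(text, 3, 8)
```

The returned match object still reports positions relative to the original string, so sliced.span() is (3, 8).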
[issue16203] Proposal: add re.fullmatch() method
New submission from Guido van Rossum: I've noticed a subtle bug in some of our internal code. Someone wants to ensure that a certain string (e.g. a URL path) matches a certain pattern in its entirety. They use re.match() with a regex ending in $. Fine. Now someone else comes along and modifies the pattern. Somehow the $ gets lost, or the pattern develops a set of toplevel choices that don't all end in $. And now things that have a valid *prefix* suddenly (and unintentionally) start matching.

I think this is a common enough issue and propose a new API: a fullmatch() function and method that work just like the existing match() function and method but also check that the whole input string matches. This can be implemented slightly awkwardly as follows in user code:

    def fullmatch(regex, input, flags=0):
        m = re.match(regex, input, flags)
        if m is not None and m.end() == len(input):
            return m
        return None

(The corresponding method will have to be somewhat more complex because the underlying match() method takes optional pos and endpos arguments.)

keywords: easy
messages: 172695
nosy: gvanrossum
priority: normal
severity: normal
status: open
title: Proposal: add re.fullmatch() method
type: enhancement
versions: Python 3.4
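The failure mode described in this submission can be sketched as below. The URL-path pattern is purely hypothetical (it is not the internal code referred to in the report), and re.fullmatch() assumes Python 3.4 or later.

```python
import re

# Hypothetical URL-path pattern after an edit: the second top-level
# alternative was added without its trailing '$'.
edited = r'/users/\d+$|/groups/\d+'

# match() now accepts a string that merely has a valid prefix.
leaky = re.match(edited, '/groups/7/delete')

# With fullmatch() no '$' is needed in the pattern, so there is nothing
# to lose when alternatives are added later.
strict = re.fullmatch(r'/users/\d+|/groups/\d+', '/groups/7/delete')
valid = re.fullmatch(r'/users/\d+|/groups/\d+', '/groups/7')
```

Here leaky matches the prefix '/groups/7', while strict is None and valid covers the whole string.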
[issue16203] Proposal: add re.fullmatch() method
Tim Peters added the comment: +1. Note that this really can't be done in user-level code. For example, consider matching the pattern 'a|ab' against the string 'ab'. Without being _forced_ to consider the 'ab' branch, the regexp will match just the 'a' branch. So, e.g., the example code you posted will say "nope, it didn't match (the whole thing)".
nosy: +tim_one
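A short sketch of this counterexample follows. The helper reproduces the user-level code from the original submission under the illustrative name fullmatch_by_length; the comparison with re.fullmatch() assumes Python 3.4 or later.

```python
import re

def fullmatch_by_length(regex, string, flags=0):
    # User-level approach from the original proposal: run match(), then
    # check that the match ends at the end of the string.
    m = re.match(regex, string, flags)
    if m is not None and m.end() == len(string):
        return m
    return None

# match() is satisfied by the first alternative 'a' and never tries 'ab',
# so the length check fails and the helper returns None.
naive = fullmatch_by_length('a|ab', 'ab')

# The engine-level implementation backtracks into the 'ab' branch.
builtin = re.fullmatch('a|ab', 'ab')
```

The helper rejects a string that the pattern can in fact match in full, which is why the check has to happen inside the engine.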
[issue16203] Proposal: add re.fullmatch() method
Serhiy Storchaka added the comment: What will happen with non-greedy qualifiers? Should '.*?' fully match any string?

    >>> re.match('.*?$', 'abc').group()
    'abc'
    >>> re.match('.*?', 'abc').group()
    ''

nosy: +serhiy.storchaka
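For reference, this is how the question plays out with re.fullmatch() as available in Python 3.4 and later; a minimal sketch, not taken from the patch's tests.

```python
import re

# A non-greedy '.*?' is satisfied by the empty prefix under match().
lazy_prefix = re.match('.*?', 'abc')

# Under fullmatch() the engine keeps extending the non-greedy repeat
# until the whole string is consumed.
lazy_full = re.fullmatch('.*?', 'abc')
```

In other words, a non-greedy qualifier stays minimal only as far as the full-match requirement allows.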
[issue16203] Proposal: add re.fullmatch() method
Antoine Pitrou added the comment:
> Note that this really can't be done in user-level code.
Well, how about:

    def fullmatch(regex, input, flags=0):
        return re.match("(?:" + regex + ")$", input, flags)

nosy: +pitrou
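A quick sketch of how this wrapper behaves, with the non-capturing group spelled '(?:'. The name fullmatch_wrapped is illustrative only, and the comparison with re.fullmatch() assumes Python 3.4 or later.

```python
import re

def fullmatch_wrapped(regex, string, flags=0):
    # Wrap the pattern in a non-capturing group and anchor it with '$'.
    return re.match('(?:' + regex + ')$', string, flags)

# The anchor forces backtracking into the 'ab' branch, so the
# alternation case is handled.
alt = fullmatch_wrapped('a|ab', 'ab')

# '$' also matches just before a trailing newline, so the wrapper
# accepts a string that is not matched in its entirety.
newline = fullmatch_wrapped('abc', 'abc\n')

# The built-in function rejects the same input.
builtin = re.fullmatch('abc', 'abc\n')
```

The wrapper fixes the alternation problem but still differs from a true full match on strings that end with a newline.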
[issue16203] Proposal: add re.fullmatch() method
Tim Peters added the comment: Antoine, that's certainly the conceptual intent here. Can't say whether your attempt works in all cases. The docs don't guarantee it. For example, if the original regexp started with (?x), the docs explicitly say the effect of (?x) is undefined if there are non-whitespace characters before the [inline (?x)] flag. Sure, you could parse the regexp in user code too, and move an initial (?...x...) before your non-capturing group. For that matter, you could write your own regexp engine in user code too ;-) The point is that it should be easy for the regexp engine to implement the desired functionality - and user attempts to fake it have pitfalls (even Guido didn't get it right - LOL ;-) ).