On 2017-10-29 12:27, Serhiy Storchaka wrote:
27.10.17 18:35, Guido van Rossum wrote:
The "why" question is not very interesting -- it probably wasn't in PCRE and nobody was familiar with it when we moved off PCRE (maybe it wasn't even in Perl at the time -- it was ~15 years ago).

I didn't understand your description of \G so I googled it and found a helpful StackOverflow article: https://stackoverflow.com/questions/21971701/when-is-g-useful-application-in-a-regex. From this I understand that when using e.g. findall() it forces successive matches to be adjacent.

This looks too Perlish to me. In Perl, regular expressions are part of
the language syntax; they can even contain Perl expressions. Arguments
are passed to them implicitly (as they are to Perl's analogs of
str.strip() and str.split()), and results are saved in special global
variables. Loops can also be implicit.

It seems to me that \G makes sense only for re.findall() and
re.finditer(), not for re.match(), re.search() or re.split().

In Python all this is explicit. Compiled regular expressions are
objects, and you can pass start and end positions to Pattern.match().
The Python equivalent of \G looks to me like:

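import re

p = re.compile(...)
i = 0
while True:
    m = p.match(s, i)  # anchored at position i, as \G would be
    if not m:
        break
    ...
    i = m.end()  # the next match must start where this one ended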


You're correct. \G matches at the start position, so .search(r'\G\w+') behaves the same as .match(r'\w+').

findall and finditer perform a series of searches, but with \G at the start they'll perform a series of matches, each anchored at where the previous one ended.
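
To make the difference concrete, here is a minimal sketch in today's Python, without \G. The helper name adjacent_matches and the sample string are made up for illustration; the loop is the same explicit Pattern.match() loop as above, and it simply stops after an empty match rather than trying to step past it.

```python
import re

def adjacent_matches(pattern, s, pos=0):
    """Yield matches of `pattern` in `s`, each anchored where the previous one ended.

    Hypothetical helper for illustration; not part of the re module.
    """
    p = re.compile(pattern)
    while True:
        m = p.match(s, pos)
        if m is None:
            break
        yield m
        if m.end() == pos:
            # An empty match would never advance; stop rather than loop forever.
            break
        pos = m.end()

s = "ab12cd"
# A series of searches skips over the digits and keeps finding letters.
searched = re.findall(r"[a-z]", s)
# A series of anchored matches stops at the first position that does not match.
anchored = [m.group() for m in adjacent_matches(r"[a-z]", s)]
print(searched)
print(anchored)
```

The series of searches finds all four letters, while the anchored series stops after "ab" because the match attempted at the "1" fails.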

One can also use the undocumented Pattern.scanner() method. In fact,
Pattern.finditer() is implemented as iter(Pattern.scanner().search), so
iter(Pattern.scanner().match) would return an iterator of adjacent matches.
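
A rough sketch of the scanner approach, assuming CPython, where Pattern.scanner(string) returns an object with match() and search() methods. The method is undocumented, so this may not hold on other implementations or in future versions. The pattern and sample string are made up for illustration, and the two-argument form of iter() with a None sentinel is used to stop at the first failed attempt.

```python
import re

p = re.compile(r"\d+,?")
s = "1,22,333 x 4,5"

# Repeated search(): each call scans forward from where the last match ended.
searched = [m.group() for m in iter(p.scanner(s).search, None)]

# Repeated match(): each call must match exactly where the last match ended,
# so iteration stops at the space after "333".
adjacent = [m.group() for m in iter(p.scanner(s).match, None)]

print(searched)
print(adjacent)
```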

I think it would be more Pythonic (and much easier) to add a boolean
parameter to finditer() and findall() than to introduce a \G operator.

_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev