On Wed, 30 Mar 2022 at 10:08, Steven D'Aprano <st...@pearwood.info> wrote:
> Here's the version of grab I used:
>
> def grab(text, start, end):
>     a = text.index(start)
>     b = text.index(end, a+len(start))
>     return text[a+len(start):b]
>

This is where Python would benefit from an sscanf-style parser.
Instead of regexps, something this simple could be written like this:

[fruit] = sscanf(sample, "%*sfruit:%s\n")

It's simple left-to-right tokenization, so it's faster than a regex
(no backtracking), it's approximately as clear, and it doesn't require
playing with indexes and remembering to skip past len(start).
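
Purely as an illustration of the semantics I have in mind (the exact
behaviour of %s and %*s, and the sample text, are assumptions for this
sketch, not a worked-out design), something like this would do the job
with a single left-to-right scan of the text:

import re

def sscanf(text, fmt):
    # Sketch only: "%*s" skips ahead to the next literal in the format,
    # "%s" captures up to the next literal, and anything else must match
    # verbatim.  Not C-compatible sscanf.  (re is used only to break up
    # the format string; the text itself is scanned with index().)
    parts = re.split(r'(%\*?s)', fmt)
    results = []
    pos = 0
    for i, part in enumerate(parts):
        if part in ('%s', '%*s'):
            nxt = parts[i + 1] if i + 1 < len(parts) else ''
            end = text.index(nxt, pos) if nxt else len(text)
            if part == '%s':
                results.append(text[pos:end])
            pos = end
        elif part:
            if not text.startswith(part, pos):
                raise ValueError('format literal %r not found' % part)
            pos += len(part)
    return results

sample = "name: pear\nfruit: yes\ncolour: green\n"
[fruit] = sscanf(sample, "%*sfruit:%s\n")
print(fruit.strip())   # 'yes'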

That said, though - I do think the OP's task is better served by a
tokenization pass that transforms the string into something easier to
look things up in.
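
For instance (assuming, just for the sake of the sketch, that the input
is a series of "key: value" lines), one pass could turn the whole thing
into a dict, and every lookup after that is trivial:

def parse_fields(text):
    # One pass over the text: split each "key: value" line and store it,
    # so later lookups are plain dict accesses rather than repeated
    # string searches.
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(':')
        if sep:
            fields[key.strip()] = value.strip()
    return fields

sample = "name: pear\nfruit: yes\ncolour: green\n"
print(parse_fields(sample)['fruit'])   # 'yes'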

ChrisA