On 2022-02-16 22:13, Tim Peters wrote:
[J.B. Langston <jblangs...@datastax.com>]
Well, I certainly sparked a lot of interesting discussion, which I have
quite enjoyed reading. But to bring this thread back around to its
original topic, is there support among the Python maintainers for
adding a timeout feature to the Python re library?
Buried in the fun discussion was my guess: no way. Python's re is
effectively dead legacy code, with no current "owner". Its commit
history shows very little activity for some years already. Most
commits are due to generic "code cleanup" crusades that have nothing
specific to do with the algorithms. None required non-trivial
knowledge of the implementation.
Here's the most recent I found that actually changed behavior:
"""
commit 6cc8ac949907b9a1c0f73709c6978b7a43e634e3
Author: Zackery Spytz <zsp...@gmail.com>
Date: Fri May 21 14:02:42 2021 -0700
bpo-40736: Improve the error message for re.search() TypeError (GH-23312)
Include the invalid type in the error message.
"""
A trivial change.
I will look at the third-party regex library that Jonathan suggested but
I still believe a timeout option would be a valuable feature to have
in the standard library.
Which is the problem: regex has _dozens_ of features that would be
valuable to have in the standard library. regex is in fact one of the
best regexp libraries on the planet. It already has timeouts, and
other features (like possessive quantifiers) that are actually (unlike
timeouts) frequently asked about by many programmers.
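
For anyone following along, here is a minimal sketch of what that timeout looks like from the caller's side. It assumes the third-party regex package is installed (pip install regex). The helper name search_with_timeout is made up for illustration and is not part of either library. If regex is missing, the sketch falls back to the standard re module, which has no timeout, so the limit is simply not enforced there.

```python
try:
    import regex as engine  # third-party package; its functions accept timeout=
    HAS_TIMEOUT = True
except ImportError:
    import re as engine  # stdlib fallback; no timeout support
    HAS_TIMEOUT = False


def search_with_timeout(pattern, text, seconds=1.0):
    """Hypothetical helper: return (match_or_None, timed_out)."""
    # Only regex understands timeout=; passing it to re.search raises TypeError.
    kwargs = {"timeout": seconds} if HAS_TIMEOUT else {}
    try:
        return engine.search(pattern, text, **kwargs), False
    except TimeoutError:
        # regex raises the built-in TimeoutError when the limit is exceeded.
        return None, True


match, timed_out = search_with_timeout(r"\d+", "abc 123")
print(match.group(), timed_out)
```

Returning a (match, timed_out) pair keeps "no match" and "gave up" distinguishable, which matters when the pattern or the input comes from an untrusted source.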
In fact regex started life intending to go into core Python, in 2008:
https://bugs.python.org/issue3825
That stretched on and on, and the endless bikeshedding eventually
appeared to fizzle out in 2014:
https://bugs.python.org/issue2636
In 2021 a core dev eventually rejected it, as by then MRAB had long
since released it as a successful extension module. I assume - but
don't know - he got burned out by "the endless bikeshedding" on those
issue reports.
I eventually decided against having it added to the standard library
because that would tie fixes and additions to Python's release cycle,
and there's that adage that Python has "batteries included", but not
nuclear reactors. PyPI is a better place for it, for those who need more
than what the standard re module provides.
In any case, no, no core dev I know of is going to devote their
limited time to reproducing a tiny subset of regex's many improvements
in Python's legacy engine. In fact, "install regex!" is such an
obvious choice at this point that I wouldn't even give time to just
reviewing a patch that added timeouts.
BTW, I didn't mention regex in your BPO report because I didn't know
at the time it already implemented timeouts. I learned that in this
thread.
_______________________________________________
Python-ideas mailing list -- python-ideas@python.org
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/5K3XWIY7YK4RUMIZGYWNETB3N74PTLPZ/