On 2017-10-28 22:05, Guido van Rossum wrote:
On Sat, Oct 28, 2017 at 12:09 AM, Nick Coghlan <ncogh...@gmail.com> wrote:
On 28 October 2017 at 01:57, Guido van Rossum <gu...@python.org> wrote:
Oh. Yes, that is being discussed about once every year or two. It
seems Matthew isn't very interested in helping out with the
port, and there are some concerns about backwards
compatibility with the `re` module. I think it needs a champion!
Matthew's been amenable to the idea when it comes up, and he
explicitly wrote the module to be usable as a drop-in replacement
for "re" (hence the re-compatible v0 behaviour still being the
default).
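
For readers unfamiliar with the "drop-in replacement" usage mentioned above, here is a minimal sketch of the import fallback it allows. It assumes the third-party regex package may or may not be installed; the helper name normalize_whitespace is hypothetical and only there for illustration.

```python
# Prefer the third-party "regex" module when available, otherwise fall
# back to the standard library "re". With regex's default (version 0)
# behaviour, the same calls are intended to work under either import.
try:
    import regex as re  # third-party, installed via pip (assumption)
except ImportError:
    import re  # standard library fallback


def normalize_whitespace(text):
    """Collapse runs of whitespace to single spaces (hypothetical helper)."""
    return re.sub(r"\s+", " ", text).strip()


print(normalize_whitespace("  drop-in \t replacement\n"))
```

The rest of the script only refers to the name `re`, so it does not need to know which implementation was actually imported.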
The resistance has come more from our side, since this is a case
where existing regex module users are clearly better off if it
remains a separate project, as that keeps upgrades independent of
the relatively slow standard library release cycle (and allows it
to be used on Python 2.7 as well as in 3.x). By contrast, the
potential benefits of standard library inclusion accrue primarily
to Python newcomers and folks writing scripts without the benefit
of package management tools, since they'll have a more capable
regex engine available as part of the assumed language baseline.
That means that if we add regex to the standard library in the
regular way, there's a more than fair chance that we'll end up
with an outcome like the json vs simplejson split, where we have
one variant in the standard library, and another variant on PyPI,
and the variants may drift apart over time if their maintenance is
being handled by different people. (Note: one may argue that we
already have this split in the form of re vs regex. So if regex
was brought in specifically to replace _sre as the re module
implementation, rather than as a new public API, then we at least
wouldn't be making anything *worse* from a future behavioural
consistency perspective, but we'd be risking a compatibility break
for anyone depending on _sre and other internal implementation
details of the re module).
One potential alternative approach that is then brought up (often
by me) is to suggest instead *bundling* the regex module with
CPython, without actually bringing it fully within the regular
standard library maintenance process. The idea there would be to
both make the module available by default in python.org
downloads, *and* make it clear to
redistributors that the module is part of the expected baseline of
Python functionality, but otherwise keep it entirely in its
current independently upgradable form.
That would still be hard (since it would involve establishing new
maintenance policy precedents that go beyond the current
special-casing of `pip` in order to bootstrap PyPI access), but
would have the additional benefit of paving the way for doing
similar things with other modules where we'd like them to be part
of the assumed baseline for end users, but also have reasons for
wanting to avoid tightly coupling them to the standard library's
regular maintenance policy (most notably, requests).
And that's where discussions tend to fizzle out:
* outright replacement of the current re module implementation
with a private copy of the regex module introduces compatibility
risks that would need a fiat decision from you as BDFL to say
"Let's do it anyway, make sure the test suite still works, and
then figure out how to cope with any other consequences as they arise"
* going down the bundling path requires making some explicit
community management decisions around what we actually want the
standard library to *be* (and whether or not there's a difference
between "the standard library" and "the assumed available package
set" for Python installations that are expected to run arbitrary
third party scripts rather than specific applications)
* having both the current re API and implementation *and* a new
regex based API and implementation in the standard library
indefinitely seems like it would be a maintainability nightmare
that delivered the worst of all possible outcomes for everyone
involved (CPython maintainers, regex maintainers, Python end users)
Maybe it would be easier if Matthew were amenable to maintaining the
stdlib version and only adding new features to the PyPI version when
they've also been added to the stdlib version. IOW if he were
committed to *not* letting the [simple]json thing happen.
I don't condone having two different regex implementations/APIs
bundled in any form, even if one were to be deprecated -- we'd never
get rid of the deprecated one until 4.0. (FWIW I don't condone this
pattern for other packages/modules either.) Note that even if we
outright switched there would *still* be two versions, because regex
itself has an internal versioning scheme where V0 claims to be
strictly compatible with re and V1 explicitly changes the matching
rules in some cases. (I don't know if this means that you have to
request V1 to use \G though.)
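
As a rough illustration of the internal versioning described above, the sketch below opts in to V1 behaviour explicitly through the regex.V1 flag and uses a set-difference character class, which the regex documentation lists as a version 1 feature. It assumes the third-party regex package may not be installed and returns None in that case; the function name consonants is hypothetical. It does not attempt to answer the open question about \G.

```python
# Sketch: explicitly requesting version 1 behaviour from "regex".
try:
    import regex  # third-party; may be absent (assumption)
except ImportError:
    regex = None


def consonants(text):
    """Return lowercase consonants in text, or None if regex is missing.

    Uses the V1-only set difference syntax [[a-z]--[aeiou]].
    """
    if regex is None:
        return None
    return regex.findall(r"[[a-z]--[aeiou]]", text, flags=regex.V1)


print(consonants("hello"))
```

Without the V1 flag, the same pattern is parsed under the re-compatible rules, which is why the two behaviours would continue to coexist even after an outright switch.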
The other problem with outright replacement is that despite Matthew's
best efforts there may be subtle incompatibilities that will break
people's code in surprising ways. I don't recall much about our
current 're' test suite -- I'm sure it tests every feature, but I'm
not sure how far it goes in testing edge cases. IIRC this is where in
the past we've always erred on the side of (extreme) caution, and my
recollection is of Matthew being (understandably!) pretty lukewarm
about doing extra work to help assess this -- IIRC he's totally fine
with the status quo.
If there's new information or a change in Matthew's outlook I'd be
happy to reconsider it.
At one time I was in favour of including it in the stdlib, but then I
changed my mind. Having it outside gives me more flexibility, and I'm
happy with just using pip.
Not that I'm planning on making any further additions, just bug fixes
and updates to follow the Unicode updates. I think I've crammed enough
into it already. There's only so much you can do with the regex syntax
with its handful of metacharacters and possible escape sequences...
_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev