Regarding the performance difference between "re" and "regex" and the related
packaging options: we ran a performance comparison using Python 3.6.0 on some
micro-benchmarks from the Python Benchmark Suite
(https://github.com/python/performance).
Results in ms, lower is better (running on Ubuntu 15.10). The "regex" column
was obtained via "pip install regex" and a replacement of "import re" with
"import regex as re".

                        re      regex
bm_regex_compile.py     229     298
bm_regex_dna.py         171     267
bm_regex_effbot.py      2.77    3.04
bm_regex_v8.py          24.8    14.1
This data shows that "re" outperforms "regex" in 3 of the 4 micro-benchmarks
above; only bm_regex_v8.py runs faster with "regex".
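For anyone who wants to try a quick side-by-side comparison without the full
benchmark suite, here is a minimal sketch. It is not one of the bm_regex_*
benchmarks above: the pattern, the sample text, and the helper names
(time_module, compare) are made up for illustration, and "regex" is simply
skipped if it is not installed. Numbers from a toy like this should not be
read as a substitute for the suite results.

```python
import importlib
import timeit

# Hypothetical pattern and input, chosen only for illustration.
PATTERN = r"[a-z]+@[a-z]+\.(?:com|org)"
TEXT = "contact alice@example.com or bob@example.org today " * 200


def time_module(module_name, pattern, text, repeat=5, number=50):
    """Return the best per-call time in ms for findall(), or None if the
    module cannot be imported (e.g. "regex" is not installed)."""
    try:
        mod = importlib.import_module(module_name)
    except ImportError:
        return None
    compiled = mod.compile(pattern)
    timings = timeit.repeat(lambda: compiled.findall(text),
                            repeat=repeat, number=number)
    # Use the minimum of the repeats to reduce noise from other processes.
    return min(timings) / number * 1000.0


def compare():
    """Time the same pattern under "re" and, if available, "regex"."""
    return {name: time_module(name, PATTERN, TEXT) for name in ("re", "regex")}


if __name__ == "__main__":
    for name, ms in compare().items():
        if ms is None:
            print("%-6s not installed" % name)
        else:
            print("%-6s %.4f ms per call" % (name, ms))
```

Both modules are driven through the same compile()/findall() calls, which
mirrors the "import regex as re" substitution used for the table.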
Anyone searching for "regular expression python" will get the Python
documentation on "re" as the first hit. Naturally, a new developer will start
with "re" from day 1 and not bother to look elsewhere for alternatives later on.
We ran a query for "import re" against the large cloud computing project
OpenStack (3.7 million lines of source code, the majority of it written in
Python) and got ~1000 hits.
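A rough way to reproduce this kind of count on any source tree is sketched
below. This is an assumption about how such a query could be done, not the
exact query we ran; the function name count_import_re is made up, and the
simple pattern only catches lines that start with "import re" or
"from re import" (it misses forms such as "import os, re").

```python
import os
import re

# Matches "import re" / "from re import ..." at the start of a line,
# but not "import regex" or "import requests".
IMPORT_RE = re.compile(r"^\s*(?:import|from)\s+re\b")


def count_import_re(root):
    """Count lines importing the "re" module in all .py files under root."""
    hits = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            if not filename.endswith(".py"):
                continue
            path = os.path.join(dirpath, filename)
            # Ignore undecodable bytes so odd files do not abort the scan.
            with open(path, encoding="utf-8", errors="ignore") as handle:
                hits += sum(1 for line in handle if IMPORT_RE.match(line))
    return hits
```

Calling count_import_re() with the path to a source checkout returns the
number of matching lines.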
That said, IMHO, it would be nice to capture ("borrow") the performance
benefits of "regex" and merge them into "re", so that users do not need to
know or worry about packaging/installing anything.
Cheers,
Peter
-----Original Message-----
From: Python-Dev
[mailto:[email protected]] On Behalf Of
Nick Coghlan
Sent: Tuesday, January 31, 2017 1:54 AM
To: Barry Warsaw <[email protected]>
Cc: [email protected]
Subject: Re: [Python-Dev] re performance
On 30 January 2017 at 15:26, Barry Warsaw <[email protected]> wrote:
> On Jan 30, 2017, at 12:38 PM, Nick Coghlan wrote:
>
>>I think there are 3 main candidates that could fit that bill:
>>
>>- requests
>>- setuptools
>>- regex
>
> Actually, I think pkg_resources would make an excellent candidate.
> The setuptools crew is working on a branch that would allow for
> setuptools and pkg_resources to be split, which would be great for
> other reasons. Splitting them may mean that pkg_resources could
> eventually be added to the stdlib, but as an intermediate step, it
> could also test out this idea. It probably has a lot less of the baggage
> that you outline.
Yep, if/when pkg_resources is successfully split out from the rest of
setuptools, I agree it would also be a good candidate for stdlib bundling -
version independent runtime access to the database of installed packages is a
key capability for many use cases, and not currently something we support
especially well.
It's also far more analogous to the existing pip bundling, since
setuptools/pkg_resources are also maintained under the PyPA structure.
Cheers,
Nick.
--
Nick Coghlan | [email protected] | Brisbane, Australia
_______________________________________________
Python-Dev mailing list
[email protected]
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe:
https://mail.python.org/mailman/options/python-dev/peter.xihong.wang%40intel.com