Ruben Vorderman <r.h.p.vorder...@lumc.nl> added the comment:

> Within the stdlib, I'd focus only on using things that can be used in a 100% 
> api compatible way with the existing modules.

> Otherwise creating a new module and putting it up on PyPI to expose the 
> functionality from the libraries you want makes sense and will be easier to 
> make available to everyone on existing Python versions rather than waiting on 
> getting something into CPython.

Agreed. 100% backwards compatibility is more important than speed, and a 
separate module gets this to users faster than implementing it in CPython. 
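
A minimal sketch of how "API compatible" could be checked for a candidate 
replacement, assuming the candidate is a module that mimics zlib. The helper 
names missing_public_names and streams_interoperate are hypothetical and not 
part of any existing library:

```python
import zlib


def missing_public_names(candidate, reference=zlib):
    """List public names the reference module has that the candidate lacks."""
    public = {name for name in dir(reference) if not name.startswith("_")}
    return sorted(public - set(dir(candidate)))


def streams_interoperate(candidate, reference=zlib, payload=b"example " * 1024):
    """Check that each module can decompress what the other compressed."""
    return (
        reference.decompress(candidate.compress(payload)) == payload
        and candidate.decompress(reference.compress(payload)) == payload
    )
```

A candidate would only be a drop-in replacement if the first function returns 
an empty list and the second returns True. This covers names and one-shot 
round trips only, not signatures, streaming objects, or error behaviour.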

I already have a PR open in the xopen module to implement the use of the igzip 
program. Implementing a standalone module modelled on CPython's gzip module 
will be more work, but I will certainly consider it. Thanks for the suggestion! 
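
To illustrate the general idea of using an external program when it is 
available, here is a rough sketch. It is not the code from the xopen PR. The 
function name read_gzip_bytes is hypothetical, and the sketch assumes the 
external program accepts gzip-style "-d -c" flags (decompress to stdout):

```python
import gzip
import shutil
import subprocess


def read_gzip_bytes(path, program="igzip"):
    """Return the decompressed contents of a gzip file.

    Uses the external `program` if it is found on PATH, otherwise falls
    back to the stdlib gzip module.
    """
    executable = shutil.which(program)
    if executable is not None:
        # Assumption: the program understands gzip-style `-d -c`.
        result = subprocess.run(
            [executable, "-d", "-c", path],
            stdout=subprocess.PIPE,
            check=True,
        )
        return result.stdout
    # Fallback: pure stdlib, always available.
    with gzip.open(path, "rb") as handle:
        return handle.read()
```

Because the fallback is the stdlib gzip module, callers get the same bytes 
whether or not the faster program is installed.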

> There is a caveat to using any of these: how well maintained and security 
> vetted are all of the code paths in the implementation?  zlib proper gets 
> massive security attention.  Its low rate of change and staleness are a 
> feature.

I hadn't considered that, so I looked at the CVE page for zlib. The latest 
issues are from 2017; the ones before that are from 2005. This is the 2017 
report: 
https://wiki.mozilla.org/images/0/09/Zlib-report.pdf 
Note how it states that support for old compilers etc. is a source of 
vulnerabilities. zlib-ng got rid of precisely these parts. On the other hand, 
Mozilla notes that zlib is a great candidate for periodic security audits, for 
the same reasons you mention.

> FWIW I tend to avoid software provided by Intel given any other choice.

I know the feeling. They rightly have a very bad reputation for things they 
did in the past. But this particular code is open source and can be compiled 
with open source compilers. Part of it is written in assembly to get the speed 
advantage. I benchmarked it on my AMD processor, and I get enormous speed 
advantages out of it there too.

>  even less about Intels self serving arm-ignoring oddity.

They *do* provide instructions for building on Arm, right in their README: 
https://github.com/intel/isa-l#building-isa-l 
I think it is very unfair to be so dismissive just because Intel pays the 
developers. This is good work that speeds up bioinformatics workloads, which 
in turn helps us to help more patients.

On the whole, I think the arguments for making a separate module are very 
strong, so that is the appropriate way forward. I'd love everyone to switch to 
more efficient deflate implementations, but CPython may not be the right 
project to drive this change. At least this discussion is now here as a 
reference for other people who are curious about improving this aspect of 
performance.

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue41566>
_______________________________________