[issue30468] Propagate zipfile.py pypy issue #905 patch to CPython 3.7

2017-05-26 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

The reason for using bytes concatenation rather than accumulating chunks in a 
list is that in most cases one of the arguments is an empty bytes object 
(appending to an empty buffer, or decompressing a file with a large compression 
block), and this case is optimized in CPython. In most cases there is at most 
one nontrivial bytes concatenation per read operation, and using b''.join() is 
slower in that case.
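
To make the comparison concrete, here is a minimal sketch that times the two
approaches when one operand is empty. The helper names (append_concat,
append_join, compare) and the 4096-byte chunk size are made up for this
illustration and are not taken from zipfile.py. Timings depend on the
interpreter and machine, so no figures are claimed here.

```python
import timeit


def append_concat(buffer, chunk):
    # Plain bytes concatenation, the pattern discussed above.
    return buffer + chunk


def append_join(buffer, chunk):
    # The alternative: collect the pieces and join them.
    return b''.join([buffer, chunk])


def compare(number=100_000, chunk_size=4096):
    # Time both helpers for the common case of an empty buffer.
    chunk = b'x' * chunk_size
    results = {}
    for func in (append_concat, append_join):
        results[func.__name__] = timeit.timeit(
            lambda: func(b'', chunk), number=number)
    return results


if __name__ == '__main__':
    for name, seconds in compare().items():
        print('%-14s %.4f s' % (name, seconds))
```

Both helpers return equal bytes objects; only the cost of getting there
differs, which is the point being argued.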

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue30468] Propagate zipfile.py pypy issue #905 patch to CPython 3.7

2017-05-26 Thread Serhiy Storchaka

Changes by Serhiy Storchaka :


--
resolution:  -> not a bug
stage:  -> resolved
status: open -> closed




[issue30468] Propagate zipfile.py pypy issue #905 patch to CPython 3.7

2017-05-26 Thread Shubha Ramani

Shubha Ramani added the comment:

Having compared the PyPy changes in the attached diff with the latest CPython 3 
source on GitHub, I don't see a need for improvement. CPython 3's zipfile.py 
appears to have the same changes, and the read() method of class 
ZipExtFile(io.BufferedIOBase) is vastly improved compared to what PyPy had. In 
fact, it has been completely rewritten.

The CPython 2.7 version of this bug was promptly closed as a duplicate:
http://bugs.python.org/issue30467

Perhaps the changes from CPython 3 are not being backported to CPython 2?

This bug can be closed because CPython 3 does not have this issue.




[issue30468] Propagate zipfile.py pypy issue #905 patch to CPython 3.7

2017-05-25 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

I'm not the original author of zipfile. The initial decryption code was added 
in issue698833.

I like your idea of replacing the table-generating code with a precomputed 
table. Please open a separate issue for this.




[issue30468] Propagate zipfile.py pypy issue #905 patch to CPython 3.7

2017-05-25 Thread Alecsandru Patrascu

Alecsandru Patrascu added the comment:

Serhiy, I am curious why you chose to compute the CRC32 table every time. It is 
standard (the generator polynomial does not change) and will always produce the 
same values. It is also less computationally intensive to load a precomputed 
array than to calculate it each time a zip archive is extracted.
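
As an illustration of the point, here is a minimal sketch of building the
CRC-32 table once and reusing it. The names (CRC32_POLY, generate_crc_table,
CRC_TABLE, crc32_with_table) are hypothetical and this is not the code in
zipfile.py. It assumes the standard reflected CRC-32 polynomial 0xEDB88320.

```python
import zlib

# Standard reflected CRC-32 polynomial (assumed here for illustration).
CRC32_POLY = 0xEDB88320


def generate_crc_table():
    # Build the 256-entry lookup table, one entry per possible byte value.
    table = []
    for i in range(256):
        crc = i
        for _ in range(8):
            if crc & 1:
                crc = (crc >> 1) ^ CRC32_POLY
            else:
                crc >>= 1
        table.append(crc)
    return table


# Computed once at import time, so later users only pay for a lookup.
CRC_TABLE = generate_crc_table()


def crc32_with_table(data, table=CRC_TABLE):
    # Table-driven CRC-32 over a bytes object.
    crc = 0xFFFFFFFF
    for byte in data:
        crc = (crc >> 8) ^ table[(crc ^ byte) & 0xFF]
    return crc ^ 0xFFFFFFFF


if __name__ == '__main__':
    sample = b'hello world'
    # The table-driven result should agree with zlib's implementation.
    print(crc32_with_table(sample) == zlib.crc32(sample))
```

Because the table depends only on the polynomial, regenerating it always
yields the same 256 values, so it could equally be stored as a literal.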

--
nosy: +alecsandru.patrascu




[issue30468] Propagate zipfile.py pypy issue #905 patch to CPython 3.7

2017-05-25 Thread Shubha Ramani

Shubha Ramani added the comment:

Serhiy, yes, what you said makes sense. Thanks for clarifying. Updates 
(benchmarking results) will follow shortly. Stay tuned.




[issue30468] Propagate zipfile.py pypy issue #905 patch to CPython 3.7

2017-05-25 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

I assigned this issue to myself as the maintainer of the zipfile module. This 
means that I undertake to examine the benchmarking results, review the proposed 
patch, and merge it if it improves performance and doesn't have negative side 
effects. This shouldn't stop you from writing your patch.

Actually, there was a reason why the code is written that way. But maybe that 
reason is no longer relevant, or there are stronger arguments for other 
approaches. It is hard to say until I reproduce the benchmarking results.




[issue30468] Propagate zipfile.py pypy issue #905 patch to CPython 3.7

2017-05-25 Thread Shubha Ramani

Shubha Ramani added the comment:

Serhiy, sure, I will attach proof of the performance bottleneck on 2.7 and 3.7 
before I submit a patch. Please assign this bug to me.




[issue30468] Propagate zipfile.py pypy issue #905 patch to CPython 3.7

2017-05-25 Thread Shubha Ramani

Shubha Ramani added the comment:

Please assign this bug to me. I will submit a patch.




[issue30468] Propagate zipfile.py pypy issue #905 patch to CPython 3.7

2017-05-25 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

Could you provide a benchmarking script that demonstrates a performance 
bottleneck?
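
For illustration only, a minimal sketch of what such a script might look like.
It builds an archive in memory and times reading every member in chunks. The
function names, member count, member size and chunk size are arbitrary
assumptions, not anything taken from the attached patch.

```python
import io
import os
import timeit
import zipfile


def build_archive(member_count=20, member_size=64 * 1024):
    # Create an in-memory zip with random (poorly compressible) members.
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, 'w', zipfile.ZIP_DEFLATED) as zf:
        for i in range(member_count):
            zf.writestr('member_%d.bin' % i, os.urandom(member_size))
    return buf.getvalue()


def read_all_members(archive_bytes, chunk_size=8192):
    # Read every member in fixed-size chunks and return the byte count.
    total = 0
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as zf:
        for name in zf.namelist():
            with zf.open(name) as member:
                while True:
                    chunk = member.read(chunk_size)
                    if not chunk:
                        break
                    total += len(chunk)
    return total


if __name__ == '__main__':
    archive = build_archive()
    seconds = timeit.timeit(lambda: read_all_members(archive), number=10)
    print('10 passes: %.3f s' % seconds)
```

Running the same script under each interpreter or branch being compared would
give numbers that can be set side by side.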

The patch is against 2.7. Please provide a patch against the master branch.




[issue30468] Propagate zipfile.py pypy issue #905 patch to CPython 3.7

2017-05-25 Thread Serhiy Storchaka

Changes by Serhiy Storchaka :


--
assignee:  -> serhiy.storchaka
components: +Library (Lib)
nosy: +alanmcintyre, serhiy.storchaka, twouters




[issue30468] Propagate zipfile.py pypy issue #905 patch to CPython 3.7

2017-05-24 Thread Shubha Ramani

New submission from Shubha Ramani:

PyPy had a longstanding issue:
ZipFile.extractall is very slow compared to CPython 2.6

https://bitbucket.org/pypy/pypy/issues/905/zipfileextractall-is-very-slow-compared-to

which has been fixed in the PyPy code base. The changes were entirely in 
zipfile.py (see the attached patch for PyPy).

The patch fixed a significant performance bottleneck in PyPy.

--
files: issue905.diff
keywords: patch
messages: 294413
nosy: shubhar
priority: normal
severity: normal
status: open
title: Propagate zipfile.py pypy issue #905 patch to CPython 3.7
type: performance
versions: Python 3.7
Added file: http://bugs.python.org/file46899/issue905.diff
