[issue10030] Patch for zip decryption speedup

2017-03-30 Thread Serhiy Storchaka

Changes by Serhiy Storchaka :


--
resolution:  -> fixed
stage: patch review -> resolved
status: open -> closed

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue10030] Patch for zip decryption speedup

2017-03-30 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:


New changeset 06e522521c06671b4559eecf9e2a185c2d62c141 by Serhiy Storchaka in 
branch 'master':
bpo-10030: Sped up reading encrypted ZIP files by 2 times. (#550)
https://github.com/python/cpython/commit/06e522521c06671b4559eecf9e2a185c2d62c141


--




[issue10030] Patch for zip decryption speedup

2017-03-25 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

Brian, you requested tests, but PR 550 doesn't add new API and doesn't change 
the behavior of public API. No new tests are needed.

--
nosy: +brian.curtin
versions: +Python 3.7 -Python 3.4




[issue10030] Patch for zip decryption speedup

2017-03-07 Thread Serhiy Storchaka

Changes by Serhiy Storchaka :


--
pull_requests: +452




[issue10030] Patch for zip decryption speedup

2012-11-04 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

I quote from Gregory P. Smith (msg91897):


The decryption provided by the zipfile module is for the worthless
32-bit crc based encryption of zipfiles.  I think promoting the use of
that is a bad idea.

zipfile can be used by people to get their data out of such files.  We
should not encourage them to put it and/or their code into such a stupid
format.


I think the effort required to speed up this almost useless feature is 
excessive.

If someone wants to implement strong encryption for zip files, they are welcome 
to.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue10030
___



[issue10030] Patch for zip decryption speedup

2012-11-04 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

See also criticism in the original discussion: 
http://mail.python.org/pipermail/python-dev/2009-August/091450.html .

--




[issue10030] Patch for zip decryption speedup

2012-11-04 Thread Robert de Vries

Robert de Vries added the comment:

My use case is decrypting files that are hundreds of megabytes in size. This is 
so slow (about an hour or so) that it is quite useless.

I do agree that the encryption is worthless, but that is not important for my 
use case, where I want to discourage people from reverse engineering the 
contents.
If it is as dangerous as some people have pointed out, it should be removed, at 
the cost of not supporting a standard feature of ZIP files.
In my opinion you either support a feature and support it well (efficiently), 
or you don't. As it stands now, users will be disappointed when using a 
supported feature.

Some people argue that adding C code to Python is dangerous as it will lead to 
bugs, vulnerabilities etc.
You could dismiss every addition with C code to Python with this argument, so 
there must be some positive aspects to outweigh the negative side. The negative 
side is fairly small (50 lines of very simple C code), plus some standard 
Python glue code.
The benefit is a 100-fold increase in performance and the removal of 1 line of 
documentation stating that this feature is extremely slow.
(patch attached)

--
Added file: http://bugs.python.org/file27881/doc.patch




[issue10030] Patch for zip decryption speedup

2012-11-04 Thread R. David Murray

R. David Murray added the comment:

We aren't particularly interested in helping people make their files slightly 
harder to reverse engineer, either, so I don't think that is a good enough 
reason for accepting this.  There might be other reasons that are good enough, 
but I don't think that is one of them.

--




[issue10030] Patch for zip decryption speedup

2012-11-04 Thread Christian Heimes

Christian Heimes added the comment:

From the zlib FAQ:

38. How can I encrypt/decrypt zip files with zlib?

zlib doesn't support encryption. The original PKZIP encryption is very weak
and can be broken with freely available programs. To get strong encryption,
use GnuPG, http://www.gnupg.org/ , which already includes zlib compression.
For PKZIP compatible encryption, look at http://www.info-zip.org/

I don't see the point of a weak and easily breakable encryption.

--
nosy: +christian.heimes




[issue10030] Patch for zip decryption speedup

2012-11-04 Thread Robert de Vries

Robert de Vries added the comment:

If the encryption is so horrible, why is there any support (with bad 
performance) at all in Python?
It would be better to remove it altogether.
That would prevent users from building software on this feature only to find 
out later how bad the performance is. (This is the main reason why I have 
submitted this patch.)
If the support had not been in Python, I would not have used this feature to 
begin with.

To reiterate my previous point: either support something and do it well, or 
don't. Don't do half a job.

Please note that there are more attempts to fix this, so I am not completely 
alone here.
http://pypi.python.org/pypi/czipfile

--




[issue10030] Patch for zip decryption speedup

2012-11-04 Thread Antoine Pitrou

Antoine Pitrou added the comment:

> If the encryption is so horrible why is there any support (with bad
> performance) at all in Python?
> It would be better to remove it altogether.

We don't remove it as it would break existing programs which rely on
this feature. However adding a bunch of C code to improve performance of
such a questionable feature is controversial.

I wouldn't be against the acceleration myself, however I am not
interested in reviewing or accepting it, sorry.

--




[issue10030] Patch for zip decryption speedup

2012-11-04 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

Here is a patch which optimizes the Python implementation of the ZIP decryptor 
(a 2x speedup).  This is almost the maximum that can be achieved without 
significant degradation of maintainability.

Of course, 2x is less than 100x, but it is more portable and costs almost 
nothing.  If that's not enough, I suggest using an external unzip. You can even 
run multiple unzips at a time for parallel extraction of multiple files.

If someone implements strong encryption for ZIP files in the future, it may 
require a C accelerator module, and there may then be a place for PKWARE's 
traditional encryption as well.

--
Added file: http://bugs.python.org/file27882/zipfile_decryptor_speedup.patch




[issue10030] Patch for zip decryption speedup

2012-11-04 Thread R. David Murray

R. David Murray added the comment:

> If the encryption is so horrible why is there any support (with bad 
> performance) at all in Python?

I would say it is there so that people can use Python to decrypt an encrypted 
zip archive they have been sent that was generated by some other tool.  I would 
say this is not a particularly strong use case, since there are other tools that 
can be used for this, but as Antoine said, removing the feature could break 
existing working code, so we are unlikely to do it.
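
As a rough illustration of that use case, here is a minimal sketch using only the documented zipfile API. The helper names encrypted_members and read_member are hypothetical; the sketch assumes that bit 0 of ZipInfo.flag_bits marks an encrypted member (as the ZIP format defines it) and that the password is passed as bytes.

```python
import io
import zipfile


def encrypted_members(archive):
    """Return the names of members whose 'encrypted' flag (bit 0) is set."""
    with zipfile.ZipFile(archive) as zf:
        return [info.filename for info in zf.infolist()
                if info.flag_bits & 0x1]


def read_member(archive, name, password=None):
    """Read one member; the password only matters if the member is encrypted."""
    with zipfile.ZipFile(archive) as zf:
        return zf.read(name, pwd=password)


if __name__ == "__main__":
    # Build a small unencrypted archive in memory just to exercise the helpers;
    # the zipfile module cannot create encrypted archives itself.
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("a.txt", b"data")
    print(encrypted_members(buf))
    print(read_member(buf, "a.txt"))
```

With a real encrypted archive, read_member would be called with the archive path and the password as a bytes object.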

--




[issue10030] Patch for zip decryption speedup

2012-11-04 Thread Robert de Vries

Robert de Vries added the comment:

The current situation is now that the decryption is part of Python. It is well 
known to be computationally intensive and should therefore be implemented in C.

This patch provides that support.

The discussion about whether Python should support the decryption is behind us, 
as the support is already in.

The only discussion should be about whether there are enough users wanting this 
performance improvement to add it.

From the download statistics of czipfile I would say that there are roughly 
2500 users interested.

--




[issue10030] Patch for zip decryption speedup

2012-11-03 Thread Robert de Vries

Robert de Vries added the comment:

Attached you will find the updated patch for the python 3 tree as of now.

I have measured a speed-up of more than a factor of 100.

--
nosy: +rhdv
Added file: http://bugs.python.org/file27867/zipdecrypt-3.patch




[issue10030] Patch for zip decryption speedup

2012-11-03 Thread Robert de Vries

Robert de Vries added the comment:

Patch for python 2.7

Same patch as for python 3, backported to python 2.7

Tested on Linux only.

--
Added file: http://bugs.python.org/file27868/zipdecrypt-2.7.patch




[issue10030] Patch for zip decryption speedup

2012-11-03 Thread Antoine Pitrou

Changes by Antoine Pitrou pit...@free.fr:


--
nosy: +serhiy.storchaka
versions: +Python 3.4 -Python 3.2




[issue10030] Patch for zip decryption speedup

2010-11-04 Thread Shashank

Shashank shashank.sunny.si...@gmail.com added the comment:

I had uploaded an incorrect patch. A new, corrected patch against trunk (on 
Mac OS) is uploaded.

--
Added file: http://bugs.python.org/file19494/zipdecrypt.patch




[issue10030] Patch for zip decryption speedup

2010-11-04 Thread Shashank

Changes by Shashank shashank.sunny.si...@gmail.com:


Removed file: http://bugs.python.org/file19494/zipdecrypt.patch




[issue10030] Patch for zip decryption speedup

2010-11-04 Thread Shashank

Changes by Shashank shashank.sunny.si...@gmail.com:


Added file: http://bugs.python.org/file19495/zipdecrypt.patch




[issue10030] Patch for zip decryption speedup

2010-10-23 Thread Shashank

Shashank shashank.sunny.si...@gmail.com added the comment:

> the C module should be private and therefore called _zipdecrypt
done

> if you want to avoid API mismatch, you could give a tp_call to your C 
> decrypter object, rather than a decrypt method
done

> - you can put all initialization code in zipdecrypt_new and avoid the need 
> for zipdecrypt_init
keeping this similar to the existing _ZipDecrypter class in ZipFile (all 
initialization in init rather than new), which was probably to allow re-init 
and re-use of one instance

> it's better to use the y* code in PyArg_ParseTuple, rather than s#
y* does not seem to be available in 2.7, using s* instead

> you should define your module as PY_SSIZE_T_CLEAN and use Py_ssize_t as 
> length variables (rather than int)
done

> you *mustn't* change the contents of the buffer which is given you by s# or 
> y*, since that buffer is read-only (it can be a bytes object); instead, 
> create a new bytes object using PyBytes_FromStringAndSize(NULL, length) and 
> write into that; or, if you want a read-write buffer, use the w* code
corrected, not altering the input buffer, reading input buffer as s*

--
Added file: http://bugs.python.org/file19347/zipdecrypt.patch




[issue10030] Patch for zip decryption speedup

2010-10-12 Thread Shashank

Shashank shashank.sunny.si...@gmail.com added the comment:

Attached is a patch with changes in Lib/test/test_zipfile.py to test both C and 
pure-py impls (on systems where the C impl is present).

Admittedly, this approach to emulating the absence of C impl is a bit hacky. 
This is primarily because the changed class is not a part of the public API and 
hence not being tested directly.

David, could you verify that the approach is ok?

--
Added file: http://bugs.python.org/file19196/zipdecrypt.patch




[issue10030] Patch for zip decryption speedup

2010-10-12 Thread Shashank

Changes by Shashank shashank.sunny.si...@gmail.com:


Added file: http://bugs.python.org/file19197/zipdecrypt.patch




[issue10030] Patch for zip decryption speedup

2010-10-12 Thread Antoine Pitrou

Antoine Pitrou pit...@free.fr added the comment:

Hello,

Some quick comments:
- the C module should be private and therefore called _zipdecrypt
- if you want to avoid API mismatch, you could give a tp_call to your C 
decrypter object, rather than a decrypt method
- you can put all initialization code in zipdecrypt_new and avoid the need for 
zipdecrypt_init
- it's better to use the y* code in PyArg_ParseTuple, rather than s#
- you should define your module as PY_SSIZE_T_CLEAN and use Py_ssize_t as 
length variables (rather than int)
- you *mustn't* change the contents of the buffer which is given you by s# or 
y*, since that buffer is read-only (it can be a bytes object); instead, 
create a new bytes object using PyBytes_FromStringAndSize(NULL, length) and 
write into that; or, if you want a read-write buffer, use the w* code

--
nosy: +pitrou
stage:  -> patch review
versions: +Python 3.2 -Python 2.7




[issue10030] Patch for zip decryption speedup

2010-10-07 Thread Shashank

Shashank shashank.sunny.si...@gmail.com added the comment:

I have updated the patch with a check for the availability of C impl and to use 
pure-py impl as a fallback.

How do you suggest the tests should change? As I mentioned before, in my 
understanding the existing tests should still work, since there is no change in 
the API.

One can simulate the absence of the C impl on a system where it is present, but 
AFAIU this is not what is usually done (e.g., in the case of the optional zlib 
dependency in the same module).
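
The fallback and the "simulate absence" idea can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in the attached patch: _zipdecrypt is the module name suggested in review, while ZipDecrypter, PyZipDecrypter, load_decrypter and load_without_accelerator are hypothetical names. It relies on the documented behaviour that a None entry in sys.modules makes the import fail.

```python
import sys
from unittest import mock


class PyZipDecrypter:
    """Stand-in for a pure-Python decrypter (hypothetical name)."""


def load_decrypter():
    """Prefer the C accelerator; fall back to pure Python if it is missing."""
    try:
        from _zipdecrypt import ZipDecrypter  # hypothetical C module
    except ImportError:
        return PyZipDecrypter
    return ZipDecrypter


def load_without_accelerator():
    """Simulate a build where the C module is absent, even if it exists."""
    # A None entry in sys.modules makes "import _zipdecrypt" raise ImportError.
    with mock.patch.dict(sys.modules, {"_zipdecrypt": None}):
        return load_decrypter()
```

A test suite could then run the same decryption tests once against load_decrypter() and once against load_without_accelerator(), so that both code paths are covered on systems where the C module is built.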

--
Added file: http://bugs.python.org/file19149/zipdecrypt.patch




[issue10030] Patch for zip decryption speedup

2010-10-07 Thread R. David Murray

R. David Murray rdmur...@bitdance.com added the comment:

It is what is normally done *now* when there is both a C and a python 
implementation (see, for example, test_datetime.py and test_io.py for two 
different approaches to that).  Not all tests have been updated to this 
practice.

Thanks for working on this.

--




[issue10030] Patch for zip decryption speedup

2010-10-05 Thread Shashank

New submission from Shashank shashank.sunny.si...@gmail.com:

As promised in this thread 
http://mail.python.org/pipermail/python-dev/2009-August/091450.html (a year 
ago!), attached is a patch that replaces the simple zip decryption logic 
written in pure Python with an implementation in C.

As reported in the link above, this can result in speedups of up to a couple of 
orders of magnitude.

There doesn't seem to be any need to add new tests, as this patch doesn't 
change any public API.

--
components: Library (Lib)
files: zipdecrypt.patch
keywords: patch
messages: 118030
nosy: shashank
priority: normal
severity: normal
status: open
title: Patch for zip decryption speedup
type: performance
versions: Python 2.7
Added file: http://bugs.python.org/file19135/zipdecrypt.patch




[issue10030] Patch for zip decryption speedup

2010-10-05 Thread R. David Murray

R. David Murray rdmur...@bitdance.com added the comment:

It would be nice to retain the pure Python version as a fallback for 
non-CPython implementations; that will require tweaking the tests to make sure 
both are tested.

--
nosy: +r.david.murray
