[issue1675951] Performance for small reads and fix seek problem

2010-09-22 Thread Florian Festi

Florian Festi florianfe...@users.sourceforge.net added the comment:

Stupid me! I ran the tests against my system's gzip version (Py 3.1). The 
performance issue is basically fixed by rev 77289. Performance is even a bit 
better than with my original patch, by maybe 10-20%. The only test case where 
it performs worse is:

Random 10485760 byte block test
Original gzip  Write:   20.452 s  Read:    2.931 s
New gzip       Write:   20.518 s  Read:    1.247 s

I don't know whether it is worth bothering. Maybe increasing the maximum chunk 
size would improve this, but I haven't tried that yet.

WRT seeking:

I now have two patches that eliminate the need for seek() in normal operation 
(rewind obviously still needs seek()). Both are based on the PaddedFile class. 
The first patch creates a PaddedFile object only while switching from an old 
member to a new one, while the second wraps the fileobj all the time. 
Performance tests show that wrapping is cheap. The first patch is a bit ugly, 
while the second requires an implementation of seek() and may cause problems 
if new fileobj methods are used that interfere with the PaddedFile's 
internals.
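
To illustrate the idea (this is not the code from either patch): a minimal 
sketch of a wrapper that hands back bytes that were read too far, so the 
underlying file never has to seek backwards. The class name PaddedFileSketch 
and its prepend() method are assumptions made for this example, and negative 
read sizes are not handled.

```python
import io


class PaddedFileSketch:
    """Serve pushed-back bytes first, then read from the wrapped file."""

    def __init__(self, fileobj, prepend=b""):
        self._fileobj = fileobj
        self._pending = prepend  # bytes to hand out before touching fileobj

    def prepend(self, data):
        # Push back bytes that already belong to the next gzip member.
        self._pending = data + self._pending

    def read(self, size):
        if not self._pending:
            return self._fileobj.read(size)
        head, self._pending = self._pending[:size], self._pending[size:]
        if len(head) < size:
            # Pending bytes are exhausted; fetch the remainder from the file.
            head += self._fileobj.read(size - len(head))
        return head


raw = io.BytesIO(b"member-one|member-two")
padded = PaddedFileSketch(raw)
chunk = padded.read(14)      # reads past the "|" boundary
padded.prepend(chunk[11:])   # give back the bytes of the next member
rest = padded.read(100)      # continues without any seek() on raw
```

Only read() and prepend() are needed here; a rewind would still require a 
seek() on the wrapped file.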

So I leave the choice of which one is preferred to the module owner.

The patch creates another problem which is not yet fixed: the implementation 
of .seekable() becomes wrong. As one can now use non-seekable files, the 
implementation should check whether the file object used for reading is really 
seekable. As this is my first Py3k work I'd prefer it if this were solved by 
someone else (but that should be pretty easy).
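
A rough sketch of what such a check could look like, assuming the reader 
simply delegates to the wrapped file object. ReaderSketch and PipeLike are 
hypothetical names used only for this example; this is not the gzip module's 
actual code.

```python
import io


class ReaderSketch:
    """Hypothetical reader that reports seekability of its wrapped file."""

    def __init__(self, fileobj):
        self._fileobj = fileobj

    def seekable(self):
        # Ask the wrapped object; objects without seekable() are treated
        # as non-seekable.
        probe = getattr(self._fileobj, "seekable", None)
        return bool(probe()) if callable(probe) else False


class PipeLike(io.RawIOBase):
    """Stand-in for a pipe or socket: readable but not seekable."""

    def readable(self):
        return True

    def seekable(self):
        return False

    def readinto(self, b):
        return 0  # always at end of stream


in_memory = ReaderSketch(io.BytesIO(b"data")).seekable()
from_pipe = ReaderSketch(PipeLike()).seekable()
```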

--
Added file: 
http://bugs.python.org/file18964/0001-Avoid-the-need-of-seek-ing-on-the-file-read.patch

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue1675951
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue1675951] Performance for small reads and fix seek problem

2010-09-22 Thread Florian Festi

Changes by Florian Festi florianfe...@users.sourceforge.net:


Added file: 
http://bugs.python.org/file18965/0002-Avoid-the-need-of-seek-ing-on-the-file-read-2.patch




[issue1675951] [gzip] Performance for small reads and fix seek problem

2010-09-18 Thread Florian Festi

Florian Festi florianfe...@users.sourceforge.net added the comment:

I updated the performance script to Py3. You still need to change the 
"import gzipnew" line to actually load the modified module. Right now it just 
compares the stdlib gzip module to itself.
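
For readers without the attachment, a small-reads comparison of this kind 
could be sketched as follows. This is not the attached script: the function 
time_small_reads, the payload size and the chunk size are assumptions, and 
gzipnew stands for the patched module, falling back to the stdlib gzip if it 
cannot be imported.

```python
import gzip
import io
import os
import time

try:
    import gzipnew  # assumed name of the patched module
except ImportError:
    gzipnew = gzip  # fallback: compares the stdlib gzip to itself


def time_small_reads(module, payload, chunk_size=10):
    """Compress payload in memory, then time reading it in small chunks."""
    buf = io.BytesIO()
    with module.GzipFile(fileobj=buf, mode="wb") as f:
        f.write(payload)
    buf.seek(0)
    start = time.perf_counter()
    total = 0
    with module.GzipFile(fileobj=buf, mode="rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            total += len(chunk)
    return total, time.perf_counter() - start


payload = os.urandom(100_000)
for name, module in (("Original gzip", gzip), ("New gzip", gzipnew)):
    total, seconds = time_small_reads(module, payload)
    print(f"{name}: read {total} bytes in {seconds:.3f} s")
```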

--
Added file: http://bugs.python.org/file18916/test_gzip2-py3.py




[issue1675951] [gzip] Performance for small reads and fix seek problem

2010-09-18 Thread Florian Festi

Florian Festi florianfe...@users.sourceforge.net added the comment:

Attached is the result of a run with the stdlib gzip module only. The results 
indicate that performance is still as bad as on Python 2. The Python 3 gzip 
module also still makes use of tell() and seek(). So both arguments for 
including this patch are still valid.
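
One way to check the tell()/seek() usage on a given Python version is to read 
through a file object that records those calls. This is a sketch, with 
CallRecorder being a hypothetical helper written for this example; what gets 
recorded depends on the Python version in use.

```python
import gzip
import io


class CallRecorder(io.BytesIO):
    """In-memory file that records every seek() and tell() call."""

    def __init__(self, data):
        super().__init__(data)
        self.calls = []

    def seek(self, *args):
        self.calls.append("seek")
        return super().seek(*args)

    def tell(self):
        self.calls.append("tell")
        return super().tell()


compressed = gzip.compress(b"hello world" * 100)
source = CallRecorder(compressed)
with gzip.GzipFile(fileobj=source, mode="rb") as f:
    data = f.read()
print("seek/tell calls during read:", source.calls)
```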

Porting the patch will include some real work to get the bytes vs string split 
right.

--
Added file: http://bugs.python.org/file18917/result-py3.txt




[issue1675951] [gzip] Performance for small reads and fix seek problem

2010-06-17 Thread Florian Festi

Florian Festi florianfe...@users.sourceforge.net added the comment:

There are no compatibility concerns I am aware of. The new implementation no 
longer needs a seek()able file. Of course an implemented seek() method won't 
hurt anyone. The additional tests are only there to point out the problems of 
the old implementation.

So no flag is needed to maintain compatibility. The patch just has to be 
reviewed and then applied. If there are any concerns or questions I'll be 
glad to assist.

Florian

--
