Roundup Robot added the comment:
New changeset cc02eca14526 by Nadeem Vawda in branch 'default':
Issue #16034 follow-up: Apply optimizations to the lzma module.
http://hg.python.org/cpython/rev/cc02eca14526
--
___
Python tracker
Nadeem Vawda added the comment:
I've released v0.95 of bz2file, which incorporates all the optimizations
discussed here. The performance should be similar to 2.x's bz2 in most cases.
It is still a lot slower when calling read(10) or read(1), but I hope no-one is
doing that anywhere where
Nadeem Vawda added the comment:
Ah, nice - I didn't think of that optimization. Neater and faster.
I've committed this patch [e6d872b61c57], along with a minor bugfix
[7252f9f95fe6], and another optimization for readline()/readlines()
[6d7bf512e0c3]. [merge with default: a19f47d380d2]
If
Serhiy Storchaka added the comment:
> Serhiy, would you be OK with me also including this patch in the bz2file
> package?
Yes, of course. We can even speed up the reading of small chunks by 1.5 times if
we inline _check_can_read() and _read_block().
The same approach is applicable to LZMAFile.
Nadeem Vawda added the comment:
> Yes, of course.
Awesome. I plan to do a new release for this in the next couple of days.
> We can even speed up the reading of small chunks by 1.5 times if we inline
> _check_can_read() and _read_block().
Interesting idea, but I don't think it's worthwhile. It
Serhiy Storchaka added the comment:
> Also, I'm reluctant to have two copies of the code for _read_block(); it
> makes the code harder to read, and increases the chance of introducing a bug
> when changing the code.
Recursively inlining _check_can_read() will be enough. Currently this check calls 4
Nadeem Vawda added the comment:
> Recursively inlining _check_can_read() will be enough. Currently this check calls 4
> Python functions (_check_can_read(), readable(), _check_non_closed(),
> closed). Inlining only readable() into _check_can_read() gives a
> significant but smaller (about 30%)
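To make the call chain above concrete, here is a minimal sketch. The classes LayeredReader and InlinedReader and the helper outcome() are hypothetical stand-ins, not the real BZ2File code; only the four method names are taken from the thread. The first class spends four Python-level calls on every check, while the second performs the same tests in a single call.

```python
class LayeredReader:
    """Hypothetical stand-in for the layered checks named in the thread."""

    def __init__(self, mode="r"):
        self._fp = object()   # any non-None value means "open"
        self._mode = mode

    @property
    def closed(self):
        return self._fp is None

    def _check_non_closed(self):
        if self.closed:
            raise ValueError("I/O operation on closed file")

    def readable(self):
        self._check_non_closed()
        return self._mode == "r"

    def _check_can_read(self):
        # Four Python-level calls: this method, readable(),
        # _check_non_closed(), and the closed property.
        if not self.readable():
            raise OSError("File not open for reading")


class InlinedReader(LayeredReader):
    """Same behaviour, with the nested checks written out in place."""

    def _check_can_read(self):
        # One Python-level call instead of four.
        if self._fp is None:
            raise ValueError("I/O operation on closed file")
        if self._mode != "r":
            raise OSError("File not open for reading")


def outcome(reader):
    """Return the exception type raised by the check, or None if it passes."""
    try:
        reader._check_can_read()
    except (ValueError, OSError) as exc:
        return type(exc)
    return None
```

Timing both variants with timeit would show how much of the per-call cost is pure call overhead on a given interpreter. The 1.5 times and 30% figures in the thread refer to the real patch, not to this sketch.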
Serhiy Storchaka added the comment:
In fact I have tried other code, a bit faster and more maintainable (see patch).
--
Added file: http://bugs.python.org/file27368/bz2_bikeshedding.patch
Roundup Robot added the comment:
New changeset 1a08f4887cff by Nadeem Vawda in branch '3.3':
Issue #16034: Fix performance regressions in the new BZ2File implementation.
http://hg.python.org/cpython/rev/1a08f4887cff
New changeset cf50a352fe22 by Nadeem Vawda in branch 'default':
Merge #16034:
Nadeem Vawda added the comment:
Thanks for the bug report, Victor, and thank you Serhiy for the patch!
Serhiy, would you be OK with me also including this patch in the bz2file
package?
--
resolution: -> fixed
stage: -> committed/rejected
status: open -> closed
versions: +Python 3.4
Serhiy Storchaka added the comment:
Here is a patch and benchmark script. This required more time than I thought.
Benchmark results:
Unpatched:
5.3 read(1)
0.5 read(10)
0.049 read(100)
0.013 read(1000)
0.009 read(10000)
0.0085 read(100000)
0.0082 read()
5 read1(1)
0.47
Changes by Serhiy Storchaka storch...@gmail.com:
Removed file: http://bugs.python.org/file27310/bz2_faster_read.patch
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue16034
___
Changes by Serhiy Storchaka storch...@gmail.com:
Added file: http://bugs.python.org/file27310/bz2_faster_read.patch
Changes by Serhiy Storchaka storch...@gmail.com:
Removed file: http://bugs.python.org/file27310/bz2_faster_read.patch
Serhiy Storchaka added the comment:
Patch updated. Fixed one error. Now readline() is optimized too.
Benchmark results (reading python.bz2):
Py2.7   Py3.2   Py3.3    Py3.3
                vanilla  patched
4.8     4.8     -        31       read(1)
1       0.94    3.4e+02  3.6      read(10)
0.61    0.6
Serhiy Storchaka added the comment:
It looks as if bz2 in Python 3.3 has bad buffering. Reading in larger chunks shows
the same speed in 2.7 and 3.3.
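A rough way to reproduce this kind of comparison is to time read(n) for several chunk sizes. The sketch below is not the benchmark script attached to the issue; the function name time_chunked_read and the in-memory payload are assumptions made for illustration. It assumes Python 3.3 or later, where BZ2File accepts a file object.

```python
import bz2
import io
import time


def time_chunked_read(compressed, chunk_size):
    """Read a bz2 stream in fixed-size chunks; return (bytes_read, seconds)."""
    with bz2.BZ2File(io.BytesIO(compressed)) as f:
        start = time.perf_counter()
        total = 0
        while True:
            chunk = f.read(chunk_size)
            if not chunk:          # empty bytes object signals end of stream
                break
            total += len(chunk)
        elapsed = time.perf_counter() - start
    return total, elapsed


# Illustrative payload only; a real test would use an actual log file.
payload = b"some log line for the benchmark\n" * 2000
compressed = bz2.compress(payload)

for size in (1, 10, 100, 1000):
    total, elapsed = time_chunked_read(compressed, size)
    print("%.4g read(%d)" % (elapsed, size))
```

Running the same script under each interpreter version gives per-chunk-size timings that can be set side by side, as in the tables elsewhere in this thread.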
--
components: +Library (Lib) -None
nosy: +storchaka
Serhiy Storchaka added the comment:
Well, I was able to restore performance (using the same code as in zipfile). The
patch will follow later.
Victor Hooi added the comment:
Hi,
I didn't have any buffering size set before, so I believe it defaults to 0 (no
buffering), right? Wouldn't this be the behaviour on both 2.x and 3.x?
I'm using a 1.5 MB bzip2 file - I just tried setting buffering to 1000 and
100, and it didn't seem to
Victor Hooi added the comment:
Hi,
Aha, whoops, sorry Serhiy, didn't see your second message - I think you and I
posted our last messages at nearly the same time...
Cool, looking forward to your patch =).
Also, is there any chance you could provide a more detailed explanation of
what's
Serhiy Storchaka added the comment:
> Cool, looking forward to your patch =).
It will take some time to make a complete patch. I don't have much time
*right* now. Wait for a few hours.
> Also, is there any chance you could provide a more detailed explanation of
> what's going on? This is just
Changes by Jesús Cea Avión j...@jcea.es:
--
nosy: +jcea
Python-bugs-list mailing list
New submission from Victor Hooi:
Hi,
I was writing a script to parse BZ2 logfiles under Python 2.6, and I noticed
that bz2file (http://pypi.python.org/pypi/bz2file) seemed to perform much
slower than the native bz2 module:
Changes by Ned Deily n...@acm.org:
--
nosy: +nadeem.vawda