[issue43785] bz2 performance issue.

2021-04-09 Thread Ma Lin


Change by Ma Lin :


--
nosy: +malin

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue43785] bz2 performance issue.

2021-04-09 Thread Karthikeyan Singaravelan


Change by Karthikeyan Singaravelan :


--
nosy: +xtreak




[issue43785] bz2 performance issue.

2021-04-08 Thread Inada Naoki


Change by Inada Naoki :


--
keywords: +patch
pull_requests: +24031
stage:  -> patch review
pull_request: https://github.com/python/cpython/pull/25299




[issue43785] bz2 performance issue.

2021-04-08 Thread Inada Naoki


Change by Inada Naoki :


--
type:  -> performance
Added file: https://bugs.python.org/file49949/create.py




[issue43785] bz2 performance issue.

2021-04-08 Thread Inada Naoki


New submission from Inada Naoki :

The original issue was reported here:
https://discuss.python.org/t/non-optimal-bz2-reading-speed/6869

1. Only BZ2File uses RLock()

lzma and gzip don't use RLock(). The lock adds significant performance
overhead. When I removed `with self._lock:`, decompression speed improved
from about 148k lines/sec to 200k lines/sec.
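
To get a rough feel for the per-call cost of the lock, here is a minimal
sketch. It does not touch BZ2File at all: LockedCounter, PlainCounter and
seconds_per_call are hypothetical names made up for this illustration, and
the numbers it prints will vary by machine and Python version.

```python
import threading
import timeit


class LockedCounter:
    """Toy object that takes an RLock on every call (hypothetical)."""

    def __init__(self):
        self._lock = threading.RLock()
        self._count = 0

    def step(self):
        with self._lock:  # acquire and release on every single call
            self._count += 1
            return self._count


class PlainCounter:
    """Same work as LockedCounter, without any locking (hypothetical)."""

    def __init__(self):
        self._count = 0

    def step(self):
        self._count += 1
        return self._count


def seconds_per_call(obj, calls=100_000):
    """Average wall-clock seconds for one obj.step() call."""
    return timeit.timeit(obj.step, number=calls) / calls


if __name__ == "__main__":
    print("with RLock:   ", seconds_per_call(LockedCounter()))
    print("without RLock:", seconds_per_call(PlainCounter()))
```

This only isolates the acquire/release cost of an uncontended RLock; the
real effect on BZ2File depends on how much work each locked call does.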


2. The default __iter__ calls `readline()` for each iteration.

BZ2File.readline() is implemented in Python, so it is slightly slower than a
C implementation.

When I added this `__iter__()` to BZ2File, decompression speed improved from
about 148k lines/sec (or 200k lines/sec) to 500k lines/sec.

    def __iter__(self):
        self._check_can_read()
        return iter(self._buffer)

If this __iter__ method is safe, it can be added to gzip and lzma too.
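
A minimal sketch of trying this out without patching the standard library,
by putting the method on a subclass. IterBZ2File and make_payload are
hypothetical names for this illustration; `_check_can_read()` and `_buffer`
are private implementation details of BZ2File, so this assumes a CPython
version where they exist and where `_buffer` is the buffered reader used in
read mode.

```python
import bz2
import io


class IterBZ2File(bz2.BZ2File):
    """BZ2File with the proposed __iter__ (hypothetical subclass)."""

    def __iter__(self):
        # Both names below are private to BZ2File (assumption: they exist
        # in the running CPython version).
        self._check_can_read()
        # Iterating the underlying buffered reader avoids one Python-level
        # readline() call per line.
        return iter(self._buffer)


def make_payload(lines):
    """Compress a list of byte lines into one bz2 stream."""
    return bz2.compress(b"".join(lines))


if __name__ == "__main__":
    lines = [b"line %d\n" % i for i in range(5)]
    with IterBZ2File(io.BytesIO(make_payload(lines))) as f:
        for line in f:
            print(line)
```

The lines produced this way should match what the default iteration
returns; only the path used to fetch each line differs.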

--
components: Library (Lib)
files: dec.py
messages: 390588
nosy: methane
priority: normal
severity: normal
status: open
title: bz2 performance issue.
versions: Python 3.10
Added file: https://bugs.python.org/file49948/dec.py
