[issue42160] unnecessary overhead in tempfile

2020-10-31 Thread Eric Wolf
Eric Wolf added the comment:

>>> timeit(os.getpid)
0.0899073329931

Considering the reference leaks, os.getpid() seems to be the better solution.
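For context, a rough sketch of the trade-off behind this measurement (illustrative, not from the original message; numbers vary by machine):

    # Checking the PID on every access is cheap compared to re-creating
    # the generator, which is the trade-off being weighed here.
    import os
    from random import Random
    from timeit import timeit

    print(timeit(os.getpid))   # one cheap call per iteration, ~0.09s per million above
    print(timeit(Random))      # seeding a fresh generator each time costs far more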

[issue42160] unnecessary overhead in tempfile

2020-10-31 Thread Eric Wolf
Eric Wolf added the comment:

It would be possible to allow the GC to finalize the Random instances through weak references. On the other hand, if many _RandomNameSequence instances were used temporarily, a lot of callbacks would be registered via os.register_at_fork(). Could that cause problems?
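A minimal sketch of the weak-reference idea being discussed (illustrative, not the actual patch; assumes os.register_at_fork(), Python 3.7+):

    # Re-seed after fork via a callback that holds only a weak reference,
    # so the GC can still collect the _RandomNameSequence instance.
    import os
    import weakref
    from random import Random

    class _RandomNameSequence:
        def __init__(self):
            self._rng = Random()
            if hasattr(os, 'register_at_fork'):
                ref = weakref.ref(self)

                def _reseed():
                    obj = ref()
                    if obj is not None:
                        obj._rng = Random()

                # Callbacks can never be unregistered, so every temporary
                # instance leaves one behind; this is the concern raised above.
                os.register_at_fork(after_in_child=_reseed)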

[issue42160] unnecessary overhead in tempfile

2020-10-30 Thread Eric Wolf
Eric Wolf added the comment:

Thanks

[issue42160] unnecessary overhead in tempfile

2020-10-26 Thread Eric Wolf
Eric Wolf added the comment:

It seems to be insignificant; however, it would allow for easier monkey-patching: https://bugs.python.org/issue32276

Instead of changing _Random, one could simply assign a new instance to _named_sequence.
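As an illustration of that easier monkey-patching (hedged: _named_sequence is the module-level name proposed in the patch, not an attribute of released tempfile, which creates the sequence lazily inside _get_candidate_names()):

    # Hypothetical sketch: replace the shared name sequence with one backed
    # by SystemRandom.
    import tempfile
    from random import SystemRandom

    class _SystemRandomNameSequence(tempfile._RandomNameSequence):
        # Shadow the rng property: SystemRandom reads os.urandom(), so it
        # never needs re-seeding after fork().
        rng = SystemRandom()

    tempfile._named_sequence = _SystemRandomNameSequence()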

[issue42160] unnecessary overhead in tempfile

2020-10-26 Thread Eric Wolf
Eric Wolf added the comment:

SystemRandom seems to be slower:

from random import Random, SystemRandom
from timeit import timeit

user = Random()
system = SystemRandom()
characters = "abcdefghijklmnopqrstuvwxyz0123456789_"

timeit(lambda: user.choice(characters))
timeit(lambda: system.choice(characters))

[issue42160] unnecessary overhead in tempfile

2020-10-26 Thread Eric Wolf
Change by Eric Wolf:

keywords: +patch
pull_requests: +21911
stage: -> patch review
pull_request: https://github.com/python/cpython/pull/22997

[issue42160] unnecessary overhead in tempfile

2020-10-26 Thread Eric Wolf
New submission from Eric Wolf:

The tempfile module contains the class _RandomNameSequence, which has the rng property. This property checks os.getpid() every time and re-initializes the random number generator when it has changed. However, this is only necessary on systems which allow forking.
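For reference, the pattern being described looks roughly like this (a paraphrase, not a verbatim copy of Lib/tempfile.py):

    # The generator is re-created whenever the PID changes, so a forked
    # child does not reuse its parent's random state.
    import os
    from random import Random

    class _RandomNameSequence:
        @property
        def rng(self):
            cur_pid = os.getpid()
            if cur_pid != getattr(self, '_rng_pid', None):
                self._rng = Random()      # fresh state for this process
                self._rng_pid = cur_pid
            return self._rng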

[issue42160] unnecessary overhead in tempfile

2020-10-26 Thread Eric Wolf
Change by Eric Wolf:

components: Library (Lib)
nosy: Deric-W
priority: normal
severity: normal
status: open
title: unnecessary overhead in tempfile
type: enhancement
versions: Python 3.6, Python 3.7, Python 3.8, Python 3.9

[issue10900] bz2 module fails to uncompress large files

2011-03-01 Thread Eric Wolf
Eric Wolf ebw...@gmail.com added the comment:

I tried the change you suggested. It still fails, but now at 572,320 bytes instead of 900,000. I'm not sure why there's a difference in bytes read; I'll explore this more in a bit. I also converted the BZ2 to GZ and used the gzip module. It's failing

[issue10900] bz2 module fails to uncompress large files

2011-03-01 Thread Eric Wolf
Eric Wolf ebw...@gmail.com added the comment:

Stupid questions are always worth asking. I did check the MD5 sum earlier and just checked it again (since I copied the file from one machine to another):

ebwolf@ubuntu:/opt$ md5sum /host/full-planet-110115-1800.osm.bz2

[issue10900] bz2 module fails to uncompress large files

2011-03-01 Thread Eric Wolf
Eric Wolf ebw...@gmail.com added the comment:

The only problem with the theory that the file is corrupt is that at least three people have encountered exactly the same problem with three files:

http://mail.python.org/pipermail/tutor/2010-June/076343.html

Colin was using an OSM planet file

[issue10900] bz2 module fails to uncompress large files

2011-03-01 Thread Eric Wolf
Eric Wolf ebw...@gmail.com added the comment:

I just got confirmation that OSM is using pbzip2 to generate these files, so they are multi-stream. At least that gives a final answer, but it doesn't solve my problem. I saw this:

http://bugs.python.org/issue1625

Does anyone know the current status of that issue?
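A hedged workaround sketch (not from the thread; assumes a Python with BZ2Decompressor.eof, i.e. 3.3+, where bz2.open() also handles multi-stream files natively): loop over the streams manually, restarting the decompressor on the leftover unused_data.

    # Decompress a multi-stream .bz2 file, such as pbzip2 output, by
    # starting a fresh BZ2Decompressor for each stream.
    import bz2

    def iter_multistream_bz2(path, chunk_size=1 << 20):
        with open(path, 'rb') as f:
            decomp = bz2.BZ2Decompressor()
            for chunk in iter(lambda: f.read(chunk_size), b''):
                while chunk:
                    yield decomp.decompress(chunk)
                    if not decomp.eof:
                        break          # current stream needs more input
                    # One stream ended; restart on whatever bytes followed it.
                    chunk = decomp.unused_data
                    decomp = bz2.BZ2Decompressor()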

[issue10900] bz2 module fails to uncompress large files

2011-02-28 Thread Eric Wolf
Eric Wolf ebw...@gmail.com added the comment:

I'm experiencing the same thing. My script works perfectly on a 165MB file but fails after reading 900,000 bytes of a 22GB file. My script uses a buffered bz2file.read and is agnostic about end-of-lines. Opening with 'rb' does not help.
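For context, a minimal reconstruction of the kind of read loop being described (illustrative, not the original script; the buffer size is an assumption):

    # BZ2File.read() on older Pythons returns data from the first bz2
    # stream only, so on a multi-stream archive the loop ends long before
    # the end of the file.
    import bz2

    bz2file = bz2.BZ2File('full-planet-110115-1800.osm.bz2', 'rb')
    total = 0
    while True:
        buf = bz2file.read(65536)   # buffered, line-ending agnostic
        if not buf:                 # stops at the end of the *first* stream
            break
        total += len(buf)
    print(total)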