[issue4751] Patch for better thread support in hashlib

2009-05-03 Thread Gregory P. Smith
Gregory P. Smith g...@krypto.org added the comment: Committed with a couple refactorings in trunk r72267. I also added a test (basically checking for corruption that would occur if the locks weren't working). (I'll sort out any py3k vs trunk differences to make future change merges easier).

[issue4751] Patch for better thread support in hashlib

2009-04-07 Thread Lukas Lueg
Lukas Lueg knabberknusperh...@yahoo.de added the comment: bump hashlibopenssl_gil_py27.diff has not yet been applied to py27 and does not apply cleanly any more. Here is an updated version. -- status: pending - open Added file:

[issue4751] Patch for better thread support in hashlib

2009-04-07 Thread Lukas Lueg
Changes by Lukas Lueg knabberknusperh...@yahoo.de: Removed file: http://bugs.python.org/file13057/hashlibopenssl_gil_py27.diff ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue4751 ___

[issue4751] Patch for better thread support in hashlib

2009-02-12 Thread STINNER Victor
STINNER Victor victor.stin...@haypocalc.com added the comment: @ebfe: Did you wrote the patch (for python 2.7)? Are you still interrested to write the patch? ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue4751

[issue4751] Patch for better thread support in hashlib

2009-02-12 Thread Lukas Lueg
Lukas Lueg knabberknusperh...@yahoo.de added the comment: yes, I got lost on that one. I'll create a patch for 2.7 tonight. ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue4751 ___

[issue4751] Patch for better thread support in hashlib

2009-02-12 Thread Lukas Lueg
Lukas Lueg knabberknusperh...@yahoo.de added the comment: Patch for 2.7 Added file: http://bugs.python.org/file13057/hashlibopenssl_gil_py27.diff ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue4751

[issue4751] Patch for better thread support in hashlib

2009-02-12 Thread STINNER Victor
STINNER Victor victor.stin...@haypocalc.com added the comment: @ebfe: Your patch is very close to r68411 (patch for py3k), and so it looks correct (I didn't test it). ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue4751

[issue4751] Patch for better thread support in hashlib

2009-02-12 Thread Collin Winter
Changes by Collin Winter coll...@gmail.com: -- nosy: +collinwinter, jyasskin ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue4751 ___ ___

[issue4751] Patch for better thread support in hashlib

2009-02-11 Thread Gregory P. Smith
Gregory P. Smith g...@krypto.org added the comment: assigning to me so i don't lose track of making sure this happens for trunk. -- assignee: - gregory.p.smith components: +Extension Modules -Library (Lib) ___ Python tracker rep...@bugs.python.org

[issue4751] Patch for better thread support in hashlib

2009-01-08 Thread Antoine Pitrou
Antoine Pitrou pit...@free.fr added the comment: There is still a potential problem. Figure the following: - thread A executes ENTER_HASHLIB while lock is NULL; therefore, thread A has released the GIL and doesn't hold any lock - thread B enters EVP_update with a large buffer (it can be there,

[issue4751] Patch for better thread support in hashlib

2009-01-08 Thread Antoine Pitrou
Antoine Pitrou pit...@free.fr added the comment: Oops, nevermind what I said. The GIL isn't released if obj-lock isn't there. ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue4751 ___

[issue4751] Patch for better thread support in hashlib

2009-01-08 Thread Antoine Pitrou
Antoine Pitrou pit...@free.fr added the comment: Haypo's last patch is ok. If you want it to be in 2.7 too, however, you'll have to provide another patch (I won't do it myself). -- resolution: - accepted versions: -Python 2.7 ___ Python tracker

[issue4751] Patch for better thread support in hashlib

2009-01-08 Thread Antoine Pitrou
Antoine Pitrou pit...@free.fr added the comment: Committed to py3k in r68411. Please tell me if you intend to do a patch for 2.7. Otherwise, you/I can close the issue. -- stage: - committed/rejected status: open - pending ___ Python tracker

[issue4751] Patch for better thread support in hashlib

2009-01-08 Thread Lukas Lueg
Lukas Lueg knabberknusperh...@yahoo.de added the comment: I'll do a patch for 2.7 -- versions: +Python 2.7 -Python 3.1 ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue4751 ___

[issue4751] Patch for better thread support in hashlib

2009-01-06 Thread Antoine Pitrou
Antoine Pitrou pit...@free.fr added the comment: The patch looks fine to me, apart from one point: the return value of PyThread_allocate_lock() should be checked for NULL, and the error either propagated or cleared. (I'd also suggest lowering HASHLIB_GIL_MINSIZE to 2048 or 4196) Gregory,

[issue4751] Patch for better thread support in hashlib

2009-01-06 Thread Gregory P. Smith
Gregory P. Smith g...@krypto.org added the comment: hashlibopenssl_small_lock-4.diff looks good to me. I also agree that HASHLIB_GIL_MINSIZE should be lowered to 2048. Commit it, and please backport it to trunk before closing this bug. -- nosy: -gps versions: +Python 2.7, Python 3.1

[issue4751] Patch for better thread support in hashlib

2009-01-06 Thread Lukas Lueg
Changes by Lukas Lueg knabberknusperh...@yahoo.de: Removed file: http://bugs.python.org/file12587/hashlibopenssl_small_lock-4.diff ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue4751 ___

[issue4751] Patch for better thread support in hashlib

2009-01-06 Thread Lukas Lueg
Lukas Lueg knabberknusperh...@yahoo.de added the comment: PyThread_allocate_lock can fail without interference. object-lock will stay NULL and the GIL is simply not released. ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue4751

[issue4751] Patch for better thread support in hashlib

2009-01-04 Thread Lukas Lueg
Changes by Lukas Lueg knabberknusperh...@yahoo.de: Removed file: http://bugs.python.org/file12533/hashopenssl_threads-4.diff ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue4751 ___

[issue4751] Patch for better thread support in hashlib

2009-01-04 Thread Lukas Lueg
Lukas Lueg knabberknusperh...@yahoo.de added the comment: I've modified haypo's patch as commented. The object's lock should be free 99.9% of the time so we try non-blocking first and can thereby skip releasing and re-locking the gil (to avoid a deadlock). Added file:

[issue4751] Patch for better thread support in hashlib

2009-01-04 Thread STINNER Victor
Changes by STINNER Victor victor.stin...@haypocalc.com: Removed file: http://bugs.python.org/file12542/hashlibopenssl_small_lock-2.patch ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue4751 ___

[issue4751] Patch for better thread support in hashlib

2009-01-04 Thread STINNER Victor
Changes by STINNER Victor victor.stin...@haypocalc.com: Removed file: http://bugs.python.org/file12554/hashlibopenssl_small_lock-3.patch ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue4751 ___

[issue4751] Patch for better thread support in hashlib

2009-01-03 Thread ebfe
Changes by ebfe knabberknusperh...@yahoo.de: Removed file: http://bugs.python.org/file12557/md5module_small_locks.diff ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue4751 ___

[issue4751] Patch for better thread support in hashlib

2009-01-03 Thread ebfe
ebfe knabberknusperh...@yahoo.de added the comment: Haypo, we can probably reduce overhead by defining ENTER_HASHLIB like this: #define ENTER_HASHLIB(obj) \ if ((obj)-lock) { \ if (!PyThread_acquire_lock((obj)-lock, 0)) { \ Py_BEGIN_ALLOW_THREADS \

[issue4751] Patch for better thread support in hashlib

2009-01-03 Thread Antoine Pitrou
Antoine Pitrou pit...@free.fr added the comment: I'm not sure about the approach of dynamically allocating self-lock. Imagine you allocate this lock while another thread is between ENTER_HASHLIB and LEAVE_HASHLIB. What happens on LEAVE_HASHLIB? The thread tries to release a lock it hadn't

[issue4751] Patch for better thread support in hashlib

2009-01-03 Thread Lukas Lueg
Lukas Lueg knabberknusperh...@yahoo.de added the comment: The lock is created while having the GIL in EVP_update. No other function releases the GIL (besides the creator-function which does not need the local lock). Thereby no other thread can be in between ENTER and LEAVE while the lock is

[issue4751] Patch for better thread support in hashlib

2009-01-02 Thread ebfe
ebfe knabberknusperh...@yahoo.de added the comment: Releasing the GIL is somewhat expensive and should be avoided if possible. I've moved LEAVE_HASHLIB in EVP_update so the object gets unlocked before we call Py_END_ALLOW_THREADS. This is *only* possible because EVP_update does not use the

[issue4751] Patch for better thread support in hashlib

2009-01-02 Thread ebfe
ebfe knabberknusperh...@yahoo.de added the comment: test-script Added file: http://bugs.python.org/file12534/hashlibtest2.py ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue4751 ___

[issue4751] Patch for better thread support in hashlib

2009-01-02 Thread ebfe
Changes by ebfe knabberknusperh...@yahoo.de: Removed file: http://bugs.python.org/file12461/hashopenssl_threads-3.diff ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue4751 ___

[issue4751] Patch for better thread support in hashlib

2009-01-02 Thread ebfe
ebfe knabberknusperh...@yahoo.de added the comment: gnarf, actually it should be 'threads.append(Hasher(md))' in the script :-\ ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue4751 ___

[issue4751] Patch for better thread support in hashlib

2009-01-02 Thread STINNER Victor
STINNER Victor victor.stin...@haypocalc.com added the comment: Releasing the GIL is somewhat expensive and should be avoided if possible. Another possible solution is to create a lockless object by default, and create a lock if the data size is bigger than N (eg. 8 KB). When the lock is

[issue4751] Patch for better thread support in hashlib

2009-01-02 Thread ebfe
ebfe knabberknusperh...@yahoo.de added the comment: I don't think this is actually worth the trouble. You run into situation where one thread might decide that it needs a lock now with other threads being in the to-be-locked-area at that time. ___ Python

[issue4751] Patch for better thread support in hashlib

2009-01-02 Thread STINNER Victor
STINNER Victor victor.stin...@haypocalc.com added the comment: New implementation of finer lock grain in _hashlibopenssl: only create the lock at the first update with more than 8 KB bytes. Object creation/deallocation is faster if we hash less than 8 KB. Changes between

[issue4751] Patch for better thread support in hashlib

2009-01-02 Thread STINNER Victor
Changes by STINNER Victor victor.stin...@haypocalc.com: Removed file: http://bugs.python.org/file12459/hashopenssl_threads-2.diff ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue4751 ___

[issue4751] Patch for better thread support in hashlib

2009-01-02 Thread STINNER Victor
STINNER Victor victor.stin...@haypocalc.com added the comment: Update small lock patch: replace all tabs by spaces! I forget a change between Python trunk and my patch: there is also the error message for Unicode object. Before: import hashlib; hashlib.md5(abc) TypeError: object

[issue4751] Patch for better thread support in hashlib

2009-01-02 Thread STINNER Victor
Changes by STINNER Victor victor.stin...@haypocalc.com: Removed file: http://bugs.python.org/file12541/hashlibopenssl_small_lock.patch ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue4751 ___

[issue4751] Patch for better thread support in hashlib

2009-01-02 Thread Gregory P. Smith
Gregory P. Smith g...@krypto.org added the comment: First: thanks for doing this. I've had a patch sitting in my own sandbox to release the GIL while hashing for a while but I hadn't finished testing it. It looks pretty similar to what you've done so lets go with the patch being developed in

[issue4751] Patch for better thread support in hashlib

2009-01-02 Thread ebfe
ebfe knabberknusperh...@yahoo.de added the comment: I don't think so. The interface should stay simple - python has very few such magic knobs. People will optimize for their own box as you said - and that code will run worse on all the others... Besides, we've lived so long with

[issue4751] Patch for better thread support in hashlib

2009-01-02 Thread ebfe
ebfe knabberknusperh...@yahoo.de added the comment: haypo, the patch will not compile when WITH_THREADS is not defined. The 'lock'-member in the object structure is not present without WITH_THREADS however the line 'if (self-lock == NULL view.len = HASHLIB_GIL_MINSIZE)' will always refer to it.

[issue4751] Patch for better thread support in hashlib

2009-01-02 Thread STINNER Victor
STINNER Victor victor.stin...@haypocalc.com added the comment: About HASHLIB_GIL_MINSIZE, I'm unable to mesure the overhead. I tried timeit with 8190 and 8200 but the results are *very* close. I'm running Linux, it's maybe different on other OS. ___ Python

[issue4751] Patch for better thread support in hashlib

2009-01-02 Thread ebfe
ebfe knabberknusperh...@yahoo.de added the comment: Here is another patch, this time for the fallback-md5-module. I know that situations are rare where openssl is not present but threading is. However they might occur out there and the md5module needed some love anyway: - The MD5 class from the

[issue4751] Patch for better thread support in hashlib

2009-01-02 Thread STINNER Victor
STINNER Victor victor.stin...@haypocalc.com added the comment: ebfe Here is another patch, this time for the fallback-md5-module Please open a separated issue for each module, this issue is already too long and complex ;-) And it would be easier to fix other modules when patches for hashlib

[issue4751] Patch for better thread support in hashlib

2009-01-01 Thread STINNER Victor
Changes by STINNER Victor victor.stin...@haypocalc.com: ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue4751 ___ ___ Python-bugs-list mailing list

[issue4751] Patch for better thread support in hashlib

2009-01-01 Thread STINNER Victor
STINNER Victor victor.stin...@haypocalc.com added the comment: Ooooh, I suggested to ebfe to remove the GIL unlock/lock, but I was wrong :-( I hate locks! What is the right fix? Replace ENTER_HASHLIB(self) Py_BEGIN_ALLOW_THREADS ... Py_END_ALLOW_THREADS LEAVE_HASHLIB(self) by

[issue4751] Patch for better thread support in hashlib

2009-01-01 Thread Antoine Pitrou
Antoine Pitrou pit...@free.fr added the comment: The right fix would probably be to define ENTER_HASHLIB(self) as Py_BEGIN_ALLOW_THREADS PyThread_acquire_lock(self-lock) Py_END_ALLOW_THREADS ___ Python tracker rep...@bugs.python.org

[issue4751] Patch for better thread support in hashlib

2008-12-31 Thread Antoine Pitrou
Antoine Pitrou pit...@free.fr added the comment: Based on quick testing on my computer, we could probably put the limit as low as 1KB. But it may be that locks are cheap under Linux. In any case, the patch looks good, but I'm no OpenSSL expert. ___ Python

[issue4751] Patch for better thread support in hashlib

2008-12-31 Thread Antoine Pitrou
Antoine Pitrou pit...@free.fr added the comment: Actually, your code can deadlock since ENTER_HASHLIB doesn't release the GIL. Think about it: // Thread A is here, holding the GIL and waiting for self-lock to be // released by thread B ENTER_HASHLIB(self) Py_BEGIN_ALLOW_THREADS // Thread B is

[issue4751] Patch for better thread support in hashlib

2008-12-26 Thread ebfe
New submission from ebfe knabberknusperh...@yahoo.de: The hashlib functions provided by _hashopenssl.c hold the GIL all the time although the underlying openssl-library is basically thread-safe. I've attached a patch (svn diff) which basically does four things: * If python is compiled with

[issue4751] Patch for better thread support in hashlib

2008-12-26 Thread STINNER Victor
STINNER Victor victor.stin...@haypocalc.com added the comment: I think that you don't use Py_BEGIN_ALLOW_THREADS / Py_END_ALLOW_THREADS correctly: the GIL can be released when the hashlib lock is acquired (to run hash functions in parallel threads). So the macros should be: #define

[issue4751] Patch for better thread support in hashlib

2008-12-26 Thread Antoine Pitrou
Antoine Pitrou pit...@free.fr added the comment: Hi, Very good idea. However, you don't need to discriminate for the bytes type specifically. When a buffer is taken on the object (with PyObject_GetBuffer()), the object is internally locked until the buffer is release with PyBuffer_Release().

[issue4751] Patch for better thread support in hashlib

2008-12-26 Thread STINNER Victor
STINNER Victor victor.stin...@haypocalc.com added the comment: EVP_copy() and EVP_get_digest_size() should call ENTER_HASHLIB/LEAVE_HASHLIB to protect self-ctx. ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue4751

[issue4751] Patch for better thread support in hashlib

2008-12-26 Thread STINNER Victor
STINNER Victor victor.stin...@haypocalc.com added the comment: If view.len is negative, EVP_hash() may read invalid memory :-/ Be careful of integer overflow in this block: Py_ssize_t offset = 0, sublen = len; while (sublen) { unsigned int process = sublen MUNCH_SIZE ? MUNCH_SIZE

[issue4751] Patch for better thread support in hashlib

2008-12-26 Thread STINNER Victor
STINNER Victor victor.stin...@haypocalc.com added the comment: New version of ebfe's patch: - ENTER/LEAVE_HASHLIB: * don't touch GIL in ENTER_HASHLIB (it's useless) * add mandatory argument (explicit use of self) - EVP_hash(): * restore Py_SAFE_DOWNCAST * simplify the code: always

[issue4751] Patch for better thread support in hashlib

2008-12-26 Thread ebfe
Changes by ebfe knabberknusperh...@yahoo.de: Removed file: http://bugs.python.org/file12453/hashopenssl_threads.diff ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue4751 ___

[issue4751] Patch for better thread support in hashlib

2008-12-26 Thread ebfe
ebfe knabberknusperh...@yahoo.de added the comment: Thanks for the advices. Antoine, maybe you could clarify the situation regarding buffer-locks for me. In older versions of PEP 3118 the PyBUF_LOCK flag was still present but it doesn't seem to have made it's way into the final draft. Is it

[issue4751] Patch for better thread support in hashlib

2008-12-26 Thread Antoine Pitrou
Antoine Pitrou pit...@free.fr added the comment: Is it save to assume that a buffer-view will not change until release() is called - for all types supporting the buffer protocol in py3k ?? Yes, it is! ___ Python tracker rep...@bugs.python.org

[issue4751] Patch for better thread support in hashlib

2008-12-26 Thread STINNER Victor
STINNER Victor victor.stin...@haypocalc.com added the comment: I've taken on haypo's patch to release the GIL only when the buffer is larger than 10kb You can factorize the code by moving Py_BEGIN_ALLOW_THREADS / Py_END_ALLOW_THREADS *into* EVP_hash ;-) 10 KB is a random value or the fast

[issue4751] Patch for better thread support in hashlib

2008-12-26 Thread STINNER Victor
STINNER Victor victor.stin...@haypocalc.com added the comment: New version of my md5sum.py program limited to 10 threads. New benchmark with 160 files (size in 7..10 MB): - Python unpatched: best=4.8 sec - C version (/usr/bin/md5sum): best=3.6 sec - Python patched: best=2.1 sec As everybody

[issue4751] Patch for better thread support in hashlib

2008-12-26 Thread STINNER Victor
Changes by STINNER Victor victor.stin...@haypocalc.com: Removed file: http://bugs.python.org/file12462/md5sum.py ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue4751 ___

[issue4751] Patch for better thread support in hashlib

2008-12-26 Thread ebfe
ebfe knabberknusperh...@yahoo.de added the comment: Here is another simple benchmarker. For me it shows almost perfect scaling (2 cores = 196% performance) if the buffer put into .update() is large enough. I deliberately did not move Py_BEGIN_ALLOW_THREADS into EVP_hash as we might call this

[issue4751] Patch for better thread support in hashlib

2008-12-26 Thread STINNER Victor
STINNER Victor victor.stin...@haypocalc.com added the comment: hashlibtest.py results on my Quad Core with 4 threads: - unpatched: best=13.0 sec - patched: best=3.25 sec Some maths: 13.0 / 4 = 3.25 \o/ ___ Python tracker rep...@bugs.python.org