Gregory P. Smith g...@krypto.org added the comment:
Committed with a couple refactorings in trunk r72267. I also added a
test (basically checking for corruption that would occur if the locks
weren't working).
(I'll sort out any py3k vs trunk differences to make future change
merges easier).
Lukas Lueg knabberknusperh...@yahoo.de added the comment:
bump
hashlibopenssl_gil_py27.diff has not yet been applied to py27 and does
not apply cleanly any more. Here is an updated version.
--
status: pending - open
Added file:
Changes by Lukas Lueg knabberknusperh...@yahoo.de:
Removed file: http://bugs.python.org/file13057/hashlibopenssl_gil_py27.diff
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue4751
___
STINNER Victor victor.stin...@haypocalc.com added the comment:
@ebfe: Did you wrote the patch (for python 2.7)? Are you still
interrested to write the patch?
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue4751
Lukas Lueg knabberknusperh...@yahoo.de added the comment:
yes, I got lost on that one. I'll create a patch for 2.7 tonight.
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue4751
___
Lukas Lueg knabberknusperh...@yahoo.de added the comment:
Patch for 2.7
Added file: http://bugs.python.org/file13057/hashlibopenssl_gil_py27.diff
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue4751
STINNER Victor victor.stin...@haypocalc.com added the comment:
@ebfe: Your patch is very close to r68411 (patch for py3k), and so it
looks correct (I didn't test it).
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue4751
Changes by Collin Winter coll...@gmail.com:
--
nosy: +collinwinter, jyasskin
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue4751
___
___
Gregory P. Smith g...@krypto.org added the comment:
assigning to me so i don't lose track of making sure this happens for
trunk.
--
assignee: - gregory.p.smith
components: +Extension Modules -Library (Lib)
___
Python tracker rep...@bugs.python.org
Antoine Pitrou pit...@free.fr added the comment:
There is still a potential problem.
Figure the following:
- thread A executes ENTER_HASHLIB while lock is NULL; therefore, thread
A has released the GIL and doesn't hold any lock
- thread B enters EVP_update with a large buffer (it can be there,
Antoine Pitrou pit...@free.fr added the comment:
Oops, nevermind what I said. The GIL isn't released if obj-lock isn't
there.
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue4751
___
Antoine Pitrou pit...@free.fr added the comment:
Haypo's last patch is ok. If you want it to be in 2.7 too, however,
you'll have to provide another patch (I won't do it myself).
--
resolution: - accepted
versions: -Python 2.7
___
Python tracker
Antoine Pitrou pit...@free.fr added the comment:
Committed to py3k in r68411. Please tell me if you intend to do a patch
for 2.7. Otherwise, you/I can close the issue.
--
stage: - committed/rejected
status: open - pending
___
Python tracker
Lukas Lueg knabberknusperh...@yahoo.de added the comment:
I'll do a patch for 2.7
--
versions: +Python 2.7 -Python 3.1
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue4751
___
Antoine Pitrou pit...@free.fr added the comment:
The patch looks fine to me, apart from one point: the return value of
PyThread_allocate_lock() should be checked for NULL, and the error
either propagated or cleared.
(I'd also suggest lowering HASHLIB_GIL_MINSIZE to 2048 or 4196)
Gregory,
Gregory P. Smith g...@krypto.org added the comment:
hashlibopenssl_small_lock-4.diff looks good to me.
I also agree that HASHLIB_GIL_MINSIZE should be lowered to 2048.
Commit it, and please backport it to trunk before closing this bug.
--
nosy: -gps
versions: +Python 2.7, Python 3.1
Changes by Lukas Lueg knabberknusperh...@yahoo.de:
Removed file: http://bugs.python.org/file12587/hashlibopenssl_small_lock-4.diff
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue4751
___
Lukas Lueg knabberknusperh...@yahoo.de added the comment:
PyThread_allocate_lock can fail without interference. object-lock will
stay NULL and the GIL is simply not released.
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue4751
Changes by Lukas Lueg knabberknusperh...@yahoo.de:
Removed file: http://bugs.python.org/file12533/hashopenssl_threads-4.diff
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue4751
___
Lukas Lueg knabberknusperh...@yahoo.de added the comment:
I've modified haypo's patch as commented. The object's lock should be
free 99.9% of the time so we try non-blocking first and can thereby skip
releasing and re-locking the gil (to avoid a deadlock).
Added file:
Changes by STINNER Victor victor.stin...@haypocalc.com:
Removed file: http://bugs.python.org/file12542/hashlibopenssl_small_lock-2.patch
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue4751
___
Changes by STINNER Victor victor.stin...@haypocalc.com:
Removed file: http://bugs.python.org/file12554/hashlibopenssl_small_lock-3.patch
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue4751
___
Changes by ebfe knabberknusperh...@yahoo.de:
Removed file: http://bugs.python.org/file12557/md5module_small_locks.diff
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue4751
___
ebfe knabberknusperh...@yahoo.de added the comment:
Haypo, we can probably reduce overhead by defining ENTER_HASHLIB like this:
#define ENTER_HASHLIB(obj) \
if ((obj)-lock) { \
if (!PyThread_acquire_lock((obj)-lock, 0)) { \
Py_BEGIN_ALLOW_THREADS \
Antoine Pitrou pit...@free.fr added the comment:
I'm not sure about the approach of dynamically allocating self-lock.
Imagine you allocate this lock while another thread is between
ENTER_HASHLIB and LEAVE_HASHLIB. What happens on LEAVE_HASHLIB? The
thread tries to release a lock it hadn't
Lukas Lueg knabberknusperh...@yahoo.de added the comment:
The lock is created while having the GIL in EVP_update. No other
function releases the GIL (besides the creator-function which does not
need the local lock).
Thereby no other thread can be in between ENTER and LEAVE while the lock
is
ebfe knabberknusperh...@yahoo.de added the comment:
Releasing the GIL is somewhat expensive and should be avoided if
possible. I've moved LEAVE_HASHLIB in EVP_update so the object gets
unlocked before we call Py_END_ALLOW_THREADS. This is *only* possible
because EVP_update does not use the
ebfe knabberknusperh...@yahoo.de added the comment:
test-script
Added file: http://bugs.python.org/file12534/hashlibtest2.py
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue4751
___
Changes by ebfe knabberknusperh...@yahoo.de:
Removed file: http://bugs.python.org/file12461/hashopenssl_threads-3.diff
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue4751
___
ebfe knabberknusperh...@yahoo.de added the comment:
gnarf, actually it should be 'threads.append(Hasher(md))' in the script :-\
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue4751
___
STINNER Victor victor.stin...@haypocalc.com added the comment:
Releasing the GIL is somewhat expensive and should be avoided
if possible.
Another possible solution is to create a lockless object by default,
and create a lock if the data size is bigger than N (eg. 8 KB). When
the lock is
ebfe knabberknusperh...@yahoo.de added the comment:
I don't think this is actually worth the trouble. You run into situation
where one thread might decide that it needs a lock now with other
threads being in the to-be-locked-area at that time.
___
Python
STINNER Victor victor.stin...@haypocalc.com added the comment:
New implementation of finer lock grain in _hashlibopenssl: only create
the lock at the first update with more than 8 KB bytes. Object
creation/deallocation is faster if we hash less than 8 KB.
Changes between
Changes by STINNER Victor victor.stin...@haypocalc.com:
Removed file: http://bugs.python.org/file12459/hashopenssl_threads-2.diff
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue4751
___
STINNER Victor victor.stin...@haypocalc.com added the comment:
Update small lock patch: replace all tabs by spaces! I forget a change
between Python trunk and my patch: there is also the error message for
Unicode object. Before:
import hashlib; hashlib.md5(abc)
TypeError: object
Changes by STINNER Victor victor.stin...@haypocalc.com:
Removed file: http://bugs.python.org/file12541/hashlibopenssl_small_lock.patch
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue4751
___
Gregory P. Smith g...@krypto.org added the comment:
First: thanks for doing this. I've had a patch sitting in my own
sandbox to release the GIL while hashing for a while but I hadn't
finished testing it. It looks pretty similar to what you've done so
lets go with the patch being developed in
ebfe knabberknusperh...@yahoo.de added the comment:
I don't think so.
The interface should stay simple - python has very few such magic knobs.
People will optimize for their own box as you said - and that code will
run worse on all the others...
Besides, we've lived so long with
ebfe knabberknusperh...@yahoo.de added the comment:
haypo, the patch will not compile when WITH_THREADS is not defined. The
'lock'-member in the object structure is not present without
WITH_THREADS however the line 'if (self-lock == NULL view.len =
HASHLIB_GIL_MINSIZE)' will always refer to it.
STINNER Victor victor.stin...@haypocalc.com added the comment:
About HASHLIB_GIL_MINSIZE, I'm unable to mesure the overhead. I tried
timeit with 8190 and 8200 but the results are *very* close. I'm
running Linux, it's maybe different on other OS.
___
Python
ebfe knabberknusperh...@yahoo.de added the comment:
Here is another patch, this time for the fallback-md5-module. I know
that situations are rare where openssl is not present but threading is.
However they might occur out there and the md5module needed some love
anyway:
- The MD5 class from the
STINNER Victor victor.stin...@haypocalc.com added the comment:
ebfe Here is another patch, this time for the fallback-md5-module
Please open a separated issue for each module, this issue is already
too long and complex ;-) And it would be easier to fix other modules
when patches for hashlib
Changes by STINNER Victor victor.stin...@haypocalc.com:
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue4751
___
___
Python-bugs-list mailing list
STINNER Victor victor.stin...@haypocalc.com added the comment:
Ooooh, I suggested to ebfe to remove the GIL unlock/lock, but I was
wrong :-( I hate locks! What is the right fix? Replace
ENTER_HASHLIB(self)
Py_BEGIN_ALLOW_THREADS
...
Py_END_ALLOW_THREADS
LEAVE_HASHLIB(self)
by
Antoine Pitrou pit...@free.fr added the comment:
The right fix would probably be to define ENTER_HASHLIB(self) as
Py_BEGIN_ALLOW_THREADS
PyThread_acquire_lock(self-lock)
Py_END_ALLOW_THREADS
___
Python tracker rep...@bugs.python.org
Antoine Pitrou pit...@free.fr added the comment:
Based on quick testing on my computer, we could probably put the limit
as low as 1KB. But it may be that locks are cheap under Linux.
In any case, the patch looks good, but I'm no OpenSSL expert.
___
Python
Antoine Pitrou pit...@free.fr added the comment:
Actually, your code can deadlock since ENTER_HASHLIB doesn't release the
GIL. Think about it:
// Thread A is here, holding the GIL and waiting for self-lock to be
// released by thread B
ENTER_HASHLIB(self)
Py_BEGIN_ALLOW_THREADS
// Thread B is
New submission from ebfe knabberknusperh...@yahoo.de:
The hashlib functions provided by _hashopenssl.c hold the GIL all the
time although the underlying openssl-library is basically thread-safe.
I've attached a patch (svn diff) which basically does four things:
* If python is compiled with
STINNER Victor victor.stin...@haypocalc.com added the comment:
I think that you don't use Py_BEGIN_ALLOW_THREADS /
Py_END_ALLOW_THREADS correctly: the GIL can be released when the
hashlib lock is acquired (to run hash functions in parallel threads).
So the macros should be:
#define
Antoine Pitrou pit...@free.fr added the comment:
Hi,
Very good idea. However, you don't need to discriminate for the bytes
type specifically. When a buffer is taken on the object (with
PyObject_GetBuffer()), the object is internally locked until the
buffer is release with PyBuffer_Release().
STINNER Victor victor.stin...@haypocalc.com added the comment:
EVP_copy() and EVP_get_digest_size() should call
ENTER_HASHLIB/LEAVE_HASHLIB to protect self-ctx.
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue4751
STINNER Victor victor.stin...@haypocalc.com added the comment:
If view.len is negative, EVP_hash() may read invalid memory :-/ Be
careful of integer overflow in this block:
Py_ssize_t offset = 0, sublen = len;
while (sublen) {
unsigned int process = sublen MUNCH_SIZE ? MUNCH_SIZE
STINNER Victor victor.stin...@haypocalc.com added the comment:
New version of ebfe's patch:
- ENTER/LEAVE_HASHLIB:
* don't touch GIL in ENTER_HASHLIB (it's useless)
* add mandatory argument (explicit use of self)
- EVP_hash():
* restore Py_SAFE_DOWNCAST
* simplify the code: always
Changes by ebfe knabberknusperh...@yahoo.de:
Removed file: http://bugs.python.org/file12453/hashopenssl_threads.diff
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue4751
___
ebfe knabberknusperh...@yahoo.de added the comment:
Thanks for the advices.
Antoine, maybe you could clarify the situation regarding buffer-locks
for me. In older versions of PEP 3118 the PyBUF_LOCK flag was still
present but it doesn't seem to have made it's way into the final draft.
Is it
Antoine Pitrou pit...@free.fr added the comment:
Is it save to assume that a buffer-view will not change until release()
is called - for all types supporting the buffer protocol in py3k ??
Yes, it is!
___
Python tracker rep...@bugs.python.org
STINNER Victor victor.stin...@haypocalc.com added the comment:
I've taken on haypo's patch to release the GIL only
when the buffer is larger than 10kb
You can factorize the code by moving Py_BEGIN_ALLOW_THREADS /
Py_END_ALLOW_THREADS *into* EVP_hash ;-)
10 KB is a random value or the fast
STINNER Victor victor.stin...@haypocalc.com added the comment:
New version of my md5sum.py program limited to 10 threads. New
benchmark with 160 files (size in 7..10 MB):
- Python unpatched: best=4.8 sec
- C version (/usr/bin/md5sum): best=3.6 sec
- Python patched: best=2.1 sec
As everybody
Changes by STINNER Victor victor.stin...@haypocalc.com:
Removed file: http://bugs.python.org/file12462/md5sum.py
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue4751
___
ebfe knabberknusperh...@yahoo.de added the comment:
Here is another simple benchmarker. For me it shows almost perfect
scaling (2 cores = 196% performance) if the buffer put into .update() is
large enough.
I deliberately did not move Py_BEGIN_ALLOW_THREADS into EVP_hash as we
might call this
STINNER Victor victor.stin...@haypocalc.com added the comment:
hashlibtest.py results on my Quad Core with 4 threads:
- unpatched: best=13.0 sec
- patched: best=3.25 sec
Some maths: 13.0 / 4 = 3.25 \o/
___
Python tracker rep...@bugs.python.org
61 matches
Mail list logo