[issue12352] multiprocessing.Value() hangs
Charles-François Natali neolo...@free.fr added the comment:

The test passes on all the buildbots, closing.
greg, thanks for reporting this!

--
resolution:  -> fixed
stage:  -> committed/rejected
status: open -> closed

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue12352
___
___
Python-bugs-list mailing list
Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue12352] multiprocessing.Value() hangs
Charles-François Natali neolo...@free.fr added the comment:

Updated patch.

--
Added file: http://bugs.python.org/file22545/heap_gc_deadlock_lockless.diff

diff -r fcf242243d46 Lib/multiprocessing/heap.py
--- a/Lib/multiprocessing/heap.py Sun Jun 26 15:29:27 2011 +0200
+++ b/Lib/multiprocessing/heap.py Sat Jul 02 10:59:15 2011 +0200
@@ -101,6 +101,8 @@
         self._stop_to_block = {}
         self._allocated_blocks = set()
         self._arenas = []
+        # list of pending blocks to free - see free() comment below
+        self._pending_free_blocks = []

     @staticmethod
     def _roundup(n, alignment):
@@ -175,15 +177,39 @@

         return start, stop

+    def _free_pending_blocks(self):
+        # Free all the blocks in the pending list - called with the lock held.
+        while True:
+            try:
+                block = self._pending_free_blocks.pop()
+            except IndexError:
+                break
+            self._allocated_blocks.remove(block)
+            self._free(block)
+
     def free(self, block):
         # free a block returned by malloc()
+        # Since free() can be called asynchronously by the GC, it could happen
+        # that it's called while self._lock is held: in that case,
+        # self._lock.acquire() would deadlock (issue #12352). To avoid that, a
+        # trylock is used instead, and if the lock can't be acquired
+        # immediately, the block is added to a list of blocks to be freed
+        # synchronously sometimes later from malloc() or free(), by calling
+        # _free_pending_blocks() (appending and retrieving from a list is not
+        # strictly thread-safe but under cPython it's atomic thanks to the GIL).
         assert os.getpid() == self._lastpid
-        self._lock.acquire()
-        try:
-            self._allocated_blocks.remove(block)
-            self._free(block)
-        finally:
-            self._lock.release()
+        if not self._lock.acquire(0):
+            # can't acquire the lock right now, add the block to the list of
+            # pending blocks to free
+            self._pending_free_blocks.append(block)
+        else:
+            # we hold the lock
+            try:
+                self._free_pending_blocks()
+                self._allocated_blocks.remove(block)
+                self._free(block)
+            finally:
+                self._lock.release()

     def malloc(self, size):
         # return a block of right size (possibly rounded up)
@@ -191,6 +217,7 @@
         if os.getpid() != self._lastpid:
             self.__init__()                     # reinitialize after fork
         self._lock.acquire()
+        self._free_pending_blocks()
         try:
             size = self._roundup(max(size,1), self._alignment)
             (arena, start, stop) = self._malloc(size)
diff -r fcf242243d46 Lib/test/test_multiprocessing.py
--- a/Lib/test/test_multiprocessing.py Sun Jun 26 15:29:27 2011 +0200
+++ b/Lib/test/test_multiprocessing.py Sat Jul 02 10:59:15 2011 +0200
@@ -1738,6 +1738,29 @@
             self.assertTrue((arena != narena and nstart == 0) or
                             (stop == nstart))

+    def test_free_from_gc(self):
+        # Check that freeing of blocks by the garbage collector doesn't deadlock
+        # (issue #12352).
+        # Make sure the GC is enabled, and set lower collection thresholds to
+        # make collections more frequent (and increase the probability of
+        # deadlock).
+        if gc.isenabled():
+            thresholds = gc.get_threshold()
+            self.addCleanup(gc.set_threshold, *thresholds)
+        else:
+            gc.enable()
+            self.addCleanup(gc.disable)
+        gc.set_threshold(10)
+
+        # perform numerous block allocations, with cyclic references to make
+        # sure objects are collected asynchronously by the gc
+        for i in range(5000):
+            a = multiprocessing.heap.BufferWrapper(1)
+            b = multiprocessing.heap.BufferWrapper(1)
+            # circular references
+            a.buddy = b
+            b.buddy = a
+
 #
 #
 #
[issue12352] multiprocessing.Value() hangs
Changes by Charles-François Natali neolo...@free.fr:

Added file: http://bugs.python.org/file22546/heap_gc_deadlock_lockless.diff
[issue12352] multiprocessing.Value() hangs
Changes by Charles-François Natali neolo...@free.fr:

Removed file: http://bugs.python.org/file22477/heap_gc_deadlock.diff
[issue12352] multiprocessing.Value() hangs
Changes by Charles-François Natali neolo...@free.fr:

Removed file: http://bugs.python.org/file22490/heap_gc_deadlock_lockless.diff
[issue12352] multiprocessing.Value() hangs
Changes by Charles-François Natali neolo...@free.fr:

Removed file: http://bugs.python.org/file22545/heap_gc_deadlock_lockless.diff
[issue12352] multiprocessing.Value() hangs
STINNER Victor victor.stin...@haypocalc.com added the comment:

The last heap_gc_deadlock_lockless.diff looks good.

Note: please try to use different filenames for different versions of the same patch. For example, add a number (heap_gc_deadlock_lockless-2.diff) to the name.
[issue12352] multiprocessing.Value() hangs
Antoine Pitrou pit...@free.fr added the comment:

+        if gc.isenabled():
+            thresholds = gc.get_threshold()
+            self.addCleanup(gc.set_threshold, *thresholds)
+        else:
+            gc.enable()
+            self.addCleanup(gc.disable)

It seems you won't restore the thresholds if the GC wasn't enabled at first.
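One possible way to address the remark above is to register the threshold cleanup unconditionally, before looking at whether the GC is enabled. The sketch below is only an illustration under that assumption; the class name GCThresholdExample is hypothetical and this is not the code that was eventually committed.

```python
import gc
import unittest


class GCThresholdExample(unittest.TestCase):
    """Hypothetical test case showing the setup/cleanup pattern only."""

    def setUp(self):
        # Save the thresholds in every case, so they are restored whether or
        # not the GC was enabled when the test started.
        self.addCleanup(gc.set_threshold, *gc.get_threshold())
        if not gc.isenabled():
            gc.enable()
            self.addCleanup(gc.disable)
        # Lower the first threshold to make collections more frequent.
        gc.set_threshold(10)

    def test_threshold_is_lowered(self):
        self.assertEqual(gc.get_threshold()[0], 10)
```

Cleanups run in reverse order of registration, so the GC is disabled again (if it was disabled at first) before the original thresholds are put back.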
[issue12352] multiprocessing.Value() hangs
Roundup Robot devnull@devnull added the comment:

New changeset 96a0788583c6 by Charles-François Natali in branch '2.7':
Issue #12352: Fix a deadlock in multiprocessing.Heap when a block is freed by
http://hg.python.org/cpython/rev/96a0788583c6

--
nosy: +python-dev
[issue12352] multiprocessing.Value() hangs
Roundup Robot devnull@devnull added the comment:

New changeset 874143242d79 by Charles-François Natali in branch '2.7':
Issue #12352: In test_free_from_gc(), restore the GC thresholds even if the GC
http://hg.python.org/cpython/rev/874143242d79
[issue12352] multiprocessing.Value() hangs
Roundup Robot devnull@devnull added the comment:

New changeset 0d4ca1e77205 by Charles-François Natali in branch '3.1':
Issue #12352: Fix a deadlock in multiprocessing.Heap when a block is freed by
http://hg.python.org/cpython/rev/0d4ca1e77205
[issue12352] multiprocessing.Value() hangs
Roundup Robot devnull@devnull added the comment:

New changeset 37606505b227 by Charles-François Natali in branch '3.2':
Merge issue #12352: Fix a deadlock in multiprocessing.Heap when a block is
http://hg.python.org/cpython/rev/37606505b227
[issue12352] multiprocessing.Value() hangs
Roundup Robot devnull@devnull added the comment:

New changeset fd8dc3746992 by Charles-François Natali in branch 'default':
Merge issue #12352: Fix a deadlock in multiprocessing.Heap when a block is
http://hg.python.org/cpython/rev/fd8dc3746992
[issue12352] multiprocessing.Value() hangs
Charles-François Natali neolo...@free.fr added the comment:

> Nice work! I also think heap_gc_deadlock_lockless.diff is good, except for Victor's reservation: is it deliberate that you reversed the following two statements in _free_pending_blocks(), compared to the code in free()?
>
> +            self._free(block)
> +            self._allocated_blocks.remove(block)

No, it's not deliberate (it shouldn't have any impact since they're protected by the mutex though).
As for calling _free_pending_blocks() a second time, I'm not sure that's necessary, I find the code simpler and cleaner that way.
I'll provide a new patch in a couple days (no access to my development box right now).
[issue12352] multiprocessing.Value() hangs
Antoine Pitrou pit...@free.fr added the comment:

Nice work! I also think heap_gc_deadlock_lockless.diff is good, except for Victor's reservation: is it deliberate that you reversed the following two statements in _free_pending_blocks(), compared to the code in free()?

+            self._free(block)
+            self._allocated_blocks.remove(block)
[issue12352] multiprocessing.Value() hangs
STINNER Victor victor.stin...@haypocalc.com added the comment:

heap_gc_deadlock_lockless.diff: _free_pending_blocks() and free() execute the following instructions in a different order, is it a problem?

+            self._free(block)
+            self._allocated_blocks.remove(block)

vs

+                self._allocated_blocks.remove(block)
+                self._free(block)

You may call _free_pending_blocks() just after lock.acquire() and a second time before lock.release()... it is maybe overkill, but it should reduce the probability of the delayed free problem.

You may document that _pending_free_blocks.append() and _pending_free_blocks.pop() are atomic in CPython and don't need a specific lock.
[issue12352] multiprocessing.Value() hangs
STINNER Victor victor.stin...@haypocalc.com added the comment:

> You may document that _pending_free_blocks.append()
> and _pending_free_blocks.pop() are atomic
> in CPython and don't need a specific lock.

Oops, I completely skipped your long comment explaining everything! It is enough.
[issue12352] multiprocessing.Value() hangs
STINNER Victor victor.stin...@haypocalc.com added the comment:

There are different techniques to work around this issue. My preferred one is heap_gc_deadlock_lockless.diff because it has fewer side effects and has a well-defined behaviour.
[issue12352] multiprocessing.Value() hangs
Charles-François Natali neolo...@free.fr added the comment:

[...]

> I don't like touching such global variable, but you are right.

Well, I don't like it either, but I can't really think of any other solution.
Antoine, any thought on that?
[issue12352] multiprocessing.Value() hangs
Charles-François Natali neolo...@free.fr added the comment:

> You are probably right. Can't we use a lock-less list? list.append is
> atomic thanks to the GIL, isn't it? But I don't know how to implement
> the lock-less list consumer. It would be nice to have a function to
> remove and return the content of the list, an atomic "content=mylist[:];
> del mylist[:]" function.

Why not just something like:

    while True:
        try:
            block = list.pop()
        except IndexError:
            break
        _free(block)

Lock-less lists are not strictly atomic (only on cPython), but I doubt that gc.disable() is available and works on every Python interpreter anyway...
So the idea would be:
- in free(), perform a trylock
- if the trylock fails, append the block to a list of pending blocks to free
- if the trylock succeeds, free the pending blocks and proceed as usual (do the same thing in malloc())
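The three steps above can be sketched on a toy allocator. Everything in this example is hypothetical (the DeferredFreeHeap class, its integer "blocks" and the demo() helper are not part of multiprocessing.heap); it only illustrates the trylock plus pending-list idea, and it relies on list.append() and list.pop() being atomic under CPython's GIL, as discussed in this thread.

```python
import threading


class DeferredFreeHeap:
    """Toy stand-in for a heap; blocks are plain integers (hypothetical)."""

    def __init__(self):
        self._lock = threading.Lock()   # deliberately non-recursive
        self._allocated = set()
        self._pending = []              # frees deferred by a failed trylock
        self._next_block = 0

    def _drain_pending(self):
        # Must be called with self._lock held.
        while True:
            try:
                block = self._pending.pop()
            except IndexError:
                break
            self._allocated.remove(block)

    def malloc(self):
        with self._lock:
            self._drain_pending()
            block = self._next_block
            self._next_block += 1
            self._allocated.add(block)
            return block

    def free(self, block):
        if not self._lock.acquire(False):
            # The lock is busy, possibly held further up our own call stack:
            # defer the free instead of blocking.
            self._pending.append(block)
            return
        try:
            self._drain_pending()
            self._allocated.remove(block)
        finally:
            self._lock.release()


def demo():
    heap = DeferredFreeHeap()
    a, b = heap.malloc(), heap.malloc()
    with heap._lock:            # simulate free() arriving while the lock is held
        heap.free(a)            # cannot take the lock, so the free is deferred
        deferred = list(heap._pending)
    heap.free(b)                # takes the lock, drains a, then frees b
    return deferred, heap._allocated, heap._pending
```

In demo(), the first free() is only recorded in the pending list; the second one acquires the lock, drains the list and then frees its own block, so nothing stays allocated. The drawback raised elsewhere in the thread still applies: a deferred block stays allocated until the next malloc() or free() call.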
[issue12352] multiprocessing.Value() hangs
Charles-François Natali neolo...@free.fr added the comment:

Here's a patch based on the second approach.

--
Added file: http://bugs.python.org/file22490/heap_gc_deadlock_lockless.diff

diff -r fca745bc70be Lib/multiprocessing/heap.py
--- a/Lib/multiprocessing/heap.py Sat Jun 25 16:31:06 2011 +0200
+++ b/Lib/multiprocessing/heap.py Sun Jun 26 19:57:56 2011 +0200
@@ -101,6 +101,8 @@
         self._stop_to_block = {}
         self._allocated_blocks = set()
         self._arenas = []
+        # list of pending blocks to free - see free() comment below
+        self._pending_free_blocks = []

     @staticmethod
     def _roundup(n, alignment):
@@ -175,15 +177,39 @@

         return start, stop

+    def _free_pending_blocks(self):
+        # Free all the blocks in the pending list - called with the lock held.
+        while True:
+            try:
+                block = self._pending_free_blocks.pop()
+            except IndexError:
+                break
+            self._free(block)
+            self._allocated_blocks.remove(block)
+
     def free(self, block):
         # free a block returned by malloc()
+        # Since free() can be called asynchronously by the GC, it could happen
+        # that it's called while self._lock is held: in that case,
+        # self._lock.acquire() would deadlock (issue #12352). To avoid that, a
+        # trylock is used instead, and if the lock can't be acquired
+        # immediately, the block is added to a list of blocks to be freed
+        # synchronously sometimes later from malloc() or free(), by calling
+        # _free_pending_blocks() (appending and retrieving from a list is not
+        # strictly thread-safe but under cPython it's atomic thanks to the GIL).
         assert os.getpid() == self._lastpid
-        self._lock.acquire()
-        try:
-            self._allocated_blocks.remove(block)
-            self._free(block)
-        finally:
-            self._lock.release()
+        if not self._lock.acquire(0):
+            # can't acquire the lock, add it to the list of pending blocks to
+            # free
+            self._pending_free_blocks.append(block)
+        else:
+            # we hold the lock
+            try:
+                self._free_pending_blocks()
+                self._allocated_blocks.remove(block)
+                self._free(block)
+            finally:
+                self._lock.release()

     def malloc(self, size):
         # return a block of right size (possibly rounded up)
@@ -191,6 +217,7 @@
         if os.getpid() != self._lastpid:
             self.__init__()                     # reinitialize after fork
         self._lock.acquire()
+        self._free_pending_blocks()
         try:
             size = self._roundup(max(size,1), self._alignment)
             (arena, start, stop) = self._malloc(size)
diff -r fca745bc70be Lib/test/test_multiprocessing.py
--- a/Lib/test/test_multiprocessing.py Sat Jun 25 16:31:06 2011 +0200
+++ b/Lib/test/test_multiprocessing.py Sun Jun 26 19:57:56 2011 +0200
@@ -1737,7 +1737,31 @@
             (narena, nstart, nstop) = all[i+1][:3]
             self.assertTrue((arena != narena and nstart == 0) or
                             (stop == nstart))
-
+
+    def test_free_from_gc(self):
+        # Check that freeing of blocks by the garbage collector doesn't deadlock
+        # (issue #12352).
+        # Make sure the GC is enabled, and set lower collection thresholds to
+        # make collections more frequent (and increase the probability of
+        # deadlock).
+        if gc.isenabled():
+            thresholds = gc.get_threshold()
+            self.addCleanup(gc.set_threshold, *thresholds)
+        else:
+            gc.enable()
+            self.addCleanup(gc.disable)
+        gc.set_threshold(10)
+
+        # perform numerous block allocations, with cyclic references to make
+        # sure objects are collected asynchronously by the gc
+        for i in range(5000):
+            a = multiprocessing.heap.BufferWrapper(1)
+            b = multiprocessing.heap.BufferWrapper(1)
+            # circular references
+            a.buddy = b
+            b.buddy = a
+
+
 #
 #
 #
[issue12352] multiprocessing.Value() hangs
Charles-François Natali neolo...@free.fr added the comment:

Patch (with test) attached. It disables the garbage collector inside critical sections.
Of course, if another thread re-enables the gc while the current thread is inside a critical section, things can break (the probability is really low, but who knows).
I can't think of any satisfying solution, since it's tricky, because we don't want it to be thread-safe but reentrant, which is much more difficult.

I've got another patch which does the following:
- in free(), perform a trylock of the mutex
- if the trylock fails, then create a new Finalizer to postpone the freeing of the same block to a later time, when the gc is called
- the only problem is that I have to build a dummy reference cycle to pass it to Finalize if I want free() to be called by the GC later (and not when the object's reference count drops to 0, otherwise we would get an infinite recursion).

Another solution would be to make the Finalizer callback run lockless, maybe just add the block number to a list/set, and perform the freeing of pending blocks synchronously when malloc() is called (or the heap is finalized). There are two drawbacks to that:
- adding an element to a set is not guaranteed to be thread-safe (well, it is under cPython because of the GIL)
- freeing blocks synchronously means that the blocks won't be freed until malloc() is called (which could be never)

Suggestions welcome.
[issue12352] multiprocessing.Value() hangs
Changes by Charles-François Natali neolo...@free.fr:

--
keywords: +patch
Added file: http://bugs.python.org/file22476/heap_gc_deadlock.diff
[issue12352] multiprocessing.Value() hangs
Changes by Charles-François Natali neolo...@free.fr:

Removed file: http://bugs.python.org/file22476/heap_gc_deadlock.diff
[issue12352] multiprocessing.Value() hangs
Changes by Charles-François Natali neolo...@free.fr:

Added file: http://bugs.python.org/file22477/heap_gc_deadlock.diff
[issue12352] multiprocessing.Value() hangs
STINNER Victor victor.stin...@haypocalc.com added the comment:

Or you can combine your two ideas:

> in free(), perform a trylock of the mutex
> if the trylock fails, then create a new Finalizer to postpone the
> freeing of the same block to a later time,...
> ... perform the freeing of pending blocks synchronously
> when malloc() is called

If free() is called indirectly from malloc() (by the garbage collector), free() adds the block to free in a "pending free" list. Pseudo code:

---
def __init__(self):
    ...
    self._pending_free = queue.Queue()

def _exec_pending_free(self):
    while True:
        try:
            block = self._pending_free.get_nowait()
        except queue.Empty:
            break
        self._free(block)

def free(self, block):
    if self._lock.acquire(False):
        self._exec_pending_free()
        self._free(block)
    else:
        # malloc holds the lock
        self._pending_free.append(block)

def malloc():
    with self._lock:
        self._malloc()
        self._exec_pending_free()
---

Problem: if self._pending_free.append() appends whereas malloc() already exited, the free will have to wait until the next call to malloc() or free(). I don't know if this case (free() called while malloc() is running, but malloc() exits before free()) really occurs: this issue is a deadlock because free() is called from malloc(), and so malloc() waits until the free() is done. It might occur if the garbage collector calls free after _exec_pending_free() but before releasing the lock.
[issue12352] multiprocessing.Value() hangs
Charles-François Natali neolo...@free.fr added the comment:

[...]

> def free(self, block):
>     if self._lock.acquire(False):
>         self._exec_pending_free()
>         self._free(block)
>     else:
>         # malloc holds the lock
>         self._pending_free.append(block)

_pending_free uses a lock internally to make it thread-safe, so I think this will have exactly the same problem (the trylock can fail in case of contention or free() from multiple threads, thus we can't be sure that the else clause is executed on behalf of the garbage collector and it won't run while we're adding the block to _pending_free).

Anyway, this seems complicated and error-prone, disabling the gc seems the most straightforward way to handle that.
[issue12352] multiprocessing.Value() hangs
STINNER Victor victor.stin...@haypocalc.com added the comment:

> _pending_free uses a lock internally to make it thread-safe, so I
> think this will have exactly the same problem

You are probably right. Can't we use a lock-less list? list.append is atomic thanks to the GIL, isn't it? But I don't know how to implement the lock-less list consumer. It would be nice to have a function to remove and return the content of the list, an atomic "content=mylist[:]; del mylist[:]" function.

> (the trylock can fail in case of contention or free() from multiple
> threads, thus we can't be sure that the else clause is executed on
> behalf of the garbage collector and it won't run while we're adding
> the block to _pending_free)

If two threads call free at the same time, the second (taking the GIL) will add the block to pending_free.

> Anyway, this seems complicated and error-prone, disabling the gc seems
> the most straightforward way to handle that.

I don't like touching such a global variable, but you are right.
[issue12352] multiprocessing.Value() hangs
Charles-François Natali neolo...@free.fr added the comment:

The obvious solution is to use a recursive lock instead.

Note that it's not really a solution, just a workaround to avoid deadlocks, because this might lead to a corruption if free is called while the heap is in an inconsistent state. I have to think some more about a final solution, but I'd like to check first that this is really what's happening here.
[issue12352] multiprocessing.Value() hangs
greg.ath gathan...@gmail.com added the comment:

Hi,

I also wonder about the performance cost of a recursive lock.

I am still unable to reproduce the bug in a simple script. Looking closely at the gdb stack, there is that frame:
Frame 0x13be190, for file /usr/lib/python2.6/multiprocessing/heap.py, line 173
I understand that python reuses only the beginning of a memory block, so it frees the remainder of the block.
I use both Value(c_int) and Value(c_double), which have different sizes. That may explain that behaviour.

in heap.py, in the malloc function:
167        self._lock.acquire()
168        try:
169            size = self._roundup(max(size,1), self._alignment)
170            (arena, start, stop) = self._malloc(size)
171            new_stop = start + size
172            if new_stop < stop:
173                self._free((arena, new_stop, stop))

Thanks for your help

2011/6/21 Charles-François Natali rep...@bugs.python.org:
>
> Charles-François Natali neolo...@free.fr added the comment:
>
> The obvious solution is to use a recursive lock instead.
>
> Note that it's not really a solution, just a workaround to avoid
> deadlocks, because this might lead to a corruption if free is called
> while the heap is in an inconsistent state. I have to think some more
> about a final solution, but I'd like to check first that this is
> really what's happening here.
[issue12352] multiprocessing.Value() hangs
Charles-François Natali neolo...@free.fr added the comment:

> Looking closely at the gdb stack, there is that frame:

Yeah, but it calls _free, which runs unlocked. That's not the problem.

> I am still unable to reproduce the bug in a simple script.

Try with this one:

    import multiprocessing.heap

    tab = []

    for i in range(10):
        print(i)
        b = multiprocessing.heap.BufferWrapper(10)
        # create a circular reference (we want GC and not refcount collection when
        # the block goes out of scope)
        b.tab = tab
        tab.append(b)
        # drop buffers refcount to 0 to make them eligible to GC
        if i % 100 == 0:
            del tab
            tab = []

It deadlocks pretty quickly (well, on my box).
And, as expected, disabling/enabling the GC inside malloc solves the problem.
I have to think a little bit more for a clean solution.

--
nosy: +pitrou
[issue12352] multiprocessing.Value() hangs
STINNER Victor victor.stin...@haypocalc.com added the comment:

> I also wonder about the performance cost of a recursive lock.

An alternative is to disable the garbage collector in malloc():

    def malloc(self, size):
        ...
        enabled = gc.isenabled()
        if enabled:
            # disable the garbage collector to avoid a deadlock if block
            # is freed (if self.free() is called)
            gc.disable()
        try:
            with self._lock:
                size = self._roundup(max(size,1), self._alignment)
                ...
                return block
        finally:
            if enabled:
                gc.enable()

gc.disable() and gc.enable() just set an internal flag and so should be fast.

--
nosy: +haypo
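The save/disable/restore pattern suggested above can also be written as a small context manager. This is only a sketch: gc_paused is a hypothetical helper name, not something that exists in the gc module or in multiprocessing, and, as noted elsewhere in this thread, another thread can still re-enable the collector while the block is running.

```python
import contextlib
import gc


@contextlib.contextmanager
def gc_paused():
    """Disable the cyclic GC for the duration of a block (hypothetical helper)."""
    was_enabled = gc.isenabled()
    if was_enabled:
        gc.disable()
    try:
        yield
    finally:
        # Only re-enable the GC if it was enabled on entry, so code that runs
        # with the collector switched off is left untouched.
        if was_enabled:
            gc.enable()
```

A critical section would then be wrapped as `with gc_paused(): ...` around the code that holds the lock.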
[issue12352] multiprocessing.Value() hangs
Charles-François Natali neolo...@free.fr added the comment:

Thanks for reporting this.
There's indeed a bug which can lead to this deadlock.
Relevant code in Lib/multiprocessing/heap.py:
- the BufferWrapper class uses a single Heap() shared among instances, protected by a mutex (threading.Lock), from which blocks are allocated
- when a BufferWrapper is allocated, a multiprocessing.Finalizer is installed to free the corresponding block allocated from the Heap
- if another BufferWrapper is garbage collected while the mutex protecting the Heap is held (in your case, while a new BufferWrapper is allocated), the corresponding finalizer will try to free the block from the heap
- free tries to lock the mutex
- deadlock

The obvious solution is to use a recursive lock instead.
Could you try your application after changing:

    class Heap(object):

        _alignment = 8

        def __init__(self, size=mmap.PAGESIZE):
            self._lastpid = os.getpid()
            self._lock = threading.Lock()

to

    class Heap(object):

        _alignment = 8

        def __init__(self, size=mmap.PAGESIZE):
            self._lastpid = os.getpid()
            self._lock = threading.RLock()

One could probably reproduce this by allocating and freeing many multiprocessing.Values, preferably with a lower GC threshold.

--
nosy: +neologix
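The difference between the two lock types in the scenario described above can be checked in isolation. The helper below, can_reacquire, is a hypothetical name used only for this illustration; it uses a non-blocking acquire so that the check itself cannot hang.

```python
import threading


def can_reacquire(lock):
    """Return True if the thread already holding `lock` can acquire it again."""
    lock.acquire()
    try:
        # Non-blocking attempt: with a plain Lock this fails, where a
        # blocking acquire() would wait forever on the same thread.
        got_it = lock.acquire(False)
        if got_it:
            lock.release()
        return got_it
    finally:
        lock.release()


print(can_reacquire(threading.Lock()))    # False
print(can_reacquire(threading.RLock()))   # True
```

A plain Lock refuses the second acquisition from the same thread, which is the situation free() ends up in when the garbage collector runs it while malloc() holds the heap lock. An RLock allows it, which avoids the hang but lets free() run while the heap may be in an inconsistent state.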
[issue12352] multiprocessing.Value() hangs
New submission from greg.ath gathan...@gmail.com:

Hi,

My multithreaded application uses multiprocessing.Value() to ensure thread-safe operations on shared data.
For unexpected reasons, after some change in my code, the function will consistently hang.

I did a gdb backtrace of the hanging process, and I discovered that multiprocessing/heap.py tries to acquire the same non-recursive lock twice.
The first acquire is done in the malloc function:
#61 call_function (f=
    Frame 0x13be190, for file /usr/lib/python2.6/multiprocessing/heap.py, line 173, in malloc (self=Heap(_stop_to_block={}, _lengths=[], _lock=thread.lock at remote 0x7f00fc770eb8, _allocated_blocks=set([...

The second acquire is done in the free function:
#3 0x004a7c5e in call_function (f=
    Frame 0x1662d50, for file /usr/lib/python2.6/multiprocessing/heap.py, line 155, in free (self=Heap(_stop_to_block={}, _lengths=[], _lock=thread.lock at remote 0x7f00fc770eb8, _allocated_blocks=set([...

I don't understand the link between these two method calls, so I am unable to write an easy script to reproduce the problem. I would say that some garbage collection was done within the malloc, which called the free.

Python 2.6.5
Linux dev 2.6.32-25-server #45-Ubuntu SMP Sat Oct 16 20:06:58 UTC 2010 x86_64 GNU/Linux

--
components: None
files: gdb_stack.txt
messages: 138506
nosy: greg.ath
priority: normal
severity: normal
status: open
title: multiprocessing.Value() hangs
type: behavior
versions: Python 2.6
Added file: http://bugs.python.org/file22393/gdb_stack.txt