Re: threading and multiprocessing deadlock

2021-12-06 Thread Dieter Maurer
Johannes Bauer wrote at 2021-12-6 00:50 +0100:
>I'm a bit confused. In my scenario I am mixing threading with
>multiprocessing. Threading by itself would be nice, but for GIL reasons
>I need both, unfortunately. I've encountered a weird situation in which
>multiprocessing Process()es which are started in a new thread don't
>actually start and so they deadlock on join.

The `multiprocessing` doc
(https://docs.python.org/3/library/multiprocessing.html#module-multiprocessing)
has the following warning:
"Note that safely forking a multithreaded process is problematic."

Thus, if you use the `fork` method to start processes, some
surprises are to be expected.
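
One way to avoid those surprises is to start the workers with the
`spawn` method instead of `fork`. A minimal sketch (assuming the caller
can tolerate the slower spawn start-up; the names are illustrative):

import multiprocessing
import threading

def run(result_queue):
    result_queue.put(42)

def background_thread():
    # A "spawn" context starts a fresh interpreter instead of fork()ing
    # the multithreaded parent, sidestepping the fork-plus-threads hazard.
    ctx = multiprocessing.get_context("spawn")
    queue = ctx.Queue()
    proc = ctx.Process(target=run, args=(queue,))
    proc.start()
    result = queue.get()   # drain before join, so the child can flush
    proc.join()
    print("result:", result)

if __name__ == "__main__":
    threading.Thread(target=background_thread).start()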
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: threading and multiprocessing deadlock

2021-12-06 Thread Barry Scott



> On 5 Dec 2021, at 23:50, Johannes Bauer  wrote:
> 
> Hi there,
> 
> I'm a bit confused. In my scenario I am mixing threading with
> multiprocessing. Threading by itself would be nice, but for GIL reasons
> I need both, unfortunately. I've encountered a weird situation in which
> multiprocessing Process()es which are started in a new thread don't
> actually start and so they deadlock on join.
> 
> I've created a minimal example that demonstrates the issue. I'm running
> on x86_64 Linux using Python 3.9.5 (default, May 11 2021, 08:20:37)
> ([GCC 10.3.0] on linux).
> 
> Here's the code:

I suggest that you include the threading.current_thread() in your messages.
Then you can see which thread is saying what.
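
For example, a small helper along these lines (a sketch; the name is
illustrative) tags every line with the thread that printed it:

import threading

def log(msg):
    # Prefix each message with the current thread's name so interleaved
    # output from several threads can be told apart.
    print(f"[{threading.current_thread().name}] {msg}")

log("join?")   # e.g. "[Thread-1] join?"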

Barry

> 
> 
> import time
> import multiprocessing
> import threading
> 
> def myfnc():
>     print("myfnc")
> 
> def run(result_queue, callback):
>     result = callback()
>     result_queue.put(result)
> 
> def start(fnc):
>     def background_thread():
>         queue = multiprocessing.Queue()
>         proc = multiprocessing.Process(target = run, args = (queue, fnc))
>         proc.start()
>         print("join?")
>         proc.join()
>         print("joined.")
>         result = queue.get()
>     threading.Thread(target = background_thread).start()
> 
> start(myfnc)
> start(myfnc)
> start(myfnc)
> start(myfnc)
> while True:
>     time.sleep(1)
> 
> 
> What you'll see is that "join?" and "joined." nondeterministically do
> *not* appear in pairs. For example:
> 
> join?
> join?
> myfnc
> myfnc
> join?
> join?
> joined.
> joined.
> 
> What's worse is that when this happens and I Ctrl-C out of Python, the
> started Thread is still running in the background:
> 
> $ ps ax | grep minimal
> 370167 pts/0S  0:00 python3 minimal.py
> 370175 pts/2S+ 0:00 grep minimal
> 
> Can someone figure out what is going on there?
> 
> Best,
> Johannes
> -- 
> https://mail.python.org/mailman/listinfo/python-list
> 

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: threading and multiprocessing deadlock

2021-12-06 Thread Johannes Bauer
On 06.12.21 at 13:56, Martin Di Paola wrote:
> Hi!, in short your code should work.
> 
> I think that the join-joined problem is just an interpretation problem.
> 
> In pseudo code the background_thread function does:
> 
> def background_thread()
>   # bla
>   print("join?")
>   # bla
>   print("joined")
> 
> When running this function in parallel using threads, you will probably
> get a few "join?" first before receiving any "joined?". That is because
> the functions are running in parallel.
> 
> The order "join?" then "joined" is preserved within a thread but not
> preserved globally.

Yes, completely understood and really not the issue. That these pairs
are not in sequence is fine.

> Now, I see another issue in the output (and perhaps you were asking about
> this one):
> 
> join?
> join?
> myfnc
> myfnc
> join?
> join?
> joined.
> joined.
> 
> So you have 4 "join?" that correspond to the 4 background_thread
> function calls in threads but only 2 "myfnc" and 2 "joined".

Exactly that is the issue. Then it hangs. Deadlocked.

> Could it be possible that the output was truncated by accident?

No. This is it. The exact output varies, but when it hangs, it always
also does not execute the function (note the lack of "myfnc"). For example:

join?
join?
myfnc
join?
myfnc
join?
myfnc
joined.
joined.
joined.

(only three threads get started there)

join?
myfnc
join?
join?
join?
joined.

(this time only a single one made it)

join?
join?
join?
myfnc
join?
myfnc
joined.
myfnc
joined.
joined.

(three get started)

> I ran the same program and I got a reasonable output (4 "join?", "myfnc"
> and "joined"):
> 
> join?
> join?
> myfnc
> join?
> myfnc
> join?
> joined.
> myfnc
> joined.
> joined.
> myfnc
> joined.

This happens to me occasionally, but most of the time one of the
processes deadlocks. Did you consistently get four of each? What
OS/Python version were you using?

> Another issue that I see is that you are not joining the threads that
> you spawned (background_thread functions).

True, I kind of assumed those would be detached threads.
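
For what it's worth, a thread is only "detached" in that sense if it is
marked as a daemon; otherwise the interpreter waits for it at shutdown.
A sketch, assuming the background work can simply be abandoned on exit:

import threading

def background_thread():
    pass  # stand-in for the worker shown above

# Daemon threads do not block interpreter shutdown, so Ctrl-C exits even
# if they are still running; they are not joined at exit either.
t = threading.Thread(target = background_thread, daemon = True)
t.start()

Note that a daemon thread does not clean up any multiprocessing children
it started; those still need to be joined or terminated separately.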

> I hope that this can guide you to fix or at least narrow the issue.

Depending on what OS/Python version you're using, that points in that
direction and kind of reinforces my belief that the code is correct.

Very curious.

Thanks & all the best,
Joe
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: threading and multiprocessing deadlock

2021-12-06 Thread Martin Di Paola

Hi! In short, your code should work.

I think that the join-joined problem is just an interpretation problem.

In pseudo code the background_thread function does:

def background_thread():
    # bla
    print("join?")
    # bla
    print("joined")

When running this function in parallel using threads, you will probably
get a few "join?" first before receiving any "joined?". That is because
the functions are running in parallel.

The order "join?" then "joined" is preserved within a thread but not
preserved globally.

Now, I see another issue in the output (and perhaps you were asking about
this one):


join?
join?
myfnc
myfnc
join?
join?
joined.
joined.

So you have 4 "join?" that correspond to the 4 background_thread 
function calls in threads but only 2 "myfnc" and 2 "joined".


Could it be possible that the output was truncated by accident?

I ran the same program and I got a reasonable output (4 "join?", "myfnc" 
and "joined"):


join?
join?
myfnc
join?
myfnc
join?
joined.
myfnc
joined.
joined.
myfnc
joined.

Another issue that I see is that you are not joining the threads that 
you spawned (background_thread functions).
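
A minimal sketch of keeping the Thread handles around so they can be
waited on (illustrative; adapt it to the start() wrapper in the posted
code):

import threading
import time

def background_thread():
    time.sleep(0.1)  # stand-in for the real per-thread work

threads = [threading.Thread(target=background_thread) for _ in range(4)]
for t in threads:
    t.start()
# Joining the threads at the end replaces the `while True: sleep(1)` loop
# and guarantees none of them is silently abandoned.
for t in threads:
    t.join()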


I hope that this can guide you to fix or at least narrow the issue.

Thanks,
Martin.


On Mon, Dec 06, 2021 at 12:50:11AM +0100, Johannes Bauer wrote:

Hi there,

I'm a bit confused. In my scenario I am mixing threading with
multiprocessing. Threading by itself would be nice, but for GIL reasons
I need both, unfortunately. I've encountered a weird situation in which
multiprocessing Process()es which are started in a new thread don't
actually start and so they deadlock on join.

I've created a minimal example that demonstrates the issue. I'm running
on x86_64 Linux using Python 3.9.5 (default, May 11 2021, 08:20:37)
([GCC 10.3.0] on linux).

Here's the code:


import time
import multiprocessing
import threading

def myfnc():
    print("myfnc")

def run(result_queue, callback):
    result = callback()
    result_queue.put(result)

def start(fnc):
    def background_thread():
        queue = multiprocessing.Queue()
        proc = multiprocessing.Process(target = run, args = (queue, fnc))
        proc.start()
        print("join?")
        proc.join()
        print("joined.")
        result = queue.get()
    threading.Thread(target = background_thread).start()

start(myfnc)
start(myfnc)
start(myfnc)
start(myfnc)
while True:
    time.sleep(1)


What you'll see is that "join?" and "joined." nondeterministically do
*not* appear in pairs. For example:

join?
join?
myfnc
myfnc
join?
join?
joined.
joined.

What's worse is that when this happens and I Ctrl-C out of Python, the
started Thread is still running in the background:

$ ps ax | grep minimal
370167 pts/0S  0:00 python3 minimal.py
370175 pts/2S+ 0:00 grep minimal

Can someone figure out what is going on there?

Best,
Johannes
--
https://mail.python.org/mailman/listinfo/python-list

--
https://mail.python.org/mailman/listinfo/python-list


threading and multiprocessing deadlock

2021-12-05 Thread Johannes Bauer
Hi there,

I'm a bit confused. In my scenario I am mixing threading with
multiprocessing. Threading by itself would be nice, but for GIL reasons
I need both, unfortunately. I've encountered a weird situation in which
multiprocessing Process()es which are started in a new thread don't
actually start and so they deadlock on join.

I've created a minimal example that demonstrates the issue. I'm running
on x86_64 Linux using Python 3.9.5 (default, May 11 2021, 08:20:37)
([GCC 10.3.0] on linux).

Here's the code:


import time
import multiprocessing
import threading

def myfnc():
    print("myfnc")

def run(result_queue, callback):
    result = callback()
    result_queue.put(result)

def start(fnc):
    def background_thread():
        queue = multiprocessing.Queue()
        proc = multiprocessing.Process(target = run, args = (queue, fnc))
        proc.start()
        print("join?")
        proc.join()
        print("joined.")
        result = queue.get()
    threading.Thread(target = background_thread).start()

start(myfnc)
start(myfnc)
start(myfnc)
start(myfnc)
while True:
    time.sleep(1)


What you'll see is that "join?" and "joined." nondeterministically do
*not* appear in pairs. For example:

join?
join?
myfnc
myfnc
join?
join?
joined.
joined.

What's worse is that when this happens and I Ctrl-C out of Python, the
started Thread is still running in the background:

$ ps ax | grep minimal
 370167 pts/0S  0:00 python3 minimal.py
 370175 pts/2S+ 0:00 grep minimal

Can someone figure out what is going on there?

Best,
Johannes
-- 
https://mail.python.org/mailman/listinfo/python-list


[issue34140] Possible multiprocessing deadlock when placing too many objects in Queue()

2018-07-17 Thread Antoine Pitrou


Antoine Pitrou  added the comment:

Closing as not a bug.

--
resolution:  -> not a bug
stage:  -> resolved
status: open -> closed
type: performance -> behavior




[issue34140] Possible multiprocessing deadlock when placing too many objects in Queue()

2018-07-17 Thread Antoine Pitrou


Antoine Pitrou  added the comment:

The problem is you're joining the child processes before draining the queue in 
the parent.

Generally, instead of building your own kind of synchronization like this, I 
would recommend you use the higher-level abstractions provided by 
multiprocessing.Pool or concurrent.futures.ProcessPoolExecutor.
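
A sketch of the same fan-out using the higher-level API (illustrative
only; not taken from the attached reproducer):

from concurrent.futures import ProcessPoolExecutor

def work(i):
    return i * i

if __name__ == "__main__":
    # The executor owns the worker processes and the result plumbing,
    # so there is no hand-rolled Queue to drain or join around.
    with ProcessPoolExecutor() as pool:
        results = list(pool.map(work, range(30)))
    print(results)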

By the way, this issue is mentioned precisely in the documentation:

"""
As mentioned above, if a child process has put items on a queue (and it has not 
used JoinableQueue.cancel_join_thread), then that process will not terminate 
until all buffered items have been flushed to the pipe.

This means that if you try joining that process you may get a deadlock unless 
you are sure that all items which have been put on the queue have been 
consumed. Similarly, if the child process is non-daemonic then the parent 
process may hang on exit when it tries to join all its non-daemonic children.
"""

(from https://docs.python.org/3/library/multiprocessing.html#pipes-and-queues)
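
In code, the safe ordering the quoted passage describes looks roughly
like this (a sketch, not the submitter's script):

import multiprocessing

def worker(q, i):
    q.put(i * i)

if __name__ == "__main__":
    q = multiprocessing.Queue()
    procs = [multiprocessing.Process(target=worker, args=(q, i))
             for i in range(4)]
    for p in procs:
        p.start()
    # Drain first: a child does not exit until its buffered items have
    # been flushed to the pipe, so join() before get() can deadlock.
    results = [q.get() for _ in procs]
    for p in procs:
        p.join()
    print(sorted(results))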

--




[issue34140] Possible multiprocessing deadlock when placing too many objects in Queue()

2018-07-17 Thread Xiang Zhang


Change by Xiang Zhang :


--
nosy: +davin, pitrou, xiang.zhang




[issue34140] Possible multiprocessing deadlock when placing too many objects in Queue()

2018-07-17 Thread Horace Stoica


New submission from Horace Stoica :

I am trying to use the  multiprocessing module for a simulation on a spherical 
lattice, but the process hangs when the lattice is too large. 

In the file IssuesWithQueueMultiProcessing.py, in the method createLattice(),
use either "return(I4)" for the small lattice or "return(I5)" for the large
lattice.

Running the script with the large lattice causes the process to hang, while
with the small lattice it works fine. I have tested with Python 3.5.2 and
3.6.1 and the behavior is the same in both versions.

--
files: IssuesWithQueueMultiProcessing.tar.bz
messages: 321832
nosy: fhstoica
priority: normal
severity: normal
status: open
title: Possible multiprocessing deadlock when placing too many objects in 
Queue()
type: performance
versions: Python 3.5, Python 3.6
Added file: 
https://bugs.python.org/file47702/IssuesWithQueueMultiProcessing.tar.bz

Python tracker <https://bugs.python.org/issue34140>



[issue34059] multiprocessing deadlock

2018-07-06 Thread Guillaume Perrault-Archambault


Guillaume Perrault-Archambault  added the comment:

A friend of mine has suggested a fix that seems to work for now (upgrade numpy 
from 1.14.3 to 1.14.5). This makes no sense at all but it does seem to work for 
now. I have a strong suspicion that this is just masking the problem and that 
it will reappear.

However, since it works I would not want you to waste any time on this. I will 
reopen if the deadlock reappears!

I do apologize if you already spent a lot of time on this.

Regards,
Guillaume

--
resolution:  -> third party
stage:  -> resolved
status: open -> closed




[issue34059] multiprocessing deadlock

2018-07-06 Thread Guillaume Perrault-Archambault


Guillaume Perrault-Archambault  added the comment:

Hi Victor and Yang,

Thanks for your fast replies.

I did initially think it could be a torch issue. Indeed, I have an
equivalent numpy testcase that does not deadlock. However, the fact that it
gets stuck inside a multiprocessing wait statement makes me think it's
still a multiprocessing issue.

I've spent two weeks full time on this issue. Over at torch forums I've had
no replies (
https://discuss.pytorch.org/t/multiprocessing-code-works-using-numpy-but-deadlocked-using-pytorch/20473
).

On stackexchange I only got a workaround suggestion that works sporadically
(
https://stackoverflow.com/questions/51093970/multiprocessing-code-works-using-numpy-but-deadlocked-using-pytorch).
Basically I can get rid of the deadlock (sometimes) if I impose only one
thread per process. But this is not a solution anyway.

I have tried stepping through the code, but because it is multiprocessed,
you cannot step through it (at least not in the conventional way, since the
main thread is not doing the heavy lifting).

I've tried adding print statements in the multiprocess library and mucking
around with it a bit, but debugging multi-processed code in this way is an
absolute nightmare because you can't even trust the order in which print
statements display on the screen. And probably more relevant, I'm out of my
league here.

I'm really at a complete dead end. I'm blocked and my work cannot progress
without fixing this issue. I'd be very grateful if you could try to
reproduce and rule out the multiprocessing library. If you need help
reproducing I can send a different testcase that deadlocked on my friend's
Mac (for him, the original testcase did not deadlock).

The testcase I attached in my original post sometimes deadlocks and
sometimes doesn't, depending on the machine I run on. So I'm not surprised
you got no deadlock when you tried to reproduce.

I can always get it deadlocking on Linux/Mac though, by tweaking the code.

To give you a sense of how unreliably it deadlocks, just removing the for
loop in the code (which is outside the multiprocessing portion of the
code!) somehow gets rid of the deadlock. Also, it never deadlocks on
Windows.

If you could provide any help on this issue I'd be very grateful.

Regards,
Guillaume.

On Fri, Jul 6, 2018 at 11:21 AM STINNER Victor 
wrote:

>
> STINNER Victor  added the comment:
>
> IMHO it's an issue with your usage of the torch module which is not part
> of the Python stdlib, so I suggest to close this issue as "third party" or
> "not a bug".
>
> --
> nosy: +vstinner
>
>

--




[issue34059] multiprocessing deadlock

2018-07-06 Thread STINNER Victor


STINNER Victor  added the comment:

IMHO it's an issue with your usage of the torch module which is not part of the 
Python stdlib, so I suggest to close this issue as "third party" or "not a bug".

--
nosy: +vstinner




[issue34059] multiprocessing deadlock

2018-07-06 Thread Windson Yang


Windson Yang  added the comment:

I can't reproduce the deadlock; maybe it's related to the torch package? Can you
try without torch to see if this happens again?

--
nosy: +Windson Yang




[issue34059] multiprocessing deadlock

2018-07-05 Thread Guillaume Perrault-Archambault


New submission from Guillaume Perrault-Archambault :

The simple code attached causes a deadlock in Linux.

The problem is I have to slightly muck around with it depending on the distro
and Python version to get it to deadlock.

On the cluster I use the most (python 3.6.3, CentOS Linux release 7.4.1708, 
pytorch 0.4.0 with no CUDA), the code attached causes a deadlock.

--
components: Library (Lib)
files: multiprocess_torch.py
messages: 321146
nosy: gobbedy
priority: normal
severity: normal
status: open
title: multiprocessing deadlock
type: crash
versions: Python 3.6
Added file: https://bugs.python.org/file47673/multiprocess_torch.py

Python tracker <https://bugs.python.org/issue34059>



[issue7200] multiprocessing deadlock on Mac OS X when queue collected before process terminates

2015-02-13 Thread Davin Potts

Davin Potts added the comment:

This issue was marked as not a bug by OP a while back but for whatever reason 
it did not also get marked as closed.  Going ahead with closing it now.

--
nosy: +davin
stage: needs patch -> resolved
status: open -> closed

Python tracker http://bugs.python.org/issue7200



[issue7200] multiprocessing deadlock on Mac OS X when queue collected before process terminates

2013-10-18 Thread Brian Quinlan

Brian Quinlan added the comment:

OK, working as intended.

--
resolution:  -> invalid

Python tracker http://bugs.python.org/issue7200



[issue7200] multiprocessing deadlock on Mac OS X when queue collected before process terminates

2010-10-24 Thread Ask Solem

Ask Solem a...@opera.com added the comment:

Queue uses multiprocessing.util.Finalize, which uses weakrefs to track when the 
object is out of scope, so this is actually expected behavior.

IMHO it is not a very good approach, but changing the API to use explicit close 
methods is a little late at this point, I guess.

--
nosy: +asksol

Python tracker http://bugs.python.org/issue7200



[issue7200] multiprocessing deadlock on Mac OS X when queue collected before process terminates

2010-07-11 Thread Mark Lawrence

Changes by Mark Lawrence breamore...@yahoo.co.uk:


--
assignee:  -> jnoller
nosy: +jnoller
stage:  -> needs patch
versions: +Python 2.7, Python 3.1

Python tracker http://bugs.python.org/issue7200



Re: multiprocessing deadlock

2009-10-24 Thread Ishwor Gurung
Hi Brian,
I think there could be a slight problem (if I've understood your code).

 import multiprocessing
 import queue

 def _process_worker(q):
    while True:
do you really want to run it indefinitely here?

        try:
            something = q.get(block=True, timeout=0.1)
        except queue.Empty:
            return
So, if your queue is empty, why do you want to just return?

        else:
            print('Grabbed item from queue:', something)


 def _make_some_processes(q):
    processes = []
    for _ in range(10):
This is going to loop q*10 many times. Do you really want that?

        p = multiprocessing.Process(target=_process_worker, args=(q,))
OK.
        p.start()
        processes.append(p)
Here. Do you want to add it to processes list? why?

    return processes
OK.

 def _do(i):
    print('Run:', i)
    q = multiprocessing.Queue()
    for j in range(30):
        q.put(i*30+j)
30 items in the queue for each i (i*30). Cool.

    processes = _make_some_processes(q)

    while not q.empty():
        pass
why are you checking q.empty( ) here again?

 #    The deadlock only occurs on Mac OS X and only when these lines
 #    are commented out:
 #    for p in processes:
 #        p.join()
Why are you joining down here? Why not in the loop itself? I tested it
on Linux and Win32. Works fine for me. I don't know about OSX.

 for i in range(100):
    _do(i)
_do(i) is run 100 times. Why? Is that what you want?

 Output (on Mac OS X using the svn version of py3k):
 % ~/bin/python3.2 moprocessmoproblems.py
 Run: 0
 Grabbed item from queue: 0
 Grabbed item from queue: 1
 Grabbed item from queue: 2
 ...
 Grabbed item from queue: 29
 Run: 1

And this is strange. Now, I happened to be hacking away at some
multiprocessing code myself. I saw your thread so I whipped up
something that you can have a look at.

 At this point the script produces no additional output. If I uncomment the
 lines above then the script produces the expected output. I don't see any
 docs that would explain this problem and I don't know what the rule would be
 e.g. you just join every process that uses a queue before  the queue is
 garbage collected.

You join a process when you want to wait for it to return, with or without
the optional `timeout' parameter. Let me be more specific.
The doc for Process says:

join([timeout])
Block the calling thread until the process whose join() method is
called terminates or until the optional timeout occurs.

Right. Now, the join( ) here is going to be the called process'
join( ). If this particular join( ) waits indefinitely, then your
parent process' join( ) will _also_ wait indefinitely waiting for the
child processes' join( ) to finish up __unless__ you define a timeout
value.

 Any ideas why this is happening?

Have a look below. If I've understood your code, then it will be
reflective of your situation but with a different take on the
implementation side of things (run on Python 2.6):

from multiprocessing import Process, Queue;
from Queue import Empty;
import sys;

def _process_worker(q):
    try:
        something = q.get(block=False, timeout=None);
    except Empty:
        print sys.exc_info();
    else:
        print 'Removed %d from the queue' % something;

def _make_some_processes():
    q = Queue();

    for i in range(3):
        for j in range(3):
            q.put(i*30+j);

    while not q.empty():
        p = Process(target=_process_worker, args=(q,));
        p.start();
        p.join();

if __name__ == '__main__':
    _make_some_processes();

'''
Removed 0 from the queue
Removed 1 from the queue
Removed 2 from the queue
Removed 30 from the queue
Removed 31 from the queue
Removed 32 from the queue
Removed 60 from the queue
Removed 61 from the queue
Removed 62 from the queue
'''
-- 
Regards,
Ishwor Gurung
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: multiprocessing deadlock

2009-10-24 Thread Gabriel Genellina
On Sat, 24 Oct 2009 02:48:38 -0300, Brian Quinlan br...@sweetapp.com
wrote:

On 24 Oct 2009, at 14:10, Gabriel Genellina wrote:
On Thu, 22 Oct 2009 23:18:32 -0300, Brian Quinlan br...@sweetapp.com
wrote:


I don't like a few things in the code:


I'm actually not looking for workarounds. I want to know if this is a  
multiprocessing bug or if I am misunderstanding the multiprocessing docs  
somehow and my demonstrated usage pattern is somehow incorrect.


Those aren't really workarounds, but things to consider when trying to
narrow down what's causing the problem. The example is rather long as it
is, and it's hard to tell what's wrong since there are many places that
might fail. The busy wait might be relevant, or not; having a thousand
zombie processes might be relevant, or not.


I don't have an OSX system to test, but on Windows your code worked fine;  
although removing the busy wait and joining the processes made for a  
better work load (with the original code, usually only one of the  
subprocesses in each run grabbed all items from the queue)


--
Gabriel Genellina

--
http://mail.python.org/mailman/listinfo/python-list


Re: multiprocessing deadlock

2009-10-24 Thread Brian Quinlan

On 24 Oct 2009, at 19:49, Gabriel Genellina wrote:

On Sat, 24 Oct 2009 02:48:38 -0300, Brian Quinlan
br...@sweetapp.com wrote:

On 24 Oct 2009, at 14:10, Gabriel Genellina wrote:
On Thu, 22 Oct 2009 23:18:32 -0300, Brian Quinlan br...@sweetapp.com
wrote:


I don't like a few things in the code:


I'm actually not looking for workarounds. I want to know if this is  
a multiprocessing bug or if I am misunderstanding the  
multiprocessing docs somehow and my demonstrated usage pattern is  
somehow incorrect.


Those aren't really workarounds, but things to consider when trying  
to narrow down what's causing the problem. The example is rather  
long as it is, and it's hard to tell what's wrong since there are  
many places thay might fail.


I agree that the multiprocessing implementation is complex and there
are a lot of spinning wheels. At this point, since no one has pointed
out how I am misusing the module, I think that I'll just file a bug.


The busy wait might be relevant, or not; having a thousand zombie  
processes might be relevant, or not.


According to the docs:

On Unix when a process finishes but has not been joined it becomes  
a zombie. There should never be very many because each time a new  
process starts (or active_children() is called) all completed  
processes which have not yet been joined will be joined. Also calling  
a finished process’s Process.is_alive() will join the process.  Even  
so it is probably good practice to explicitly join all the processes  
that you start.
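
A sketch of what that reaping looks like when triggered explicitly
(illustrative; not from the discussed test case):

import multiprocessing

# Per the quoted passage, calling active_children() joins any children
# that have finished but were never joined; the returned list contains
# only the processes that are still alive.
still_running = multiprocessing.active_children()
print(len(still_running), "children still alive")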


Cheers,
Brian
--
http://mail.python.org/mailman/listinfo/python-list


Re: multiprocessing deadlock

2009-10-24 Thread larudwer

Brian Quinlan br...@sweetapp.com wrote in message
news:mailman.1895.1256264717.2807.python-l...@python.org...

 Any ideas why this is happening?

 Cheers,
 Brian

IMHO your code is buggy. You are running into a typical race condition.

consider following part in your code:

 def _make_some_processes(q):
     processes = []
     for _ in range(10):
         p = multiprocessing.Process(target=_process_worker, args=(q,))
         p.start()
         processes.append(p)
     return processes


p.start() may start a process right now, in 5 seconds or a week later,
depending on how the scheduler of your OS works.

Since all your processes are working on the same queue it is -- very --
likely that the first process got started, processed all the input and
finished, while all the others haven't even started. Thus your first
process exits, and your main process also exits, because the queue is empty
now ;).

 while not q.empty():
     pass

If you were using p.join() your main process would terminate when the last
process terminates!
That's a different exit condition!

When the main process terminates all the garbage collection fun happens. I
hope you don't wonder that your Queue and the underlying pipe got closed
and collected!

Well, now that all the work has been done, your OS may remember that someone
sometime in the past told it to start a process.

def _process_worker(q):
    while True:
        try:
            something = q.get(block=True, timeout=0.1)
        except queue.Empty:
            return
        else:
            print('Grabbed item from queue:', something)

The line

something = q.get(block=True, timeout=0.1)

should cause some kind of runtime error because q is already collected at 
that time.
Depending on your luck and the OS this bug may be handled or not. Obviously 
you are not lucky on OSX ;)

That's what i think happens.




 


-- 
http://mail.python.org/mailman/listinfo/python-list


Re: multiprocessing deadlock

2009-10-24 Thread Brian Quinlan


On 24 Oct 2009, at 21:37, larudwer wrote:



Brian Quinlan br...@sweetapp.com wrote in message
news:mailman.1895.1256264717.2807.python-l...@python.org...


Any ideas why this is happening?

Cheers,
Brian


IMHO your code is buggy. You run in an typical race condition.

consider following part in your code:


def _make_some_processes(q):
   processes = []
   for _ in range(10):
   p = multiprocessing.Process(target=_process_worker, args=(q,))
   p.start()
   processes.append(p)
   return processes



p.start() may start an process right now, in 5 seconds or an week  
later,

depending on how the scheduler of your OS works.


Agreed.

Since all your processes are working on the same queue it is -- very  
--

likely that the first process got started, processed all the input and
finished, while all the others haven't even got started.


Agreed.


Though your first
process exits, and your main process also exits, because the queue  
is empty

now ;).



The main process shouldn't (and doesn't) exit - the _do function exits
(with some processes possibly still running) and the next iteration in


for i in range(100):
_do(i)

is evaluated.



   while not q.empty():
   pass


If you where using p.join() your main process wourd terminate when  
the last

process terminates !
That's an different exit condition!



When you say your main process would terminate, you mean that the
_do function would exit, right? Because process.join() has nothing to
do with terminating the calling process - it just blocks until the process
terminates.


When the main process terminates all the garbage collection fun  
happens. I
hope you don't wonder that your Queue and the underlaying pipe got  
closed

and collected!


I expected the queue and its underlying pipe to get collected.

Well now that all the work has been done, your OS may remember that  
someone

sometimes in the past told him to start an process.


Sure, that could happen at this stage. Are you saying that it is the  
user of the multiprocessing module's responsibility to ensure that the  
queue is not collected in the parent process until all the child  
processes using it have exited? Actually, causing the queues to never  
be collected fixes the deadlock:


+ p = []
def _do(i):
    print('Run:', i)
    q = multiprocessing.Queue()
+   p.append(q)
    print('Created queue')
    for j in range(30):
        q.put(i*30+j)
    processes = _make_some_processes(q)
    print('Created processes')

    while not q.empty():
        pass
    print('Q is empty')

This behavior is counter-intuitive and, as far as I can tell, not  
documented anywhere. So it feels like a bug.


Cheers,
Brian


def _process_worker(q):
    while True:
        try:
            something = q.get(block=True, timeout=0.1)
        except queue.Empty:
            return
        else:
            print('Grabbed item from queue:', something)


The line

something = q.get(block=True, timeout=0.1)

should cause some kind of runtime error because q is already  
collected at

that time.
Depending on your luck and the OS this bug may be handled or not.  
Obviously

you are not lucky on OSX ;)

That's what i think happens.







--
http://mail.python.org/mailman/listinfo/python-list


-- 
http://mail.python.org/mailman/listinfo/python-list


[issue7200] multiprocessing deadlock on Mac OS X when queue collected before process terminates

2009-10-24 Thread Brian Quinlan

New submission from Brian Quinlan br...@sweetapp.com:

This code:

import multiprocessing
import queue

def _process_worker(q):
    while True:
        try:
            something = q.get(block=True, timeout=0.1)
        except queue.Empty:
            return
        else:
            pass
            # print('Grabbed item from queue:', something)


def _make_some_processes(q):
    processes = []
    for _ in range(10):
        p = multiprocessing.Process(target=_process_worker, args=(q,))
        p.start()
        processes.append(p)
    return processes

#p = []
def _do(i):
    print('Run:', i)
    q = multiprocessing.Queue()
    #p.append(q)
    print('Created queue')
    for j in range(30):
        q.put(i*30+j)
    processes = _make_some_processes(q)
    print('Created processes')

    while not q.empty():
        pass
    print('Q is empty')

for i in range(100):
    _do(i)

Produces this output on Mac OS X (it produces the expected output on
Linux and Windows):

Run: 0
Created queue
Grabbed item from queue: 0
...
Grabbed item from queue: 29
Created processes
Q is empty
Run: 1
Created queue
Grabbed item from queue: 30
...
Grabbed item from queue: 59
Created processes
Q is empty
Run: 2
Created queue
Created processes
no further output

Changing the code as follows:

+ p = []
def _do(i):
    print('Run:', i)
    q = multiprocessing.Queue()
+   p.append(q)
    print('Created queue')
    for j in range(30):
        q.put(i*30+j)
    processes = _make_some_processes(q)
    print('Created processes')

    while not q.empty():
        pass
    print('Q is empty')

fixes the deadlock. So it looks like if a multiprocessing.Queue is
collected with sub-processes still using it then calling some methods on
other multiprocessing.Queues will deadlock.

--
components: Library (Lib)
messages: 94440
nosy: bquinlan
severity: normal
status: open
title: multiprocessing deadlock on Mac OS X when queue collected before process 
terminates
type: behavior
versions: Python 3.2

Python tracker http://bugs.python.org/issue7200



Re: multiprocessing deadlock

2009-10-23 Thread paulC
On Oct 23, 3:18 am, Brian Quinlan br...@sweetapp.com wrote:
 My test reduction:

 import multiprocessing
 import queue

 def _process_worker(q):
      while True:
          try:
              something = q.get(block=True, timeout=0.1)
          except queue.Empty:
              return
          else:
              print('Grabbed item from queue:', something)

 def _make_some_processes(q):
      processes = []
      for _ in range(10):
          p = multiprocessing.Process(target=_process_worker, args=(q,))
          p.start()
          processes.append(p)
      return processes

 def _do(i):
      print('Run:', i)
      q = multiprocessing.Queue()
      for j in range(30):
          q.put(i*30+j)
      processes = _make_some_processes(q)

      while not q.empty():
          pass

 #    The deadlock only occurs on Mac OS X and only when these lines
 #    are commented out:
 #    for p in processes:
 #        p.join()

 for i in range(100):
      _do(i)

 --

 Output (on Mac OS X using the svn version of py3k):
 % ~/bin/python3.2 moprocessmoproblems.py
 Run: 0
 Grabbed item from queue: 0
 Grabbed item from queue: 1
 Grabbed item from queue: 2
 ...
 Grabbed item from queue: 29
 Run: 1

 At this point the script produces no additional output. If I uncomment
 the lines above then the script produces the expected output. I don't
 see any docs that would explain this problem and I don't know what the
 rule would be e.g. you just join every process that uses a queue before
   the queue is garbage collected.

 Any ideas why this is happening?

 Cheers,
 Brian

I can't promise a definitive answer, but looking at the docs:

isAlive()
Return whether the thread is alive.

Roughly, a thread is alive from the moment the start() method
returns until its run() method terminates. The module function
enumerate() returns a list of all alive threads.

I guess that the word 'roughly' indicates that returning from the start()
call does not mean that all the threads have actually started, and
so calling join is illegal. Try calling isAlive on all the threads
before returning from _make_some_processes.

Regards, Paul C.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: multiprocessing deadlock

2009-10-23 Thread Brian Quinlan

On 24 Oct 2009, at 00:02, paulC wrote:


On Oct 23, 3:18 am, Brian Quinlan br...@sweetapp.com wrote:

My test reduction:

import multiprocessing
import queue

def _process_worker(q):
 while True:
 try:
 something = q.get(block=True, timeout=0.1)
 except queue.Empty:
 return
 else:
 print('Grabbed item from queue:', something)

def _make_some_processes(q):
 processes = []
 for _ in range(10):
 p = multiprocessing.Process(target=_process_worker,  
args=(q,))

 p.start()
 processes.append(p)
 return processes

def _do(i):
 print('Run:', i)
 q = multiprocessing.Queue()
 for j in range(30):
 q.put(i*30+j)
 processes = _make_some_processes(q)

 while not q.empty():
 pass

#The deadlock only occurs on Mac OS X and only when these lines
#are commented out:
#for p in processes:
#p.join()

for i in range(100):
 _do(i)

--

Output (on Mac OS X using the svn version of py3k):
% ~/bin/python3.2 moprocessmoproblems.py
Run: 0
Grabbed item from queue: 0
Grabbed item from queue: 1
Grabbed item from queue: 2
...
Grabbed item from queue: 29
Run: 1

At this point the script produces no additional output. If I  
uncomment

the lines above then the script produces the expected output. I don't
see any docs that would explain this problem and I don't know what  
the
rule would be e.g. you just join every process that uses a queue  
before

  the queue is garbage collected.

Any ideas why this is happening?

Cheers,
Brian


I can't promise a definitive answer but looking at the doc.s:-

isAlive()
   Return whether the thread is alive.

   Roughly, a thread is alive from the moment the start() method
returns until its run() method terminates. The module function
enumerate() returns a list of all alive threads.

I guess that the word 'roughly' indicates that returning from the  
start

() call does not mean that all the threads have actually started, and
so calling join is illegal. Try calling isAlive on all the threads
before returning from _make_some_processes.

Regards, Paul C.
--
http://mail.python.org/mailman/listinfo/python-list


Hey Paul,

I guess I was unclear in my explanation - the deadlock only happens  
when I *don't* call join.


Cheers,
Brian


--
http://mail.python.org/mailman/listinfo/python-list


Re: multiprocessing deadlock

2009-10-23 Thread paulC

 Hey Paul,

 I guess I was unclear in my explanation - the deadlock only happens  
 when I *don't* call join.

 Cheers,
 Brian

Whoops, my bad.

Have you tried replacing the prints with writes to another output Queue?
I'm wondering if sys.stdout has a problem.
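
Something along these lines, perhaps (a sketch; not the original test case):

import multiprocessing

def worker(log_q):
    # Push messages onto a queue the parent drains, instead of print(),
    # to take sys.stdout out of the picture.
    log_q.put("Grabbed item from queue")

if __name__ == '__main__':
    log_q = multiprocessing.Queue()
    p = multiprocessing.Process(target=worker, args=(log_q,))
    p.start()
    print(log_q.get())
    p.join()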

Regards, Paul C.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: multiprocessing deadlock

2009-10-23 Thread Brian Quinlan


On 24 Oct 2009, at 06:01, paulC wrote:



Hey Paul,

I guess I was unclear in my explanation - the deadlock only happens
when I *don't* call join.

Cheers,
Brian


Whoops, my bad.

Have you tried replacing prints with writing a another output Queue?
I'm wondering if sys.stdout has a problem.


Removing the print from the subprocess doesn't prevent the deadlock.

Cheers,
Brian
--
http://mail.python.org/mailman/listinfo/python-list


Re: multiprocessing deadlock

2009-10-23 Thread Gabriel Genellina
On Thu, 22 Oct 2009 23:18:32 -0300, Brian Quinlan br...@sweetapp.com
wrote:


I don't like a few things in the code:


def _do(i):
    print('Run:', i)
    q = multiprocessing.Queue()
    for j in range(30):
        q.put(i*30+j)
    processes = _make_some_processes(q)

    while not q.empty():
        pass


I'd use time.sleep(0.1) or something instead of this busy wait, but see  
below.



#The deadlock only occurs on Mac OS X and only when these lines
#are commented out:
#for p in processes:
#p.join()


I don't know how multiprocessing deals with it, but if you don't join() a  
process it may become a zombie, so it's probably better to always join  
them. In that case I'd just remove the wait for q.empty() completely.



for i in range(100):
 _do(i)


Those lines should be guarded with: if __name__ == '__main__':

I don't know if fixing those things will fix your problem, but at least  
the code will look neater...
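
For reference, a sketch of the reduction with those changes applied
(busy wait removed, children joined, entry point guarded); illustrative
only, not a claim that it avoids the OS X hang:

import multiprocessing
import queue

def _process_worker(q):
    while True:
        try:
            something = q.get(block=True, timeout=0.1)
        except queue.Empty:
            return
        else:
            print('Grabbed item from queue:', something)

def _do(i):
    print('Run:', i)
    q = multiprocessing.Queue()
    for j in range(30):
        q.put(i * 30 + j)
    processes = [multiprocessing.Process(target=_process_worker, args=(q,))
                 for _ in range(10)]
    for p in processes:
        p.start()
    # Join the workers instead of spinning on q.empty(); they drain the
    # queue and return once it stays empty for the 0.1 s timeout.
    for p in processes:
        p.join()

if __name__ == '__main__':
    for i in range(100):
        _do(i)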


--
Gabriel Genellina

--
http://mail.python.org/mailman/listinfo/python-list


Re: multiprocessing deadlock

2009-10-23 Thread Brian Quinlan


On 24 Oct 2009, at 14:10, Gabriel Genellina wrote:

On Thu, 22 Oct 2009 23:18:32 -0300, Brian Quinlan
br...@sweetapp.com wrote:


I don't like a few things in the code:


def _do(i):
    print('Run:', i)
    q = multiprocessing.Queue()
    for j in range(30):
        q.put(i*30+j)
    processes = _make_some_processes(q)

    while not q.empty():
        pass


I'd use time.sleep(0.1) or something instead of this busy wait, but  
see below.


This isn't my actual code, it is a simplification of my code designed  
to minimally demonstrate a possible bug in multiprocessing.





#The deadlock only occurs on Mac OS X and only when these lines
#are commented out:
#for p in processes:
#p.join()


I don't know how multiprocessing deals with it, but if you don't  
join() a process it may become a zombie, so it's probably better to  
always join them. In that case I'd just remove the wait for  
q.empty() completely.


I'm actually not looking for workarounds. I want to know if this is a  
multiprocessing bug or if I am misunderstanding the multiprocessing  
docs somehow and my demonstrated usage pattern is somehow incorrect.


Cheers,
Brian




for i in range(100):
_do(i)


Those lines should be guarded with: if __name__ == '__main__':

I don't know if fixing those things will fix your problem, but at  
least the code will look neater...


--
Gabriel Genellina

--
http://mail.python.org/mailman/listinfo/python-list


--
http://mail.python.org/mailman/listinfo/python-list


multiprocessing deadlock

2009-10-22 Thread Brian Quinlan

My test reduction:

import multiprocessing
import queue

def _process_worker(q):
    while True:
        try:
            something = q.get(block=True, timeout=0.1)
        except queue.Empty:
            return
        else:
            print('Grabbed item from queue:', something)


def _make_some_processes(q):
    processes = []
    for _ in range(10):
        p = multiprocessing.Process(target=_process_worker, args=(q,))
        p.start()
        processes.append(p)
    return processes

def _do(i):
    print('Run:', i)
    q = multiprocessing.Queue()
    for j in range(30):
        q.put(i*30+j)
    processes = _make_some_processes(q)

    while not q.empty():
        pass

    # The deadlock only occurs on Mac OS X and only when these lines
    # are commented out:
    # for p in processes:
    #     p.join()

for i in range(100):
    _do(i)

--

Output (on Mac OS X using the svn version of py3k):
% ~/bin/python3.2 moprocessmoproblems.py
Run: 0
Grabbed item from queue: 0
Grabbed item from queue: 1
Grabbed item from queue: 2
...
Grabbed item from queue: 29
Run: 1

At this point the script produces no additional output. If I uncomment
the lines above then the script produces the expected output. I don't
see any docs that would explain this problem and I don't know what the
rule would be, e.g. "you must join every process that uses a queue before
the queue is garbage collected".


Any ideas why this is happening?

Cheers,
Brian
--
http://mail.python.org/mailman/listinfo/python-list