[issue34781] infinite waiting in multiprocessing.Pool

2022-01-27 Thread Antonio Vázquez Blanco

Change by Antonio Vázquez Blanco :


--
nosy: +antonio.vazquez

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34781] infinite waiting in multiprocessing.Pool

2021-10-17 Thread Myles Steinhauser


Change by Myles Steinhauser :


--
nosy: +myles.steinhauser




[issue34781] infinite waiting in multiprocessing.Pool

2019-02-25 Thread Dongyan Li


Dongyan Li  added the comment:

I hit the same issue with Python 3.7.2 on Windows (build 14393). The program
seems to get stuck in the `waiter.acquire()` call.

--
nosy: +Dongyan Li




[issue34781] infinite waiting in multiprocessing.Pool

2018-09-26 Thread Tomáš Bouda

Tomáš Bouda  added the comment:

By now I have spent several days trying to reproduce the behaviour in the
production environment with a debugger attached, without success. On the other
hand, the application froze again yesterday, and today a colleague hit the
problem in his script, too (this is on RHEL).

Dealing with this kind of problem is always very frustrating.

By now, I agree with @pitrou that OSX/RHEL could be two different problems. I
also tried the approach by @calimeroteknik, and it would actually make sense.

If the child process receives a signal (SIGTERM or SIGSEGV), the parent waits
forever. We do call 3rd-party libraries, so a segfault is indeed possible. I
tried sending a signal to a child and the script really froze. So far, this
seems to be the most probable explanation.
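
To illustrate the signal scenario, here is a minimal sketch (not one of the
scripts attached to this issue) that kills a child process with SIGKILL and
reads how it died from `Process.exitcode`. It assumes a POSIX system and the
"fork" start method; the helper names `_sleep_forever` and
`exit_code_after_kill` are made up for this example.

```python
import multiprocessing as mp
import os
import signal
import time


def _sleep_forever():
    # Hypothetical stand-in for a worker that is busy when the signal arrives.
    time.sleep(60)


def exit_code_after_kill(sig=signal.SIGKILL):
    """Start a child, kill it with `sig`, and return its exit code."""
    ctx = mp.get_context("fork")  # assumption: POSIX with fork available
    child = ctx.Process(target=_sleep_forever)
    child.start()
    os.kill(child.pid, sig)       # simulate a crash from outside the child
    child.join(timeout=10)        # bounded wait instead of waiting forever
    return child.exitcode         # negative value -N: terminated by signal N


if __name__ == "__main__":
    print(exit_code_after_kill())
```

A negative exit code means the child was terminated by that signal. A parent
that only waits for a task result never looks at this value, which would be
consistent with the freeze described above.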

The OSX debugger may also be buggy: yesterday I completely broke my system just
by trying my original script, which led to regular segfaults and a system
restart (this never happened before).

Since I can't reproduce the problem under controlled conditions, I am ok with 
closing this bug. The script by @calimeroteknik seems to be pointing in the 
same direction and I think this may solve our problem, too.

--




[issue34781] infinite waiting in multiprocessing.Pool

2018-09-26 Thread Antoine Pitrou


Antoine Pitrou  added the comment:

@calimeroteknik, this doesn't seem to have anything to do with the issue at 
hand.  Please open a separate issue with your scripts.

--




[issue34781] infinite waiting in multiprocessing.Pool

2018-09-26 Thread Antoine Pitrou


Change by Antoine Pitrou :


Removed file: https://bugs.python.org/file47827/hang.py




[issue34781] infinite waiting in multiprocessing.Pool

2018-09-26 Thread Antoine Pitrou


Change by Antoine Pitrou :


Removed file: https://bugs.python.org/file47828/except-out.py




[issue34781] infinite waiting in multiprocessing.Pool

2018-09-26 Thread calimeroteknik


calimeroteknik  added the comment:

Attaching the version that randomly raises ChildProcessError: [Errno 10] No 
child processes.
The child process is lost in limbo if we don't sleep/print after creating it.

--
Added file: https://bugs.python.org/file47828/except-out.py




[issue34781] infinite waiting in multiprocessing.Pool

2018-09-26 Thread calimeroteknik


calimeroteknik  added the comment:

A friend has found a very simple example that triggers such an issue very
reproducibly.

Attached are two versions, including one where the child process mysteriously
disappears in the CPython interpreter.

PyPy is unaffected.

--
nosy: +calimeroteknik
Added file: https://bugs.python.org/file47827/hang.py




[issue34781] infinite waiting in multiprocessing.Pool

2018-09-26 Thread Antoine Pitrou


Antoine Pitrou  added the comment:

If I understand correctly you have two cases:
- the standalone script hangs with 3.6 on OS X
- a much more involved use case hangs with 2.7 on RedHat

It's possible you are hitting two different bugs.  The 2.7 issue may be due to 
third-party packages or other issues.

It would be nice if another OS X user could try your reproducer script on 3.6+ 
and find out whether they can reproduce/diagnose the issue at all.

--
nosy: +ned.deily, ronaldoussoren




[issue34781] infinite waiting in multiprocessing.Pool

2018-09-25 Thread Tomáš Bouda

Tomáš Bouda  added the comment:

Oh, I should add that after decreasing the number of workers to 4 or 8 the
problem disappeared, at least to the extent that I could no longer reproduce it
in any environment.

--




[issue34781] infinite waiting in multiprocessing.Pool

2018-09-25 Thread Tomáš Bouda

Tomáš Bouda  added the comment:

It's very difficult to reproduce.

In this example, I need to attach a debugger to get it stuck on 3.6/OSX.
However, the freeze happens regardless of print/logging; even def f(): pass can
get stuck. I've just tried os.write(): it made no difference and froze as well.

There might even be another problem in the debugger itself. However, on
2.7/RHEL the actual (production) code is embedded in unittest with "discover"
mode and run from the shell, with no debugger attached.

I couldn't reproduce it this morning. Later in the afternoon it occurred twice
in a row in another script. Over the past months, I have seen the problem under
various conditions. Originally I used ProcessPoolExecutor, where it happened
rather often, so I rewrote the code to use Pool directly. The problem is rare
now, but it still occurs.

The production code has many variants (logging, custom prints, no prints at
all), and from time to time the freeze happens regardless. However, there is
also a heavy load and high demand on OS resources (tens of workers, tens of GBs
read/allocated, many explicit calls to GC, etc.).
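
For a freeze that is this hard to catch, one possible diagnostic is a watchdog
that dumps the Python tracebacks of all threads when a call takes too long. The
sketch below uses the standard `faulthandler` module (Python 3.3+, so it does
not apply to the 2.7 setup); it was not used in this thread, and the helper
name `dump_if_stuck` and the timeout values are assumptions for illustration.

```python
import faulthandler
import tempfile
import time


def dump_if_stuck(work, timeout=1.0):
    """Run work(); if it is still running after `timeout` seconds, write the
    tracebacks of all threads to a temporary file.

    Returns the dumped text, or '' if work() finished in time.
    """
    with tempfile.TemporaryFile(mode="w+") as log:
        # Arm a watchdog that writes the tracebacks to `log` after `timeout`.
        faulthandler.dump_traceback_later(timeout, file=log)
        try:
            work()
        finally:
            # Disarm the watchdog before the file is closed.
            faulthandler.cancel_dump_traceback_later()
        log.seek(0)
        return log.read()


if __name__ == "__main__":
    # Hypothetical slow call standing in for a pool.map() that hangs.
    print(dump_if_stuck(lambda: time.sleep(0.5), timeout=0.1))
```

In a real run, `work` would wrap the call that freezes, and the dump would show
which line the parent process is blocked on.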

--




[issue34781] infinite waiting in multiprocessing.Pool

2018-09-25 Thread Antoine Pitrou


Antoine Pitrou  added the comment:

I couldn't reproduce with Python 3.6.5 on Ubuntu 18.04.

Does it happen if you reduce logging?  Or if you replace f() with:

import os

def f(i):
    os.write(1, "{}\n".format(i).encode())  # unbuffered write to stdout (fd 1)

--




[issue34781] infinite waiting in multiprocessing.Pool

2018-09-24 Thread Brett Cannon


Change by Brett Cannon :


--
nosy: +davin, pitrou




[issue34781] infinite waiting in multiprocessing.Pool

2018-09-23 Thread Tomáš Bouda

Tomáš Bouda  added the comment:

After more digging, I found that the following happens:

popen_fork.py -> _launch(self, process_obj) -> self.pid = os.fork()

When I let each process (both child and parent) print the resulting pid, in a
frozen run I can see:
a) pid > 0 printed 50 times
b) pid == 0 printed 49 times

That means the parent is aware of 50 children, while only 49 of them get to the
next line. I am not sure whether the remaining process crashes with a segfault,
but the parent apparently hangs later in os.waitpid() on the valid pid of the
missing child.
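
The two branches of os.fork() and a wait that cannot block forever can be
sketched as follows. This is an illustration only, not the code in
popen_fork.py: the function name `fork_and_wait`, the child's exit code 7, and
the timeout values are assumptions, and it needs a POSIX system.

```python
import os
import time


def fork_and_wait(timeout=5.0, poll=0.05):
    """Fork one child and wait for it with a deadline.

    Returns the child's exit code, the negated signal number if it was killed
    by a signal, or None if it did not finish within `timeout` seconds.
    """
    pid = os.fork()
    if pid == 0:
        # Child branch: fork() returned 0. Exit at once with a marker code.
        os._exit(7)

    # Parent branch: fork() returned the child's pid (> 0).
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        # WNOHANG makes waitpid() return (0, 0) while the child is still
        # running, instead of blocking until it exits.
        done, status = os.waitpid(pid, os.WNOHANG)
        if done == pid:
            if os.WIFSIGNALED(status):
                return -os.WTERMSIG(status)
            return os.WEXITSTATUS(status)
        time.sleep(poll)
    return None


if __name__ == "__main__":
    print(fork_and_wait())
```

A plain os.waitpid(pid, 0) would block until the child changes state, so
polling with a deadline lets the parent notice and report a child that never
finishes.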

--




[issue34781] infinite waiting in multiprocessing.Pool

2018-09-23 Thread Tomáš Bouda

New submission from Tomáš Bouda :

I have encountered a possible bug inside multiprocessing.Pool. It behaves like
a race condition, although I don't believe it is a typical one.

Simply put, Pool freezes from time to time. It is occasional and hard to
reproduce, but e.g. unit tests that run 3 times a day freeze several times a
week.

We use Pool heavily in our applications, usually with tens of workers and a
heavy load on each of them. The production environment uses Python 2.7 (RHEL,
custom build, etc.). However, I reproduced the same behavior with Python 3.6
(OSX) on my local machine.

When I run the following script about 20 times, I get one or two frozen
instances. You may notice in the output that ForkPoolWorker-42 never calls
self.run(). The application then freezes as-is, since it is probably waiting
for the process.

It is easier to reproduce the behavior using a debugger (PyCharm-Pro in my
case). In our production environment, however, there is just a clean run, and
the bug occurs more often there since multiprocessing is used quite a lot.

Thanks,
Tomas


--- My script:

import logging
from multiprocessing.pool import Pool
from multiprocessing.util import log_to_stderr

def f(i):
    print(i)

log_to_stderr(logging.DEBUG)

pool = Pool(50)
pool.map(f, range(2))
pool.close()
pool.join()

--- Output:

[DEBUG/MainProcess] created semlock with handle 9
[DEBUG/MainProcess] created semlock with handle 10
[DEBUG/MainProcess] created semlock with handle 13
[DEBUG/MainProcess] created semlock with handle 14
[DEBUG/MainProcess] added worker
[DEBUG/MainProcess] added worker
[DEBUG/MainProcess] added worker
[DEBUG/MainProcess] added worker
[INFO/ForkPoolWorker-1] child process calling self.run()
[DEBUG/MainProcess] added worker
[INFO/ForkPoolWorker-2] child process calling self.run()
[DEBUG/MainProcess] added worker
[DEBUG/MainProcess] added worker
[INFO/ForkPoolWorker-3] child process calling self.run()
[DEBUG/MainProcess] added worker
[DEBUG/MainProcess] added worker
[INFO/ForkPoolWorker-4] child process calling self.run()
[DEBUG/MainProcess] added worker
[DEBUG/MainProcess] added worker
[DEBUG/MainProcess] added worker
[DEBUG/MainProcess] added worker
[INFO/ForkPoolWorker-5] child process calling self.run()
[DEBUG/MainProcess] added worker
[INFO/ForkPoolWorker-6] child process calling self.run()
[DEBUG/MainProcess] added worker
[INFO/ForkPoolWorker-7] child process calling self.run()
[INFO/ForkPoolWorker-9] child process calling self.run()
[INFO/ForkPoolWorker-10] child process calling self.run()
[INFO/ForkPoolWorker-8] child process calling self.run()
[INFO/ForkPoolWorker-12] child process calling self.run()
[INFO/ForkPoolWorker-13] child process calling self.run()
[INFO/ForkPoolWorker-11] child process calling self.run()
[DEBUG/MainProcess] added worker
[INFO/ForkPoolWorker-14] child process calling self.run()
[DEBUG/MainProcess] added worker
[INFO/ForkPoolWorker-15] child process calling self.run()
[DEBUG/MainProcess] added worker
[DEBUG/MainProcess] added worker
[DEBUG/MainProcess] added worker
[DEBUG/MainProcess] added worker
[DEBUG/MainProcess] added worker
[DEBUG/MainProcess] added worker
[DEBUG/MainProcess] added worker
[DEBUG/MainProcess] added worker
[INFO/ForkPoolWorker-16] child process calling self.run()
[INFO/ForkPoolWorker-17] child process calling self.run()
[INFO/ForkPoolWorker-18] child process calling self.run()
[DEBUG/MainProcess] added worker
[INFO/ForkPoolWorker-19] child process calling self.run()
[INFO/ForkPoolWorker-20] child process calling self.run()
[DEBUG/MainProcess] added worker
[INFO/ForkPoolWorker-21] child process calling self.run()
[DEBUG/MainProcess] added worker
[DEBUG/MainProcess] added worker
[INFO/ForkPoolWorker-22] child process calling self.run()
[DEBUG/MainProcess] added worker
[INFO/ForkPoolWorker-23] child process calling self.run()
[DEBUG/MainProcess] added worker
[INFO/ForkPoolWorker-24] child process calling self.run()
[INFO/ForkPoolWorker-25] child process calling self.run()
[INFO/ForkPoolWorker-26] child process calling self.run()
[INFO/ForkPoolWorker-27] child process calling self.run()
[INFO/ForkPoolWorker-28] child process calling self.run()
[DEBUG/MainProcess] added worker
[INFO/ForkPoolWorker-29] child process calling self.run()
[DEBUG/MainProcess] added worker
[INFO/ForkPoolWorker-30] child process calling self.run()
[INFO/ForkPoolWorker-31] child process calling self.run()
[DEBUG/MainProcess] added worker
[DEBUG/MainProcess] added worker
[DEBUG/MainProcess] added worker
[INFO/ForkPoolWorker-32] child process calling self.run()
[DEBUG/MainProcess] added worker
[DEBUG/MainProcess] added worker
[INFO/ForkPoolWorker-33] child process calling self.run()
[INFO/ForkPoolWorker-34] child process calling self.run()
[INFO/ForkPoolWorker-35] child process calling self.run()
[INFO/ForkPoolWorker-36] child process calling self.run()
[INFO/ForkPoolWorker-37] child process calling self.run()
[INFO/ForkPoolWorker-38] child process calling self.run()
[DEBUG/MainProcess]