[issue39995] test_concurrent_futures: ProcessPoolSpawnExecutorDeadlockTest.test_crash() fails with OSError: [Errno 9] Bad file descriptor

2021-05-28 Thread Kyle Stanley


Kyle Stanley  added the comment:

Thanks for closing up the issue, Victor :)

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue39995] test_concurrent_futures: ProcessPoolSpawnExecutorDeadlockTest.test_crash() fails with OSError: [Errno 9] Bad file descriptor

2021-05-26 Thread STINNER Victor


STINNER Victor  added the comment:

"test_concurrent_futures: ProcessPoolSpawnExecutorDeadlockTest.test_crash() 
fails with OSError: [Errno 9] Bad file descriptor"

I didn't see this failure recently, I close the issue. Since changes were 
pushed, I mark the issue as fixed.

If someone has ideas to enhance the code, I suggest to open a new more specific 
issue. I consider the initial issue (buildot failure) as fixed.

--
resolution:  -> fixed
stage: patch review -> resolved
status: open -> closed

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue39995] test_concurrent_futures: ProcessPoolSpawnExecutorDeadlockTest.test_crash() fails with OSError: [Errno 9] Bad file descriptor

2020-04-29 Thread Thomas Moreau


Thomas Moreau  added the comment:

I did GH 19788 with a few modifications. There is only one lock that seems to 
mater for the perf, and I actually added one other (the one in _python_exit, 
which necessitate another bug fix for fork context).

I did not benchmark to see if it was worth it in term of perf.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue39995] test_concurrent_futures: ProcessPoolSpawnExecutorDeadlockTest.test_crash() fails with OSError: [Errno 9] Bad file descriptor

2020-04-29 Thread Thomas Moreau


Change by Thomas Moreau :


--
pull_requests: +19111
pull_request: https://github.com/python/cpython/pull/19788

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue39995] test_concurrent_futures: ProcessPoolSpawnExecutorDeadlockTest.test_crash() fails with OSError: [Errno 9] Bad file descriptor

2020-04-29 Thread Thomas Moreau


Thomas Moreau  added the comment:

I think this is a reasonable way to move on.Some of the locks can probably be 
removed but this needs careful investigation and in the mean time, it hinders 
everyone. Thanks victor for the fast fix up! 

To me, an interesting observation is that the failure seems to only happen when 
the executor is in broken state. If we can find a way to adapt the behavior to 
be more conservative on these states (which are not impacting perf) that would 
be nice.

I will try to look at it today and see if I can remove some of the locks while 
still not failing with Victor's patch. We can probably remove the lock on 
`clear` safely. For the others, it might be more complex.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue39995] test_concurrent_futures: ProcessPoolSpawnExecutorDeadlockTest.test_crash() fails with OSError: [Errno 9] Bad file descriptor

2020-04-29 Thread Antoine Pitrou


Antoine Pitrou  added the comment:

I looked at the change and it seemed ok to me. Perhaps Thomas can give it a 
look too.

--
assignee: aeros -> vstinner

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue39995] test_concurrent_futures: ProcessPoolSpawnExecutorDeadlockTest.test_crash() fails with OSError: [Errno 9] Bad file descriptor

2020-04-28 Thread Kyle Stanley


Kyle Stanley  added the comment:

Also, thanks to Victor for coming up with an interim solution, and for the 
feedback from Antoine and Thomas. GH-19760 is a significant improvement from my 
initial proposal in GH-19751.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue39995] test_concurrent_futures: ProcessPoolSpawnExecutorDeadlockTest.test_crash() fails with OSError: [Errno 9] Bad file descriptor

2020-04-28 Thread Kyle Stanley


Kyle Stanley  added the comment:

I am a bit concerned about the performance implication of accessing the thread 
wakeup through a lock in the call queue, but for now I think it's reasonable to 
address the buildbot failure with a locking solution while we try to find a 
better one.

I'm not certain if we'll be able to find one that correctly addresses the 
failures with zero additional locking, but I think we may be able to reduce the 
number of times we use the lock compared to the implementation in GH-19760. 
I'll spend some time exploring that as I find the time to, and report back with 
any significant findings.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue39995] test_concurrent_futures: ProcessPoolSpawnExecutorDeadlockTest.test_crash() fails with OSError: [Errno 9] Bad file descriptor

2020-04-28 Thread Dong-hee Na


Change by Dong-hee Na :


--
nosy: +corona10

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue39995] test_concurrent_futures: ProcessPoolSpawnExecutorDeadlockTest.test_crash() fails with OSError: [Errno 9] Bad file descriptor

2020-04-28 Thread STINNER Victor


STINNER Victor  added the comment:

I'm still getting more and more buildbot emails about test_concurrent_futures, 
so I merged my PR 19760 to fix buildbots.

Please revert or modify my PR 19760 if you have a better approach, but please 
check that test_killed_child() and ProcessPoolForkExecutorDeadlockTest tests 
don't fail with my msg367463 patch.

I would still appreciated a post-commit review of my change, since I don't know 
well concurrent.futures code :

bpo-39995: Fix concurrent.futures _ThreadWakeup (GH-19760)
https://github.com/python/cpython/commit/a4dfe8ede5a37576e17035dccfe109ba7752237e

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue39995] test_concurrent_futures: ProcessPoolSpawnExecutorDeadlockTest.test_crash() fails with OSError: [Errno 9] Bad file descriptor

2020-04-28 Thread STINNER Victor


STINNER Victor  added the comment:


New changeset a4dfe8ede5a37576e17035dccfe109ba7752237e by Victor Stinner in 
branch 'master':
bpo-39995: Fix concurrent.futures _ThreadWakeup (GH-19760)
https://github.com/python/cpython/commit/a4dfe8ede5a37576e17035dccfe109ba7752237e


--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue39995] test_concurrent_futures: ProcessPoolSpawnExecutorDeadlockTest.test_crash() fails with OSError: [Errno 9] Bad file descriptor

2020-04-28 Thread STINNER Victor


STINNER Victor  added the comment:

> With my msg367463 patch (add sleep), test_cancel_futures() fails.

The test uses sleep() as a synchronization primitive:

   executor.submit(time.sleep, .1).

That's bad, but it doesn't *have to* be fixed now. My msg367463 patch adds an 
artifical sleep: the test looks fine in practice. I prefer to wait until it 
fails on a buildbot worker before spending time to make the test more reliable.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue39995] test_concurrent_futures: ProcessPoolSpawnExecutorDeadlockTest.test_crash() fails with OSError: [Errno 9] Bad file descriptor

2020-04-28 Thread STINNER Victor


STINNER Victor  added the comment:

Thomas Moreau: "One solution would be to use the `self._shutdown_lock` from the 
executor to protect the call to `close` in `terminate_broken` and the call to 
`self._thread_wakeup.wakeup` in `shutdown`. That way, the lock is only acquired 
at critical points without being used all the time. This could also be done by 
adding `lock=True/False` to only lock the potentially dangerous calls."

I wrote a conservative PR 19760 which always lock 
ProcessPoolExecutor._shutdown_lock while accessing _ThreadWakeup.

PR 19760 fix test_killed_child(): it doesn't fail anymore, even with my 
msg367463 patch (add sleep).

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue39995] test_concurrent_futures: ProcessPoolSpawnExecutorDeadlockTest.test_crash() fails with OSError: [Errno 9] Bad file descriptor

2020-04-28 Thread STINNER Victor


STINNER Victor  added the comment:

With my msg367463 patch (add sleep), test_cancel_futures() fails. Example:

==
FAIL: test_cancel_futures 
(test.test_concurrent_futures.ProcessPoolForkProcessPoolShutdownTest)
--
Traceback (most recent call last):
  File "/home/vstinner/python/master/Lib/test/test_concurrent_futures.py", line 
353, in test_cancel_futures
self.assertTrue(len(cancelled) >= 35, msg=f"{len(cancelled)=}")
AssertionError: False is not true : len(cancelled)=0

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue39995] test_concurrent_futures: ProcessPoolSpawnExecutorDeadlockTest.test_crash() fails with OSError: [Errno 9] Bad file descriptor

2020-04-28 Thread STINNER Victor


Change by STINNER Victor :


--
pull_requests: +19082
pull_request: https://github.com/python/cpython/pull/19760

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue39995] test_concurrent_futures: ProcessPoolSpawnExecutorDeadlockTest.test_crash() fails with OSError: [Errno 9] Bad file descriptor

2020-04-28 Thread Antoine Pitrou


Antoine Pitrou  added the comment:

With the same traceback and error message?

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue39995] test_concurrent_futures: ProcessPoolSpawnExecutorDeadlockTest.test_crash() fails with OSError: [Errno 9] Bad file descriptor

2020-04-28 Thread STINNER Victor


STINNER Victor  added the comment:

Antoine Pitrou: "How about the following (untested): (...)"

Using Antoine's patch, test_killed_child() still fails (I used my msg367463 
patch to make the failure more likely).

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue39995] test_concurrent_futures: ProcessPoolSpawnExecutorDeadlockTest.test_crash() fails with OSError: [Errno 9] Bad file descriptor

2020-04-28 Thread STINNER Victor


Change by STINNER Victor :


--
pull_requests: +19080
pull_request: https://github.com/python/cpython/pull/19758

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue39995] test_concurrent_futures: ProcessPoolSpawnExecutorDeadlockTest.test_crash() fails with OSError: [Errno 9] Bad file descriptor

2020-04-28 Thread Kyle Stanley


Kyle Stanley  added the comment:

I decided to close PR-19751. Both because it does not correctly address the 
race condition (due to an oversight on my part) and it would add substantial 
overhead to _ThreadWakeup. Instead, I agree that we should explore a 
non-locking solution.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue39995] test_concurrent_futures: ProcessPoolSpawnExecutorDeadlockTest.test_crash() fails with OSError: [Errno 9] Bad file descriptor

2020-04-28 Thread Kyle Stanley


Kyle Stanley  added the comment:

Oops, it seems that I opened PR-19751 a bit preemptively. When I get the 
chance, I'll see if Antoine's implementation can address the failures and do 
some comparisons.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue39995] test_concurrent_futures: ProcessPoolSpawnExecutorDeadlockTest.test_crash() fails with OSError: [Errno 9] Bad file descriptor

2020-04-28 Thread Kyle Stanley


Change by Kyle Stanley :


--
pull_requests: +19072
pull_request: https://github.com/python/cpython/pull/19751

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue39995] test_concurrent_futures: ProcessPoolSpawnExecutorDeadlockTest.test_crash() fails with OSError: [Errno 9] Bad file descriptor

2020-04-28 Thread Antoine Pitrou


Antoine Pitrou  added the comment:

How about the following (untested):

diff --git a/Lib/concurrent/futures/process.py 
b/Lib/concurrent/futures/process.py
index 8e9b69a8f0..c0c2eb3032 100644
--- a/Lib/concurrent/futures/process.py
+++ b/Lib/concurrent/futures/process.py
@@ -66,23 +66,29 @@ _global_shutdown = False
 
 class _ThreadWakeup:
 def __init__(self):
-self._closed = False
 self._reader, self._writer = mp.Pipe(duplex=False)
 
 def close(self):
-if not self._closed:
-self._closed = True
-self._writer.close()
-self._reader.close()
+r, w = self._reader, self._writer
+self._reader = self._writer = None
+if r is not None:
+r.close()
+w.close()
 
 def wakeup(self):
-if not self._closed:
+try:
 self._writer.send_bytes(b"")
+except AttributeError:
+# Closed
+pass
 
 def clear(self):
-if not self._closed:
+try:
 while self._reader.poll():
 self._reader.recv_bytes()
+except AttributeError:
+# Closed
+pass
 
 
 def _python_exit():

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue39995] test_concurrent_futures: ProcessPoolSpawnExecutorDeadlockTest.test_crash() fails with OSError: [Errno 9] Bad file descriptor

2020-04-28 Thread Thomas Moreau


Thomas Moreau  added the comment:

Sorry I just saw this. It seems that I introduced this regression.

One of the goal of having a `ThreadWakeup` and not a `SimpleQueue` is to avoid 
using locks that can hinder the performance of the executor. I don't remember 
the exact details but I think I did some benchmark and it was giving lower 
performances for large set of small tasks (not so sure anymore). If a fully 
synchronous method is chosen, maybe it is safer to rely on a `SimpleQueue` as 
it will be lower maintenance. If the solution proposed by aeros is chosen, the 
event can probably be replaced by a lock no? It would be easier to understand I 
guess.

>From the failures, it seems to be a race condition between `shutdown` and 
>`terminate_broken`. The race condition should not be possible in `submit` as 
>in principle, the `join_executor_internals` is only called when no new task 
>can be submitted.
One solution would be to use the `self._shutdown_lock` from the executor to 
protect the call to `close` in `terminate_broken` and the call to 
`self._thread_wakeup.wakeup` in `shutdown`. That way, the lock is only acquired 
at critical points without being used all the time. This could also be done by 
adding `lock=True/False` to only lock the potentially dangerous calls.

--
nosy: +tomMoral

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue39995] test_concurrent_futures: ProcessPoolSpawnExecutorDeadlockTest.test_crash() fails with OSError: [Errno 9] Bad file descriptor

2020-04-27 Thread Kyle Stanley


Kyle Stanley  added the comment:

After writing the above out and a bit of further consideration, I think it 
might make more sense to wait for the event after setting `self._closed = True` 
so that it prevents future wakeup() and clear() calls from reading/writing to 
the pipe, while still allowing ones that are currently ongoing to finish.

Thoughts?

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue39995] test_concurrent_futures: ProcessPoolSpawnExecutorDeadlockTest.test_crash() fails with OSError: [Errno 9] Bad file descriptor

2020-04-27 Thread Kyle Stanley


Kyle Stanley  added the comment:

I can't be certain for the other failures, but I'm currently exploring a 
potential solution for addressing the `test_killed_child` failure. To me, it 
seems to be a race condition with attempting to call _ThreadWakeup.close() 
while there are still bytes being sent. IMO, we should wait until closing the 
pipe's reader and writer until it's finished sending or receiving bytes. Here's 
one way to implement that w/ threading.Event:

diff --git a/Lib/concurrent/futures/process.py 
b/Lib/concurrent/futures/process.py
index 8e9b69a8f0..9bf073fc34 100644
--- a/Lib/concurrent/futures/process.py
+++ b/Lib/concurrent/futures/process.py
@@ -68,21 +68,30 @@ class _ThreadWakeup:
 def __init__(self):
 self._closed = False
 self._reader, self._writer = mp.Pipe(duplex=False)
+# Used to ensure pipe is not closed while sending or receiving bytes
+self._not_running = threading.Event()
+# Initialize event as True
+self._not_running.set()
 
 def close(self):
 if not self._closed:
+self._not_running.wait()
 self._closed = True
 self._writer.close()
 self._reader.close()
 
 def wakeup(self):
 if not self._closed:
+self._not_running.clear()
 self._writer.send_bytes(b"")
+self._not_running.set()
 
 def clear(self):
 if not self._closed:
+self._not_running.clear()
 while self._reader.poll():
 self._reader.recv_bytes()
+self._not_running.set()


>From using Victor's method of replicating the failure with inserting a 
>`time.sleep(0.050)` in multiprocessing.Connection._send(), it appears to fix 
>the failure in test_killed_child. I believe it would also fix the other 
>failures since they appear to be caused by the same core issue, but I've been 
>unable to replicate them locally so I'm not 100% certain.

--
assignee:  -> aeros

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue39995] test_concurrent_futures: ProcessPoolSpawnExecutorDeadlockTest.test_crash() fails with OSError: [Errno 9] Bad file descriptor

2020-04-27 Thread Kyle Stanley


Change by Kyle Stanley :


--
nosy: +pitrou

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue39995] test_concurrent_futures: ProcessPoolSpawnExecutorDeadlockTest.test_crash() fails with OSError: [Errno 9] Bad file descriptor

2020-04-27 Thread Kyle Stanley


Change by Kyle Stanley :


--
nosy: +aeros

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue39995] test_concurrent_futures: ProcessPoolSpawnExecutorDeadlockTest.test_crash() fails with OSError: [Errno 9] Bad file descriptor

2020-04-27 Thread STINNER Victor


STINNER Victor  added the comment:

It seems like test_killed_child() race condition was introduced by:

commit a5cbab552d294d99fde864306632d7e511a75d3c (refs/bisect/bad)
Author: Thomas Moreau 
Date:   Sun Feb 16 19:09:26 2020 +0100

bpo-39104: Fix hanging ProcessPoolExecutor on shutdown nowait with pickling 
failure (GH-17670)

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue39995] test_concurrent_futures: ProcessPoolSpawnExecutorDeadlockTest.test_crash() fails with OSError: [Errno 9] Bad file descriptor

2020-04-27 Thread STINNER Victor


STINNER Victor  added the comment:

> ERROR: test_killed_child 
> (test.test_concurrent_futures.ProcessPoolSpawnProcessPoolExecutorTest)

The patch below makes this test failure more likely:

diff --git a/Lib/multiprocessing/connection.py 
b/Lib/multiprocessing/connection.py
index 510e4b5aba..63518e55d9 100644
--- a/Lib/multiprocessing/connection.py
+++ b/Lib/multiprocessing/connection.py
@@ -370,6 +370,7 @@ class Connection(_ConnectionBase):
 def _send(self, buf, write=_write):
 remaining = len(buf)
 while True:
+time.sleep(0.050)
 n = write(self._handle, buf)
 remaining -= n
 if remaining == 0:

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue39995] test_concurrent_futures: ProcessPoolSpawnExecutorDeadlockTest.test_crash() fails with OSError: [Errno 9] Bad file descriptor

2020-04-27 Thread STINNER Victor


STINNER Victor  added the comment:

terminate_broken() method was added by:

commit 0e89076247580ba0e570c4816f0e5628a7e36e83
Author: Thomas Moreau 
Date:   Sun Mar 1 21:49:14 2020 +0100

bpo-39678: refactor queue manager thread (GH-18551)

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue39995] test_concurrent_futures: ProcessPoolSpawnExecutorDeadlockTest.test_crash() fails with OSError: [Errno 9] Bad file descriptor

2020-04-27 Thread STINNER Victor


STINNER Victor  added the comment:

> It seems like Connection.close() was called while Connection._send() was 
> called. I added debug logs:

The connection was closed by terminate_broken() called by 
_ExecutorManagerThread.run() thread:

test_killed_child 
(test.test_concurrent_futures.ProcessPoolSpawnProcessPoolExecutorTest) ... 
close handle 4
  File "/home/vstinner/python/master/Lib/threading.py", line 908, in _bootstrap
self._bootstrap_inner()
  File "/home/vstinner/python/master/Lib/threading.py", line 950, in 
_bootstrap_inner
self.run()
  File "/home/vstinner/python/master/Lib/concurrent/futures/process.py", line 
313, in run
self.terminate_broken(cause)
  File "/home/vstinner/python/master/Lib/concurrent/futures/process.py", line 
456, in terminate_broken
self.join_executor_internals()
  File "/home/vstinner/python/master/Lib/concurrent/futures/process.py", line 
503, in join_executor_internals
self.thread_wakeup.close()
  File "/home/vstinner/python/master/Lib/concurrent/futures/process.py", line 
75, in close
self._writer.close()

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue39995] test_concurrent_futures: ProcessPoolSpawnExecutorDeadlockTest.test_crash() fails with OSError: [Errno 9] Bad file descriptor

2020-04-27 Thread STINNER Victor


STINNER Victor  added the comment:

ERROR: test_killed_child 
(test.test_concurrent_futures.ProcessPoolSpawnProcessPoolExecutorTest)
  (...)
  File 
"/home/buildbot/buildarea/3.x.cstratak-fedora-stable-x86_64.clang-installed/build/target/lib/python3.9/multiprocessing/connection.py",
 line 373, in _send
n = write(self._handle, buf)
OSError: [Errno 9] Bad file descriptor

It seems like Connection.close() was called while Connection._send() was 
called. I added debug logs:

* self._handle was equal to 4 at the function entry
* self._handle was equal to None when write() was called

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue39995] test_concurrent_futures: ProcessPoolSpawnExecutorDeadlockTest.test_crash() fails with OSError: [Errno 9] Bad file descriptor

2020-04-27 Thread STINNER Victor


STINNER Victor  added the comment:

x86 Gentoo Installed with X 3.x:
https://buildbot.python.org/all/#/builders/128/builds/726

test_del_shutdown 
(test.test_concurrent_futures.ProcessPoolSpawnProcessPoolShutdownTest) ...

Warning -- Unraisable exception
Exception ignored in: .weakref_cb at 0xb5067898>
Traceback (most recent call last):
  File 
"/buildbot/buildarea/cpython/3.x.ware-gentoo-x86.installed/build/target/lib/python3.9/concurrent/futures/process.py",
 line 281, in weakref_cb
thread_wakeup.wakeup()
  File 
"/buildbot/buildarea/cpython/3.x.ware-gentoo-x86.installed/build/target/lib/python3.9/concurrent/futures/process.py",
 line 80, in wakeup
self._writer.send_bytes(b"")
  File 
"/buildbot/buildarea/cpython/3.x.ware-gentoo-x86.installed/build/target/lib/python3.9/multiprocessing/connection.py",
 line 205, in send_bytes
self._send_bytes(m[offset:offset + size])
  File 
"/buildbot/buildarea/cpython/3.x.ware-gentoo-x86.installed/build/target/lib/python3.9/multiprocessing/connection.py",
 line 416, in _send_bytes
self._send(header + buf)
  File 
"/buildbot/buildarea/cpython/3.x.ware-gentoo-x86.installed/build/target/lib/python3.9/multiprocessing/connection.py",
 line 373, in _send
n = write(self._handle, buf)
OSError: [Errno 9] Bad file descriptor

0.04s ok

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue39995] test_concurrent_futures: ProcessPoolSpawnExecutorDeadlockTest.test_crash() fails with OSError: [Errno 9] Bad file descriptor

2020-04-27 Thread STINNER Victor


STINNER Victor  added the comment:

AMD64 Fedora Stable Clang Installed 3.x:
https://buildbot.python.org/all/#/builders/127/builds/679

0:04:21 load avg: 1.29 [423/423/1] test_concurrent_futures failed (2 min 39 sec)
Warning -- threading_cleanup() failed to cleanup -1 threads (count: 0, 
dangling: 3)
Warning -- Dangling thread: <_MainThread(MainThread, started 139673296918336)>
Warning -- Dangling thread: 
Warning -- Dangling thread: <_ExecutorManagerThread(Thread-145, stopped 
139673053914880)>
Warning -- threading_cleanup() failed to cleanup 0 threads (count: 0, dangling: 
3)
Warning -- Dangling thread: <_MainThread(MainThread, started 139673296918336)>
Warning -- Dangling thread: 
Warning -- Dangling thread: <_ExecutorManagerThread(Thread-145, stopped 
139673053914880)>
/home/buildbot/buildarea/3.x.cstratak-fedora-stable-x86_64.clang-installed/build/target/lib/python3.9/multiprocessing/resource_tracker.py:216:
 UserWarning: resource_tracker: There appear to be 5 leaked semaphore objects 
to clean up at shutdown
  warnings.warn('resource_tracker: There appear to be %d '
Warning -- multiprocessing.process._dangling was modified by 
test_concurrent_futures
Warning -- threading._dangling was modified by test_concurrent_futures
test_cancel (test.test_concurrent_futures.FutureTests) ... ok
test_cancelled (test.test_concurrent_futures.FutureTests) ... ok
test_done (test.test_concurrent_futures.FutureTests) ... ok
(...)
test_first_exception_some_already_complete 
(test.test_concurrent_futures.ThreadPoolWaitTests) ... 1.60s ok
test_pending_calls_race (test.test_concurrent_futures.ThreadPoolWaitTests) ... 
0.11s ok
test_timeout (test.test_concurrent_futures.ThreadPoolWaitTests) ... 6.11s ok
Traceback (most recent call last):
  File 
"/home/buildbot/buildarea/3.x.cstratak-fedora-stable-x86_64.clang-installed/build/target/lib/python3.9/multiprocessing/util.py",
 line 300, in _run_finalizers
finalizer()
  File 
"/home/buildbot/buildarea/3.x.cstratak-fedora-stable-x86_64.clang-installed/build/target/lib/python3.9/multiprocessing/util.py",
 line 224, in __call__
res = self._callback(*self._args, **self._kwargs)
  File 
"/home/buildbot/buildarea/3.x.cstratak-fedora-stable-x86_64.clang-installed/build/target/lib/python3.9/multiprocessing/synchronize.py",
 line 87, in _cleanup
sem_unlink(name)
FileNotFoundError: [Errno 2] No such file or directory
Traceback (most recent call last):
  File 
"/home/buildbot/buildarea/3.x.cstratak-fedora-stable-x86_64.clang-installed/build/target/lib/python3.9/multiprocessing/util.py",
 line 300, in _run_finalizers
finalizer()
  File 
"/home/buildbot/buildarea/3.x.cstratak-fedora-stable-x86_64.clang-installed/build/target/lib/python3.9/multiprocessing/util.py",
 line 224, in __call__
res = self._callback(*self._args, **self._kwargs)
  File 
"/home/buildbot/buildarea/3.x.cstratak-fedora-stable-x86_64.clang-installed/build/target/lib/python3.9/multiprocessing/synchronize.py",
 line 87, in _cleanup
sem_unlink(name)
FileNotFoundError: [Errno 2] No such file or directory
Traceback (most recent call last):
  File 
"/home/buildbot/buildarea/3.x.cstratak-fedora-stable-x86_64.clang-installed/build/target/lib/python3.9/multiprocessing/util.py",
 line 300, in _run_finalizers
finalizer()
  File 
"/home/buildbot/buildarea/3.x.cstratak-fedora-stable-x86_64.clang-installed/build/target/lib/python3.9/multiprocessing/util.py",
 line 224, in __call__
res = self._callback(*self._args, **self._kwargs)
  File 
"/home/buildbot/buildarea/3.x.cstratak-fedora-stable-x86_64.clang-installed/build/target/lib/python3.9/multiprocessing/synchronize.py",
 line 87, in _cleanup
sem_unlink(name)
FileNotFoundError: [Errno 2] No such file or directory
Traceback (most recent call last):
  File 
"/home/buildbot/buildarea/3.x.cstratak-fedora-stable-x86_64.clang-installed/build/target/lib/python3.9/multiprocessing/util.py",
 line 300, in _run_finalizers
finalizer()
  File 
"/home/buildbot/buildarea/3.x.cstratak-fedora-stable-x86_64.clang-installed/build/target/lib/python3.9/multiprocessing/util.py",
 line 224, in __call__
res = self._callback(*self._args, **self._kwargs)
  File 
"/home/buildbot/buildarea/3.x.cstratak-fedora-stable-x86_64.clang-installed/build/target/lib/python3.9/multiprocessing/synchronize.py",
 line 87, in _cleanup
sem_unlink(name)
FileNotFoundError: [Errno 2] No such file or directory
Traceback (most recent call last):
  File 
"/home/buildbot/buildarea/3.x.cstratak-fedora-stable-x86_64.clang-installed/build/target/lib/python3.9/multiprocessing/util.py",
 line 300, in _run_finalizers
finalizer()
  File 
"/home/buildbot/buildarea/3.x.cstratak-fedora-stable-x86_64.clang-installed/build/target/lib/python3.9/multiprocessing/util.py",
 line 224, in __call__
res = self._callback(*self._args, **self._kwargs)
  File 
"/home/buildbot/buildarea/3.x.cstratak-fedora-stable-x86_64.clang-installed/build/target/lib/python3.9/multiprocessing/synchronize.py",
 line 87, 

[issue39995] test_concurrent_futures: ProcessPoolSpawnExecutorDeadlockTest.test_crash() fails with OSError: [Errno 9] Bad file descriptor

2020-04-27 Thread STINNER Victor


STINNER Victor  added the comment:

> See also bpo-30966 "Add multiprocessing.SimpleQueue.close()".

I pushed a commit 1a275013d1ecc2e3778d64fda86174b2f13d6969: 
"Process.shutdown(wait=True) of concurrent.futures now closes
explicitly the result queue."

test_shutdown_deadlock_pickle() still rely on the queue to be closed 
implicitly. Queue created at:

  (...)
  File "/home/vstinner/python/master/Lib/test/test_concurrent_futures.py", 
lineno 1196
with self.executor_type(max_workers=2,
  File "/home/vstinner/python/master/Lib/concurrent/futures/process.py", lineno 
637
self._result_queue = mp_context.SimpleQueue()
  File "/home/vstinner/python/master/Lib/multiprocessing/context.py", lineno 113
return SimpleQueue(ctx=self.get_context())
  File "/home/vstinner/python/master/Lib/multiprocessing/queues.py", lineno 341
self._reader, self._writer = connection.Pipe(duplex=False)
  File "/home/vstinner/python/master/Lib/multiprocessing/connection.py", lineno 
539
c2 = Connection(fd2, readable=False)

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue39995] test_concurrent_futures: ProcessPoolSpawnExecutorDeadlockTest.test_crash() fails with OSError: [Errno 9] Bad file descriptor

2020-04-27 Thread STINNER Victor


STINNER Victor  added the comment:

AMD64 Ubuntu Shared 3.x:
https://buildbot.python.org/all/#/builders/101/builds/809

==
ERROR: test_shutdown_no_wait 
(test.test_concurrent_futures.ProcessPoolForkserverProcessPoolShutdownTest)
--
Traceback (most recent call last):
  File 
"/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/test/test_concurrent_futures.py",
 line 542, in test_shutdown_no_wait
executor.shutdown(wait=False)
  File 
"/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/concurrent/futures/process.py",
 line 724, in shutdown
self._executor_manager_thread_wakeup.wakeup()
  File 
"/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/concurrent/futures/process.py",
 line 80, in wakeup
self._writer.send_bytes(b"")
  File 
"/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/multiprocessing/connection.py",
 line 188, in send_bytes
self._check_closed()
  File 
"/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/multiprocessing/connection.py",
 line 141, in _check_closed
raise OSError("handle is closed")
OSError: handle is closed

(...)
0:32:37 load avg: 1.64 Re-running test_concurrent_futures in verbose mode
(...)

==
ERROR: test_shutdown_no_wait 
(test.test_concurrent_futures.ProcessPoolForkProcessPoolShutdownTest)
--
Traceback (most recent call last):
  File 
"/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/test/test_concurrent_futures.py",
 line 542, in test_shutdown_no_wait
executor.shutdown(wait=False)
  File 
"/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/concurrent/futures/process.py",
 line 724, in shutdown
self._executor_manager_thread_wakeup.wakeup()
  File 
"/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/concurrent/futures/process.py",
 line 80, in wakeup
self._writer.send_bytes(b"")
  File 
"/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/multiprocessing/connection.py",
 line 205, in send_bytes
self._send_bytes(m[offset:offset + size])
  File 
"/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/multiprocessing/connection.py",
 line 416, in _send_bytes
self._send(header + buf)
  File 
"/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/multiprocessing/connection.py",
 line 373, in _send
n = write(self._handle, buf)
OSError: [Errno 9] Bad file descriptor

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue39995] test_concurrent_futures: ProcessPoolSpawnExecutorDeadlockTest.test_crash() fails with OSError: [Errno 9] Bad file descriptor

2020-04-27 Thread STINNER Victor


STINNER Victor  added the comment:


New changeset 5d1f32d33ba24d0aa87235ae40207bb57778388b by Victor Stinner in 
branch 'master':
bpo-39995: Split test_concurrent_futures.test_crash() into sub-tests (GH-19739)
https://github.com/python/cpython/commit/5d1f32d33ba24d0aa87235ae40207bb57778388b


--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue39995] test_concurrent_futures: ProcessPoolSpawnExecutorDeadlockTest.test_crash() fails with OSError: [Errno 9] Bad file descriptor

2020-04-27 Thread STINNER Victor


Change by STINNER Victor :


--
keywords: +patch
pull_requests: +19061
stage:  -> patch review
pull_request: https://github.com/python/cpython/pull/19739

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue39995] test_concurrent_futures: ProcessPoolSpawnExecutorDeadlockTest.test_crash() fails with OSError: [Errno 9] Bad file descriptor

2020-04-27 Thread STINNER Victor


STINNER Victor  added the comment:

See also bpo-30966 "Add multiprocessing.SimpleQueue.close()".

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue39995] test_concurrent_futures: ProcessPoolSpawnExecutorDeadlockTest.test_crash() fails with OSError: [Errno 9] Bad file descriptor

2020-03-18 Thread STINNER Victor


STINNER Victor  added the comment:

> Same bug on AMD64 FreeBSD Non-Debug 3.x:
> https://buildbot.python.org/all/#/builders/214/builds/472

Oh, test_crash failed twice, but not on the same test case:

* ERROR: test_crash 
(test.test_concurrent_futures.ProcessPoolSpawnExecutorDeadlockTest) [crash 
during func execution on worker]
* ERROR: test_crash 
(test.test_concurrent_futures.ProcessPoolForkExecutorDeadlockTest) [crash 
during func execution on worker]

The second failure was when test_concurrent_futures was re-run sequentially.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue39995] test_concurrent_futures: ProcessPoolSpawnExecutorDeadlockTest.test_crash() fails with OSError: [Errno 9] Bad file descriptor

2020-03-18 Thread STINNER Victor


STINNER Victor  added the comment:

Same bug on AMD64 FreeBSD Non-Debug 3.x:
https://buildbot.python.org/all/#/builders/214/builds/472

==
ERROR: test_crash 
(test.test_concurrent_futures.ProcessPoolSpawnExecutorDeadlockTest) [crash 
during func execution on worker]
--
Traceback (most recent call last):
  File 
"/usr/home/buildbot/python/3.x.koobs-freebsd-9e36.nondebug/build/Lib/test/test_concurrent_futures.py",
 line 1119, in test_crash
executor.shutdown(wait=True)
  File 
"/usr/home/buildbot/python/3.x.koobs-freebsd-9e36.nondebug/build/Lib/concurrent/futures/process.py",
 line 721, in shutdown
self._executor_manager_thread_wakeup.wakeup()
  File 
"/usr/home/buildbot/python/3.x.koobs-freebsd-9e36.nondebug/build/Lib/concurrent/futures/process.py",
 line 93, in wakeup
self._writer.send_bytes(b"")
  File 
"/usr/home/buildbot/python/3.x.koobs-freebsd-9e36.nondebug/build/Lib/multiprocessing/connection.py",
 line 205, in send_bytes
self._send_bytes(m[offset:offset + size])
  File 
"/usr/home/buildbot/python/3.x.koobs-freebsd-9e36.nondebug/build/Lib/multiprocessing/connection.py",
 line 416, in _send_bytes
self._send(header + buf)
  File 
"/usr/home/buildbot/python/3.x.koobs-freebsd-9e36.nondebug/build/Lib/multiprocessing/connection.py",
 line 373, in _send
n = write(self._handle, buf)
OSError: [Errno 9] Bad file descriptor

Stdout:
8.35s 

Stderr:
Warning -- threading_cleanup() failed to cleanup 0 threads (count: 0, dangling: 
3)
Dangling thread: <_MainThread(MainThread, started 34369908736)>
Dangling thread: <_ExecutorManagerThread(Thread-114, stopped daemon 
34396434432)>
Dangling thread: 

--

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue39995] test_concurrent_futures: ProcessPoolSpawnExecutorDeadlockTest.test_crash() fails with OSError: [Errno 9] Bad file descriptor

2020-03-17 Thread STINNER Victor


Change by STINNER Victor :


--
title: test_concurrent_futures: 
ProcessPoolSpawnExecutorDeadlockTest.test_crash() fails with -> 
test_concurrent_futures: ProcessPoolSpawnExecutorDeadlockTest.test_crash() 
fails with OSError: [Errno 9] Bad file descriptor

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com