New submission from STINNER Victor <[email protected]>:
On Windows, the communicate() method of subprocess.Popen is implemented with
threads calling:
    def _readerthread(self, fh, buffer):
        buffer.append(fh.read())
        fh.close()
where fh is one of the Popen pipes: popen.stdout or popen.stderr.
For stdout=PIPE and/or stderr=PIPE, Popen._get_handles() creates a pipe with
_winapi.CreatePipe(None, 0).
The fh.read() call completes only when the write end of the pipe is closed,
that is, when every open handle to the write end of the pipe has been closed.
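The end-of-file rule above can be illustrated without spawning any process. This is only a sketch: it uses os.pipe() and os.dup() rather than _winapi.CreatePipe(), and the duplicated descriptor w2 is a stand-in (an assumption of this example) for a write handle inherited by another process. The helper name read_all is hypothetical.

```python
import os
import threading

def read_all(fd, out):
    # Same shape as _readerthread: one blocking read() until end of file.
    with os.fdopen(fd, "rb") as fh:
        out.append(fh.read())

r, w = os.pipe()
w2 = os.dup(w)  # second handle to the write end (stand-in for an inherited one)

result = []
t = threading.Thread(target=read_all, args=(r, result), daemon=True)
t.start()

os.write(w, b"data")
os.close(w)                     # the "child" closes its handle...
t.join(timeout=0.5)
still_blocked = t.is_alive()    # ...but read() has not returned: w2 is open

os.close(w2)                    # the last write handle is closed
t.join(timeout=5)
finished = not t.is_alive()     # read() saw end of file and returned
```

Closing the first handle is not enough; read() only returns once w2 is closed as well.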
When using Popen(stdout=PIPE), Python uses the pipe as stdout:
    startupinfo.dwFlags |= _winapi.STARTF_USESTDHANDLES
    startupinfo.hStdInput = p2cread
    startupinfo.hStdOutput = c2pwrite
    startupinfo.hStdError = errwrite
So far so good, it works well when Python spawns a single process.
--
Problems arise when the child process itself spawns new processes. In that
case, the stdout pipe is inherited by other processes.
Popen.communicate() will block until all processes close the pipe.
This behavior is surprising for the following pattern:
------------------
try:
    stdout, stderr = popen.communicate(timeout=5.0)
except subprocess.TimeoutExpired:
    popen.kill()
    stdout, stderr = popen.communicate()
------------------
I would expect the second communicate() call to complete immediately since
we just killed the process: Windows knows that the child process is dead, and
Python knows that the Popen object has been killed, but fh.read() continues to
block until all processes have closed the pipe.
The problem is that fh.read() returns all of the data at once, and only at end
of file. If the fh.read() call were cancelled somehow, communicate() would just
return empty data: we would lose all data.
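One way to avoid losing data is to read in chunks, so that whatever has already arrived stays in the buffer even if the reader never sees end of file. The sketch below is not the subprocess implementation: chunked_reader is a hypothetical replacement for _readerthread, and it uses os.read() on an os.pipe() descriptor instead of the Windows API.

```python
import os
import threading
import time

def chunked_reader(fd, chunks, chunk_size=4096):
    # Hypothetical reader: append each chunk as soon as it arrives.
    while True:
        data = os.read(fd, chunk_size)
        if not data:            # empty read means end of file
            break
        chunks.append(data)
    os.close(fd)

r, w = os.pipe()
chunks = []
t = threading.Thread(target=chunked_reader, args=(r, chunks), daemon=True)
t.start()

os.write(w, b"partial output")

# Wait (bounded) until the reader has stored the first chunk.
deadline = time.monotonic() + 5
while not chunks and time.monotonic() < deadline:
    time.sleep(0.01)

partial = b"".join(chunks)      # already usable
reader_blocked = t.is_alive()   # reader is still waiting for end of file

os.close(w)                     # let the reader finish
t.join(timeout=5)
```

Here the partial output is available to the caller while the reader thread is still blocked, which a single fh.read() cannot offer.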
--
Eryk Sun proposed two solutions:
https://bugs.python.org/issue37531#msg350246
"For Windows, subprocess could have a _read_all(file) method that special cases
a pipe. The read loop for a pipe would check whether the child has exited. Then
call _winapi.PeekNamedPipe on the handle (from get_osfhandle), and do a raw
read of the available bytes. If the child has exited or PeekNamedPipe fails
(EPIPE), then break, join the partial reads, decode and translate newlines if
it's text mode, and return."
and:
https://bugs.python.org/issue37531#msg350329
"Alternatively, instead of special casing the file type and spinning on
PeekNamedPipe, the workaround could be based on a multiple-object wait that
includes the child process handle. In this case, communicate() would always
call _communicate in Windows, regardless of the timeout or number of pipes --
because simplistically calling either self.stdout.read() or self.stderr.read()
could hang.
The implementation would be moderately complicated. If we stop waiting on the
reader threads because the process exited, we can give the threads a short time
to finish reading and close the files -- maybe 250 ms is enough. But if they
haven't exited at this time, we can't simply raise a TimeoutExpired exception
if the call hasn't actually timed out. To finish the _communicate call, we
would have to cancel the pending reads and join() the threads.
We can force a reader thread to exit by canceling the read via WINAPI
CancelIoEx. However, _readerthread has to be modified to support this. It could
be implemented as a loop that calls _winapi.ReadFile to read the output in
chunks that get appended to the buffer list. The read loop would break for
either ERROR_BROKEN_PIPE or ERROR_OPERATION_ABORTED (canceled).
The final step in _communicate would be to concatenate the partial reads. If
it's text mode, it would also have to decode the bytes and translate newlines.
The POSIX implementation of _communicate has to do this, so we already have a
_translate_newlines method for this case.
Note that _winapi.WaitForMultipleObjects is interruptible in the main thread
via Ctrl+C, which is a bonus improvement since Thread.join() can't be
interrupted in Windows."
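The final step described in the quote (concatenate the partial reads, then decode and translate newlines in text mode) could look roughly like this. It is only a sketch: join_partial_reads and its parameters are hypothetical names, not part of subprocess, and the newline handling is assumed to match what _translate_newlines does.

```python
def join_partial_reads(chunks, text_mode=False, encoding="utf-8",
                       errors="strict"):
    # Join first, so that a "\r\n" split across two chunks is still
    # translated as a single newline.
    data = b"".join(chunks)
    if not text_mode:
        return data
    text = data.decode(encoding, errors)
    return text.replace("\r\n", "\n").replace("\r", "\n")

# Example chunks; the first "\r\n" is deliberately split across two reads.
chunks = [b"line 1\r", b"\nline 2\r\n", b"tail"]
raw = join_partial_reads(chunks)
text = join_partial_reads(chunks, text_mode=True)
```

Joining before decoding also avoids decoding errors when a multi-byte character is split between two chunks.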
--
This issue has been discussed in bpo-37531 about regrtest:
https://bugs.python.org/issue37531#msg350181
It affects the Python test suite when a test spawns new processes, for example
by using multiprocessing.
----------
components: Library (Lib)
messages: 352673
nosy: vstinner
priority: normal
severity: normal
status: open
title: subprocess: On Windows, Popen.kill() + Popen.communicate() is blocking
until all processes using the pipe close the pipe
versions: Python 3.9
_______________________________________
Python tracker <[email protected]>
<https://bugs.python.org/issue38207>
_______________________________________
_______________________________________________
Python-bugs-list mailing list