[issue42968] multiprocessing handle leak on Windows when child process is killed during startup/unpickling

2021-01-20 Thread Eryk Sun


Eryk Sun  added the comment:

I'm not fond of the way reduction.DupHandle() expects the receiving process to 
steal the duplicated handle. I'd prefer using the resource_sharer module, like 
reduction.DupFd() does on POSIX. Spawning is a special case, however, for which 
reduction.DupHandle() can take advantage of the duplicate_for_child() method of 
the popen_spawn_win32.Popen instance.

With the resource sharer, the handle still needs to be duplicated in the 
sending process. But an important difference is the resource_sharer.stop() 
method, which at least allows closing any handles that no longer need to be 
shared.
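
As a rough illustration, here is a single-process toy model of the 
register/claim/stop lifecycle (an assumption-laden sketch: the real 
resource_sharer runs a background server over a socket, and callables here 
stand in for duplicated handles):

```python
# Toy model of the resource_sharer pattern (assumption: single process,
# no listener socket; callables stand in for duplicated handles).
class ToySharer:
    def __init__(self):
        self._cache = {}
        self._key = 0

    def register(self, send, close):
        # Remember how to hand the resource over and how to discard it.
        self._key += 1
        self._cache[self._key] = (send, close)
        return self._key

    def claim(self, key):
        # The receiving process fetched the resource; forget it here.
        send, _close = self._cache.pop(key)
        return send()

    def stop(self):
        # Close every resource that was registered but never claimed,
        # e.g. because the receiving child was killed during startup.
        for _send, close in self._cache.values():
            close()
        self._cache.clear()

closed = []
sharer = ToySharer()
a = sharer.register(lambda: "dup-A", lambda: closed.append("A"))
b = sharer.register(lambda: "dup-B", lambda: closed.append("B"))
assert sharer.claim(a) == "dup-A"   # child A detached its handle
sharer.stop()                       # child B died first; its dup is closed
assert closed == ["B"]
```

The point of stop() is exactly the cleanup hook the current DupHandle lacks: 
unclaimed duplicates get closed instead of leaking in the sending process.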

---

Proposed Changes (untested)

resource_sharer.py:

class DupHandle(object):
    '''Wrapper for a handle that can be used at any time.'''
    def __init__(self, handle):
        dh = reduction.duplicate(handle)
        def send(conn, pid):
            reduction.send_handle(conn, dh, pid)
        def close():
            _winapi.CloseHandle(dh)
        self._id = _resource_sharer.register(send, close)

    def detach(self):
        '''Get the handle. This should only be called once.'''
        with _resource_sharer.get_connection(self._id) as conn:
            return reduction.recv_handle(conn)


reduction.py:

def send_handle(conn, handle, destination_pid):
    '''Send a handle over a local connection.'''
    proc = _winapi.OpenProcess(_winapi.PROCESS_DUP_HANDLE, False,
                               destination_pid)
    try:
        dh = duplicate(handle, proc)
        conn.send(dh)
    finally:
        _winapi.CloseHandle(proc)

def recv_handle(conn):
    '''Receive a handle over a local connection.'''
    return conn.recv()

class _DupHandle:
    def __init__(self, handle):
        self.handle = handle
    def detach(self):
        return self.handle

def DupHandle(handle):
    '''Return a wrapper for a handle.'''
    popen_obj = context.get_spawning_popen()
    if popen_obj is not None:
        return _DupHandle(popen_obj.duplicate_for_child(handle))
    from . import resource_sharer
    return resource_sharer.DupHandle(handle)


connection.py:

def reduce_pipe_connection(conn):
    dh = reduction.DupHandle(conn.fileno())
    return rebuild_pipe_connection, (dh, conn.readable, conn.writable)

def rebuild_pipe_connection(dh, readable, writable):
    return PipeConnection(dh.detach(), readable, writable)

reduction.register(PipeConnection, reduce_pipe_connection)

--
components: +Library (Lib)
nosy: +eryksun
versions: +Python 3.10

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue42968] multiprocessing handle leak on Windows when child process is killed during startup/unpickling

2021-01-19 Thread Daniel Grunwald


Daniel Grunwald  added the comment:

Fix idea: get_spawning_popen().pid could be used to copy the handle directly 
into the child process, avoiding the temporary copy in the main process.
This would help at least in our case (where we pass all connections during 
startup).

I don't know if the general case is solvable -- in general we don't know which 
process will unpickle the data, and "child process is killed" isn't the only 
possible reason why the call to rebuild_pipe_connection() might not happen 
(e.g. exception while unpickling an earlier part of the same message).
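
Why direct duplication helps can be shown with a toy model (an assumption: 
sets standing in for per-process Windows handle tables; the OS frees a 
process's entire table when the process dies, but only its own table):

```python
# Toy model: sets stand in for per-process handle tables. The OS frees a
# dead process's whole table, but never touches another process's table.
main_table = {"conn_write"}
child_table = set()

# Current behavior: duplicate within the main process; the child is
# expected to steal the copy while unpickling (DUPLICATE_CLOSE_SOURCE).
main_table.add("conn_write_dup")
child_table.clear()                      # child killed before unpickling
leaked = "conn_write_dup" in main_table  # the copy survives in main

# Proposed: duplicate straight into the child via get_spawning_popen().pid.
main_table.discard("conn_write_dup")     # (reset the toy)
child_table.add("conn_write_dup")
child_table.clear()                      # child killed; OS frees its table
assert leaked                            # old scheme leaked in main
assert "conn_write_dup" not in main_table  # new scheme leaves nothing behind
```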

--


[issue42968] multiprocessing handle leak on Windows when child process is killed during startup/unpickling

2021-01-19 Thread Daniel Grunwald


New submission from Daniel Grunwald :

Running the attached script deadlocks.
Uncommenting the `time.sleep(1)` in the script makes the deadlock disappear.

For context: our application uses multiple child processes 
(multiprocessing.Process) and uses pipes (multiprocessing.Pipe) to communicate 
with them.
If one process fails with an error, the main process will kill all other child 
processes running concurrently.
We noticed that sometimes (non-deterministically), when an error occurs soon 
after startup, the main process ends up hanging.

We expect that when we pass the writing half of a connection to a child process 
and close it in the main process, we will receive EOFError if the child process 
terminates unexpectedly.
But sometimes the EOFError never comes, and our application hangs.

I've reduced the problem to the script attached. With the reduced script, the 
deadlock happens reliably for me.

I've debugged this a bit, and I think it happens because passing a connection 
to the process being started involves reduce_pipe_connection(), which creates 
a copy of the handle within the main process.
When the pickled data is unpickled in the child process, the child uses 
DUPLICATE_CLOSE_SOURCE to close that copy in the main process.
But if the pickled data is never unpickled by the child process, the handle 
ends up being leaked.
Thus the main process itself holds the writing half of the connection open, 
causing the recv() call on the reading half to block forever.
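
The mechanism can be demonstrated in miniature with POSIX file descriptors 
standing in for Windows handles (an illustrative sketch, not the attached 
script: os.dup plays the role of the duplicated handle that the killed child 
never claims):

```python
import os

r, w = os.pipe()
leaked_dup = os.dup(w)   # like the copy made for the child during pickling
os.close(w)              # main process closes its own write end

# No EOF yet: the leaked duplicate still holds the write side open, so a
# blocking read/recv on r would hang forever. Check without blocking:
os.set_blocking(r, False)
try:
    at_eof = (os.read(r, 1) == b"")
except BlockingIOError:
    at_eof = False
assert not at_eof        # reader is stuck, exactly like the deadlock

os.close(leaked_dup)     # only once the leaked copy is closed...
assert os.read(r, 1) == b""   # ...does the reader finally see EOF
os.close(r)
```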

--
components: Windows
files: deadlock.py
messages: 385283
nosy: dgrunwald, paul.moore, steve.dower, tim.golden, zach.ware
priority: normal
severity: normal
status: open
title: multiprocessing handle leak on Windows when child process is killed 
during startup/unpickling
type: behavior
versions: Python 3.7, Python 3.8, Python 3.9
Added file: https://bugs.python.org/file49751/deadlock.py
