Common pattern I've used is to wait a bit, then send a kill signal.
Sent from my iPhone 6 (this put here so you know I have one)
> On Aug 11, 2017, at 5:44 AM, Victor Stinner <victor.stin...@gmail.com> wrote:
> I'm working on reducing the failure rate of Python CIs (Travis CI,
> AppVeyor, buildbots). For that, I'm trying to reduce test side effects
> using "environment altered" warnings. This week, I worked on
> support.reap_children() which detects leaked child processes (usually
> created with os.fork()).
> I found a bug in the socketserver module: it waits for child processes
> completion, but only in non-blocking mode. If a child process takes
> too long, the server will never reads its exit status and so the
> server leaks "zombie processes". Leaking processes can increase the
> memory usage, spawning new processes can fail, etc.
> => http://bugs.python.org/issue31151
> I changed the code to call waitpid() in blocking mode on each child
> process on server_close(), to ensure that all children completed when
> on server close:
> After pushing my change, I'm not sure anymore if it's a good idea.
> There is a risk that server_close() blocks if a child is stuck on a
> socket recv() or send() for some reasons.
> Should we relax the code by waiting a few seconds (problem: hardcoded
> timeouts are always a bad idea), or *terminate* processes (SIGKILL on
> UNIX) if they don't complete fast enough?
> I don't know which applications use socketserver. How I can test if it
> breaks code in the wild?
> At least, I didn't notice any regression on Python CIs.
> Well, maybe the change is ok for the master branch. But I would like
> your opinion because now I would like to backport the fix to 2.7 and
> 3.6 branches. It might break some applications.
> If we *cannot* backport such change to 2.7 and 3.6 because it changes
> the behaviour, I will fix the bug in test_socketserver.py instead.
> What do you think?
> Python-Dev mailing list
Python-Dev mailing list