[
https://issues.apache.org/jira/browse/MESOS-3363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14732359#comment-14732359
]
James DeFelice commented on MESOS-3363:
---------------------------------------
No I did not have that namespace enabled when I observed the problem.
--
James DeFelice
585.241.9488 (voice)
650.649.6071 (fax)
> custom executor's child process intermittently leaks to be a child of slave
> ---------------------------------------------------------------------------
>
> Key: MESOS-3363
> URL: https://issues.apache.org/jira/browse/MESOS-3363
> Project: Mesos
> Issue Type: Bug
> Affects Versions: 0.23.0
> Environment: {code}
> vagrant@node-1:~$ uname -a
> Linux node-1 3.13.0-29-generic #53-Ubuntu SMP Wed Jun 4 21:00:20 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
> vagrant@node-1:~$ dpkg -l | grep -e mesos
> ii mesos 0.23.0-1.0.ubuntu1404 amd64 Cluster resource manager with efficient resource isolation
> {code}
> Reporter: James DeFelice
> Labels: mesosphere
>
> I was testing a custom executor implementation that manages the life cycle of
> multiple child processes. When the executor is SIGTERM'd it sends a SIGTERM
> to each child process and then self-terminates.
> In some cases, the child processes do not die, even though the parent
> process (the custom executor) does. Instead the child procs are re-parented
> to the slave process where they continue to live on indefinitely.
> My custom executor is written in Go, and I've found a useful
> Go/Linux-specific setting that allows me to configure a signal to be sent to
> child procs upon the death of the calling thread in the parent. (see
> https://golang.org/src/syscall/exec_linux.go?s=6285:6843#1 for details). I've
> since configured the custom executor to specify that a SIGKILL be sent to all
> child procs upon termination of the executor (parent) process: child procs
> are still sent a SIGTERM upon receipt of such by the executor, but the
> SIGKILL upon executor death now acts as a fallback.
> Since implementing the above work-around I have not been able to reproduce
> the problem as previously described. This particular syscall is implemented
> in very few OSes (the Golang hack only supports Linux) so I'm not sure how
> I'd go about something similar on Windows, OS X, BSD, etc.
> It seems like mesos should take on the responsibility to ensure that when an
> executor is killed, all of its child procs are also eventually killed. Given
> that it's an intermittent and hard-to-reproduce problem, I'm assuming that
> mesos *does* attempt to ensure executor child proc death, but that the
> implementation is racy/leaky.