[jira] [Commented] (MESOS-3028) If mesos_slave gets a SIGUSR1, frameworks aren't completely shutdown

Benjamin Mahler (JIRA) Fri, 10 Jul 2015 17:32:14 -0700

    [ 
https://issues.apache.org/jira/browse/MESOS-3028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14623125#comment-14623125
 ]


Benjamin Mahler commented on MESOS-3028:
----------------------------------------

I'm not too familiar with Thermos, but it looks as though it uses 
[setsid(2)|http://man7.org/linux/man-pages/man2/setsid.2.html] \[1\]. That is 
problematic when cgroups isolation is not used to jail the processes. 
Since you're not providing an {{--isolation}} flag, we default to a 
POSIX-compatible process launcher, which uses a best-effort 
[killtree|https://github.com/apache/mesos/blob/3073bd4e6fc119875fef22b364872056ef97efd3/3rdparty/libprocess/3rdparty/stout/include/stout/os/killtree.hpp#L58].
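
To illustrate why setsid(2) matters here: a child that calls it becomes the 
leader of a new session and a new process group, so it no longer shares those 
IDs with the process that spawned it. The following is only a rough Python 
sketch of that effect, not Thermos or Mesos code; the helper names 
(spawn_sleeper, ids, compare) are made up for illustration.

```python
import os
import subprocess
import sys


def spawn_sleeper(new_session: bool) -> subprocess.Popen:
    """Start a child that just sleeps; optionally have it call setsid(2)."""
    return subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(60)"],
        # start_new_session=True makes the child call setsid(2) before exec.
        start_new_session=new_session,
    )


def ids(pid: int) -> dict:
    """Return the process group ID and session ID of a pid."""
    return {"pgid": os.getpgid(pid), "sid": os.getsid(pid)}


def compare() -> dict:
    """Spawn a plain child and a setsid child, report their IDs, clean up."""
    plain = spawn_sleeper(new_session=False)
    detached = spawn_sleeper(new_session=True)
    try:
        return {
            "parent": ids(os.getpid()),
            "plain": ids(plain.pid),
            "detached": ids(detached.pid),
            "detached_pid": detached.pid,
        }
    finally:
        # Kill both children explicitly; the detached one would not be
        # reached by a signal sent to the parent's process group.
        for child in (plain, detached):
            child.kill()
            child.wait()
```

Calling compare() should show the plain child sharing the parent's pgid and 
sid, while the detached child has pgid and sid equal to its own pid.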

[~idownes] [~jieyu] I found it surprising that on Linux, we default to a 
PosixLauncher rather than a LinuxLauncher \[2\]. Any reason for this? Or is 
this a bug?

\[1\] 
https://github.com/apache/aurora/blob/827b9abea48babe53ad5b2c521757c60f04c6dfc/src/main/python/apache/thermos/core/process.py#L327.
\[2\] 
https://github.com/apache/mesos/blob/3073bd4e6fc119875fef22b364872056ef97efd3/src/slave/containerizer/mesos/containerizer.cpp#L149

> If mesos_slave gets a SIGUSR1, frameworks aren't completely shutdown
> --------------------------------------------------------------------
>
>                 Key: MESOS-3028
>                 URL: https://issues.apache.org/jira/browse/MESOS-3028
>             Project: Mesos
>          Issue Type: Bug
>          Components: framework, slave
>            Reporter: Brian Brazil
>
> See AURORA-1388 for full details.
> I sent a SIGUSR1 to a mesos_slave and the executor running on it got a little 
> bit of time to do things, however it then appears that the executor was 
> killed, but not any of its children.
> This is a problem as it means executors don't have enough time to shut down 
> gracefully when a mesos_slave is being drained for maintenance, and that 
> processes are left lying around using untracked resources.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
