[ 
https://issues.apache.org/jira/browse/MESOS-1421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14010288#comment-14010288
 ] 

Timothy St. Clair commented on MESOS-1421:
------------------------------------------

IMHO, the biggest thing it would do is separate the concerns in order to:
 + simplify the logic
 + enable fault-tolerant recovery for running workers.

re: namespaces, the first process in the new PID namespace acts as its init
process. If it goes down, the kernel kills the remaining children, and their
mounts, etc. should be cleaned up with them.
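
To make the worker idea a bit more concrete, here is a rough sketch of a
worker that owns one executor process tree and its cleanup. It is not Mesos
code: the Worker class and its method names are hypothetical, and it uses a
POSIX session / process group as an unprivileged stand-in for a PID
namespace (unsharing a real PID namespace generally needs extra privileges).

```python
import os
import signal
import subprocess
import sys


class Worker:
    """Hypothetical worker: tracks one executor and cleans up its group."""

    def __init__(self, argv, **popen_kwargs):
        # start_new_session=True runs setsid() in the child, so the executor
        # becomes the leader of a fresh session and process group. Anything
        # it forks stays in that group unless it deliberately leaves it.
        self.proc = subprocess.Popen(argv, start_new_session=True,
                                     **popen_kwargs)
        self.pgid = self.proc.pid  # group leader: pgid equals its pid

    def cleanup(self):
        """Signal the whole process group, then reap the direct child."""
        try:
            os.killpg(self.pgid, signal.SIGKILL)
        except ProcessLookupError:
            pass  # the group is already gone
        return self.proc.wait()


if __name__ == "__main__":
    # Stand-in "executor": a Python child that just sleeps.
    worker = Worker([sys.executable, "-c", "import time; time.sleep(60)"])
    print("executor pid/pgid:", worker.proc.pid, worker.pgid)
    print("exit status after cleanup:", worker.cleanup())
```

Unlike a PID namespace, a process group is only advisory: a descendant can
escape it by calling setsid() itself, which is part of why namespaces (or
cgroups) are the stronger mechanism for this kind of tracking.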

> Breaking out sub-process tracking from the slave.
> -------------------------------------------------
>
>                 Key: MESOS-1421
>                 URL: https://issues.apache.org/jira/browse/MESOS-1421
>             Project: Mesos
>          Issue Type: Improvement
>            Reporter: Timothy St. Clair
>
> I recently waded through the cgroups and tracking code to enable
> systemd support, and I was struck by the amount of work the slave has to do.
> In other grid systems, the slave typically forks a worker whose job is child
> process tracking, cleanup, etc.
> In the future, if the full extent of namespaces is leveraged and the init
> process goes down, then the children go down with it. This would alleviate
> that potential problem, but it may be an architectural change.
> Not to mention, it's a clean separation of concerns.
> e.g. 
> slave<>worker<>executor 
> in condor-speak:
> startd<>starter<>job
> http://research.cs.wisc.edu/htcondor/manual/v8.1/3_1Introduction.html#9108
>  



--
This message was sent by Atlassian JIRA
(v6.2#6252)
