[ 
https://issues.apache.org/jira/browse/MESOS-2115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14289484#comment-14289484
 ] 

Timothy Chen commented on MESOS-2115:
-------------------------------------

I've posted the motivation and a proposed solution in this Google Doc. 
Please take a look if you're interested.

https://docs.google.com/a/mesosphere.io/document/d/1_1oLHXg_aHj_fYCzsjYwox9xvIYNAKIeVjO5BFxsUGI/edit#

> Improve recovering Docker containers when slave is contained
> ------------------------------------------------------------
>
>                 Key: MESOS-2115
>                 URL: https://issues.apache.org/jira/browse/MESOS-2115
>             Project: Mesos
>          Issue Type: Epic
>          Components: docker
>            Reporter: Timothy Chen
>            Assignee: Timothy Chen
>              Labels: docker
>
> Currently, when the Docker containerizer recovers, it checks the checkpointed 
> executor pids to determine which containers are still running, and removes 
> any other containers listed by docker ps that it does not recognize.
> This is problematic when the slave itself runs in a Docker container: when 
> the slave container dies, all of its forked processes are terminated as well, 
> so the checkpointed executor pids are no longer valid.
> We have to assume the Docker containers might still be running even though 
> the processes behind the checkpointed executor pids are not.
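
To make the failure mode in the description concrete, here is a minimal sketch. It is not Mesos code: the function names, the `checkpointed` mapping (container id to executor pid), and the `docker_ps` set are all hypothetical, chosen only for illustration. The second function is one possible alternative under the stated assumption that containers may outlive their executor pids; it is not necessarily what the linked doc proposes.

```python
def recover_by_pid(checkpointed, docker_ps, pid_alive):
    """Behavior as described in the issue (hypothetical model).

    checkpointed: dict mapping container id -> checkpointed executor pid
    docker_ps:    set of container ids that docker reports as running
    pid_alive:    callable(pid) -> bool, True if the executor process exists
    Returns (recovered, removed) as two sets of container ids.
    """
    # A container is recovered only if its checkpointed executor pid is alive.
    recovered = {cid for cid, pid in checkpointed.items()
                 if cid in docker_ps and pid_alive(pid)}
    # Every other running container is treated as unrecognized and removed.
    removed = set(docker_ps) - recovered
    return recovered, removed


def recover_by_container(checkpointed, docker_ps):
    """Illustrative alternative (an assumption, not the proposed design).

    Trust docker's view: a checkpointed container that docker still lists is
    kept even if its executor pid is gone; only containers that were never
    checkpointed are removed.
    """
    recovered = set(checkpointed) & set(docker_ps)
    removed = set(docker_ps) - set(checkpointed)
    return recovered, removed


if __name__ == "__main__":
    checkpointed = {"c1": 101, "c2": 102}
    docker_ps = {"c1", "c2", "orphan"}

    # The slave container died, so none of the executor pids are valid.
    def all_dead(pid):
        return False

    print(recover_by_pid(checkpointed, docker_ps, all_dead))
    print(recover_by_container(checkpointed, docker_ps))
```

In the scenario in `__main__`, the pid-based check recovers nothing and marks every running container for removal, while the container-based check keeps `c1` and `c2` and removes only `orphan`.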



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
