[ 
https://issues.apache.org/jira/browse/MESOS-2115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15630683#comment-15630683
 ] 

QIHANG CHEN commented on MESOS-2115:
------------------------------------

Is this issue fixed? Is there any documentation on how to configure slave 
recovery correctly for a containerized mesos-slave? 

I'm using the latest release, 1.0.2-rc1, of the mesos-slave container, and 
slave recovery still fails when I restart the `mesos-slave` container.


> Improve recovering Docker containers when slave is contained
> ------------------------------------------------------------
>
>                 Key: MESOS-2115
>                 URL: https://issues.apache.org/jira/browse/MESOS-2115
>             Project: Mesos
>          Issue Type: Epic
>          Components: docker
>            Reporter: Timothy Chen
>            Assignee: Timothy Chen
>              Labels: docker
>             Fix For: 0.23.0
>
>
> Currently, when the docker containerizer recovers, it checks the checkpointed 
> executor pids to determine which containers are still running, and removes the 
> remaining containers listed by docker ps that it does not recognize.
> This is problematic when the slave itself runs in a docker container: when 
> the slave container dies, all the forked processes are removed as well, so the 
> checkpointed executor pids are no longer valid.
> We have to assume the docker containers might still be running even though 
> the executor processes behind the checkpointed pids are not.
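
To make the description above concrete, here is a minimal Python sketch of the two recovery strategies it contrasts. This is not Mesos code: the function names, the `checkpointed` mapping (container id to executor pid), the `live_pids` set, and the `docker_ps` list are all hypothetical stand-ins chosen for illustration, and the second function is only one possible reading of "assume the docker containers might be still running".

```python
def recover_by_pid(checkpointed, live_pids, docker_ps):
    """Pid-based recovery as described in the issue (hypothetical model).

    checkpointed: dict mapping container id -> checkpointed executor pid
    live_pids:    set of pids that are still alive on the host
    docker_ps:    list of container ids currently reported by docker ps
    """
    # A container counts as recovered only if its executor pid is alive.
    recovered = {cid for cid, pid in checkpointed.items() if pid in live_pids}
    # Everything else that docker ps reports is treated as unrecognized.
    removed = [cid for cid in docker_ps if cid not in recovered]
    return recovered, removed


def recover_by_container(checkpointed, docker_ps):
    """Assumed alternative: trust docker ps instead of the executor pids."""
    running = set(docker_ps)
    # Keep every checkpointed container that docker still reports.
    recovered = {cid for cid in checkpointed if cid in running}
    # Remove only containers that were never checkpointed.
    removed = [cid for cid in docker_ps if cid not in checkpointed]
    return recovered, removed


# The slave container died, so none of the checkpointed pids are alive.
checkpointed = {"c1": 101, "c2": 102}
docker_ps = ["c1", "c2", "stray"]

print(recover_by_pid(checkpointed, set(), docker_ps))
print(recover_by_container(checkpointed, docker_ps))
```

In this model, the pid-based check recovers nothing and marks all three containers for removal once the slave container has died, while the container-based check keeps `c1` and `c2` and removes only `stray`.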



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
