[
https://issues.apache.org/jira/browse/MESOS-2115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15630683#comment-15630683
]
QIHANG CHEN commented on MESOS-2115:
------------------------------------
Is this issue fixed? I wonder if there's any documentation on how to set the
correct configuration to enable slave recovery for a containerized mesos-slave.
I'm using the latest release, 1.0.2-rc1, of the mesos-slave container, and
slave recovery still fails when I try to restart the `mesos-slave` container.
> Improve recovering Docker containers when slave is contained
> ------------------------------------------------------------
>
> Key: MESOS-2115
> URL: https://issues.apache.org/jira/browse/MESOS-2115
> Project: Mesos
> Issue Type: Epic
> Components: docker
> Reporter: Timothy Chen
> Assignee: Timothy Chen
> Labels: docker
> Fix For: 0.23.0
>
>
> Currently, when the docker containerizer is recovering, it checks the
> checkpointed executor pids to determine which containers are still running,
> and removes the rest of the containers listed by docker ps that aren't
> recognized.
> This is problematic when the slave itself runs in a docker container: when
> the slave container dies, all the forked processes are removed as well, so the
> checkpointed executor pids are no longer valid.
> We have to assume the docker containers might still be running even though
> the checkpointed executor pids are not.
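
The recovery behavior described in the issue can be illustrated with a small, self-contained sketch. This is not the Mesos implementation: the function `plan_recovery`, its parameters, and the `trust_docker_ps` switch are hypothetical names chosen only to model the logic in the description, and the container IDs and pids are made-up sample data.

```python
from typing import Callable, Dict, Iterable, Set, Tuple


def plan_recovery(
    checkpointed_pids: Dict[str, int],
    running_containers: Iterable[str],
    pid_alive: Callable[[int], bool],
    trust_docker_ps: bool,
) -> Tuple[Set[str], Set[str]]:
    """Return (recovered, removed) container IDs.

    Hypothetical model of the logic in the issue description:
    - checkpointed_pids maps container ID -> checkpointed executor pid
    - running_containers is what `docker ps` would report
    - pid_alive says whether a checkpointed pid is still valid
    - trust_docker_ps=True assumes a checkpointed container may still be
      running even when its executor pid is no longer valid
    """
    running = set(running_containers)
    recovered: Set[str] = set()
    for container_id, pid in checkpointed_pids.items():
        if container_id not in running:
            continue  # nothing left to recover for this executor
        if pid_alive(pid) or trust_docker_ps:
            recovered.add(container_id)
    # Everything else that docker reports is treated as unrecognized.
    removed = running - recovered
    return recovered, removed


if __name__ == "__main__":
    checkpointed = {"c1": 101, "c2": 102}
    docker_ps = ["c1", "c2", "stray"]

    # Slave ran inside a container that died: no checkpointed pid is valid.
    def no_pid_alive(pid: int) -> bool:
        return False

    print(plan_recovery(checkpointed, docker_ps, no_pid_alive, False))
    print(plan_recovery(checkpointed, docker_ps, no_pid_alive, True))
```

With every pid invalid, the pid-only check (`trust_docker_ps=False`) recovers nothing and marks all three containers for removal, which is the failure mode the issue describes. With `trust_docker_ps=True`, `c1` and `c2` are recovered and only the unrecognized `stray` container is removed.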