[
https://issues.apache.org/jira/browse/MESOS-5544?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jie Yu updated MESOS-5544:
--------------------------
Description:
Currently, running the agent in a Docker container does not work with the
Mesos containerizer.
The main problem is that the executor must not be killed when the agent
crashes. So we have to use --pid=host so that the agent is in the host pid
namespace.
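As a rough illustration of the point above, the sketch below builds a
`docker run` command line that passes --pid=host. The image name and the
work_dir path are placeholders chosen for this example, and the helper
function is hypothetical, not part of Mesos or Docker.

```python
from typing import List


def agent_docker_argv(image: str, agent_args: List[str]) -> List[str]:
    """Build a `docker run` argv that keeps the agent in the host pid namespace."""
    # --pid=host makes the container share the host's pid namespace.
    return ["docker", "run", "--pid=host", image, *agent_args]


if __name__ == "__main__":
    # "example/mesos-agent" is a placeholder image name, not a published image.
    argv = agent_docker_argv("example/mesos-agent", ["--work_dir=/var/lib/mesos"])
    print(" ".join(argv))
```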
But that is not sufficient: the Docker daemon will put the agent into all
cgroups available on the host. We need to make sure we migrate the executor pid
out of those cgroups so that when the agent crashes, executors are not killed.
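One way to picture the migration is sketched below. It is not how Mesos
implements it. It only shows the generic kernel interface: /proc/<pid>/cgroup
lists the cgroups a process belongs to, and writing a pid to a cgroup's
cgroup.procs file moves the process there. Both function names are
hypothetical, and choosing the destination cgroup is left to the caller.

```python
import os
from typing import List, Tuple


def parse_proc_cgroup(text: str) -> List[Tuple[str, str]]:
    """Parse /proc/<pid>/cgroup content into (controllers, cgroup path) pairs."""
    entries = []
    for line in text.splitlines():
        if not line.strip():
            continue
        # Each line has the form "hierarchy-ID:controller-list:cgroup-path".
        _, controllers, path = line.split(":", 2)
        entries.append((controllers, path))
    return entries


def move_pid(pid: int, cgroup_dir: str) -> None:
    """Move a process into the cgroup at cgroup_dir by writing to cgroup.procs."""
    with open(os.path.join(cgroup_dir, "cgroup.procs"), "w") as f:
        f.write(str(pid))
```

Listing the executor's current memberships with parse_proc_cgroup shows which
hierarchies still place it under the agent container's cgroup, and therefore
where move_pid would need to be applied.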
Also, when starting the agent container, volumes need to be set up properly so
that any mounts under the agent's work_dir are propagated back to the host
mount table. This is to make sure we can recover those mounts after the agent
restarts. The same applies to mounts needed by some isolators (e.g., the
network/cni isolator).
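Mounts made inside the container only reach the host mount table if the
underlying bind mount uses shared propagation (Docker's -v option accepts
propagation modes such as rshared). As a minimal sketch, and assuming one wants
to verify this from /proc/<pid>/mountinfo, the hypothetical helper below reads
the shared peer group from a single mountinfo line. The sample mount point is
illustrative only.

```python
from typing import Optional


def shared_peer_group(mountinfo_line: str) -> Optional[int]:
    """Return the shared peer group ID from a mountinfo line, or None."""
    fields = mountinfo_line.split()
    # Optional fields sit between the mount options (index 5) and the "-" separator.
    separator = fields.index("-", 6)
    for field in fields[6:separator]:
        if field.startswith("shared:"):
            return int(field.split(":", 1)[1])
    return None


if __name__ == "__main__":
    line = "36 35 98:0 / /var/lib/mesos rw shared:12 - ext4 /dev/sda1 rw"
    print(shared_peer_group(line))
```

A result of None means the mount is not shared, so mounts created beneath it
would stay invisible to the host.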
was:
Currently, running the agent in a Docker container does not work with the
Mesos containerizer.
The main problem is that the executor must not be killed when the agent
crashes. So we have to use --pid=host so that the agent is in the host pid
namespace.
But that is not sufficient: the Docker daemon will put the agent into all
cgroups available on the host. We need to make sure we migrate the executor pid
out of those cgroups so that when the agent crashes, executors are not killed.
> Support running Mesos agent in a Docker container.
> --------------------------------------------------
>
> Key: MESOS-5544
> URL: https://issues.apache.org/jira/browse/MESOS-5544
> Project: Mesos
> Issue Type: Improvement
> Reporter: Jie Yu
>
> Currently, running the agent in a Docker container does not work with the
> Mesos containerizer.
> The main problem is that the executor must not be killed when the agent
> crashes. So we have to use --pid=host so that the agent is in the host pid
> namespace.
> But that is not sufficient: the Docker daemon will put the agent into all
> cgroups available on the host. We need to make sure we migrate the executor
> pid out of those cgroups so that when the agent crashes, executors are not
> killed.
> Also, when starting the agent container, volumes need to be set up properly
> so that any mounts under the agent's work_dir are propagated back to the host
> mount table. This is to make sure we can recover those mounts after the agent
> restarts. The same applies to mounts needed by some isolators (e.g., the
> network/cni isolator).