[jira] [Commented] (MESOS-5544) Support running Mesos agent in a Docker container.
[ https://issues.apache.org/jira/browse/MESOS-5544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15988902#comment-15988902 ]

Deshi Xiao commented on MESOS-5544:
-----------------------------------

I think this feature is implemented.

> Support running Mesos agent in a Docker container.
> --------------------------------------------------
>
>                 Key: MESOS-5544
>                 URL: https://issues.apache.org/jira/browse/MESOS-5544
>             Project: Mesos
>          Issue Type: Improvement
>            Reporter: Jie Yu
>
> Currently, this does not work if one tries to use the Mesos containerizer.
> The main problem is that we want to make sure the executor is not killed when
> the agent crashes. So we have to use --pid=host so that the agent is in the
> host pid namespace.
> But that is not sufficient: the Docker daemon will put the agent into all
> cgroups available on the host. We need to make sure we migrate the executor
> pid out of those cgroups so that when the agent crashes, executors are not
> killed.
> Also, when starting the agent container, volumes need to be set up properly
> so that any mounts under the agent's work_dir will be propagated back to the
> host mount table. This is to make sure we can recover those mounts after the
> agent restarts. This is also true for mounts that are needed by some
> isolators (e.g., the network/cni isolator).

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
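The container settings named in the issue description (host pid namespace, mounts under work_dir propagating back to the host) could be sketched as a `docker run` command line. This is only an illustration, not the configuration used by the prototype: the image name `example/mesos-agent` is hypothetical and is assumed to use `mesos-agent` as its entrypoint, and `--net=host`, `--privileged`, and the `/sys/fs/cgroup` bind mount are assumptions that the issue does not state. The helper name `agent_docker_argv` is likewise made up for this sketch.

```python
import shlex


def agent_docker_argv(image, master, work_dir="/var/lib/mesos"):
    """Build a `docker run` argv for a containerized Mesos agent (illustrative)."""
    return [
        "docker", "run", "-d",
        # Host pid namespace, as required by the issue description.
        "--pid=host",
        # Assumption: host networking (not stated in the issue).
        "--net=host",
        # Assumption: extra privileges for cgroup and mount operations.
        "--privileged",
        # rshared bind propagation, so mounts created under work_dir inside
        # the container also appear in the host mount table.
        "-v", f"{work_dir}:{work_dir}:rshared",
        # Assumption: expose the host cgroup hierarchies to the agent.
        "-v", "/sys/fs/cgroup:/sys/fs/cgroup",
        image,
        # Arguments after the image go to the (assumed) mesos-agent entrypoint.
        f"--master={master}",
        f"--work_dir={work_dir}",
        "--containerizers=mesos",
    ]


if __name__ == "__main__":
    argv = agent_docker_argv("example/mesos-agent", "zk://10.0.0.1:2181/mesos")
    print(shlex.join(argv))
```

Note that Docker only accepts `rshared` for a bind mount when the source path on the host sits on a mount that is itself shared, so the host side may need to be prepared first.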
[jira] [Commented] (MESOS-5544) Support running Mesos agent in a Docker container.
[ https://issues.apache.org/jira/browse/MESOS-5544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15947143#comment-15947143 ]

Deshi Xiao commented on MESOS-5544:
-----------------------------------

Can anyone summarize this feature's status?
[jira] [Commented] (MESOS-5544) Support running Mesos agent in a Docker container.
[ https://issues.apache.org/jira/browse/MESOS-5544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15715852#comment-15715852 ]

Jie Yu commented on MESOS-5544:
-------------------------------

Systemd might still be problematic (esp. the delegate part). Patches are currently needed around cgroups. See my branch.

- Jie
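For readers following the cgroups discussion: before an executor pid can be migrated out of the cgroups Docker created for the agent container, one has to know which cgroups a process is in. A minimal sketch of that inspection step, assuming the cgroup v1 `/proc/<pid>/cgroup` format (`hierarchy-ID:controller-list:cgroup-path`), is below. The function names and the sample input are hypothetical, and the path check is only a heuristic based on the usual `/docker/<id>` and `docker-<id>.scope` naming. This is not the code from the prototype branch.

```python
def parse_proc_cgroup(text):
    """Parse the contents of /proc/<pid>/cgroup into {controller: cgroup path}."""
    memberships = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Each line is hierarchy-ID:controller-list:cgroup-path.
        _hierarchy_id, controllers, path = line.split(":", 2)
        # Controllers mounted together (e.g. cpu,cpuacct) share one path.
        for controller in controllers.split(","):
            memberships[controller] = path
    return memberships


def docker_managed(memberships):
    """Return controllers whose cgroup path looks Docker-created (heuristic)."""
    return sorted(
        controller
        for controller, path in memberships.items()
        if path.startswith("/docker/") or "/docker-" in path
    )


# Hypothetical sample, shaped like /proc/<pid>/cgroup inside a container.
SAMPLE = """\
11:memory:/docker/abc123
10:cpu,cpuacct:/docker/abc123
1:name=systemd:/system.slice/docker.service
"""

if __name__ == "__main__":
    print(docker_managed(parse_proc_cgroup(SAMPLE)))
    # prints ['cpu', 'cpuacct', 'memory']
```

On cgroup v1, the migration itself is done by writing the pid to the `cgroup.procs` file of the target cgroup in each hierarchy. That step is left out here because it needs root and a real host.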
[jira] [Commented] (MESOS-5544) Support running Mesos agent in a Docker container.
[ https://issues.apache.org/jira/browse/MESOS-5544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15438587#comment-15438587 ]

Qian Zhang commented on MESOS-5544:
-----------------------------------

[~jieyu], can you please let me know why the executor will be killed when the agent crashes (even if the agent is running in a Docker container with --pid=host)? I thought that if the executor is launched by a framework with checkpointing enabled, it will still be there when the agent crashes.
[jira] [Commented] (MESOS-5544) Support running Mesos agent in a Docker container.
[ https://issues.apache.org/jira/browse/MESOS-5544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15395029#comment-15395029 ]

Lei Xu commented on MESOS-5544:
-------------------------------

I've containerized Mesos, and it is running well without a network namespace.
[jira] [Commented] (MESOS-5544) Support running Mesos agent in a Docker container.
[ https://issues.apache.org/jira/browse/MESOS-5544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15323818#comment-15323818 ]

Jie Yu commented on MESOS-5544:
-------------------------------

Worked on a prototype here:
https://github.com/jieyu/mesos/tree/agent_in_docker

Will update with a docker image soon.