[jira] [Commented] (MESOS-9836) Docker containerizer overwrites `/mesos/slave` cgroups.

2019-08-23 Thread Qian Zhang (Jira)


[ 
https://issues.apache.org/jira/browse/MESOS-9836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16914034#comment-16914034
 ] 

Qian Zhang commented on MESOS-9836:
---

Master:

commit 5db1835531edd16b8a7b550a944c27b0dacafbf7
Author: Qian Zhang 
Date: Wed Aug 21 16:50:06 2019 +0800

Used cached cgroups for updating resources in Docker containerizer.
 
Review: https://reviews.apache.org/r/71335

 

1.8.x

commit ee01c8d479b34ced35ee6bd172108a128086277e
Author: Qian Zhang 
Date: Wed Aug 21 16:50:06 2019 +0800

Used cached cgroups for updating resources in Docker containerizer.
 
 Review: https://reviews.apache.org/r/71335

 

1.7.x

commit 7b01d0d35e8e17434d6f8e8840c8586565fe8d6c
Author: Qian Zhang 
Date: Wed Aug 21 16:50:06 2019 +0800

Used cached cgroups for updating resources in Docker containerizer.
 
Review: https://reviews.apache.org/r/71335

 

1.6.x

commit 1c70f29bdd270cdff8ce55bdfaab56581829017c (HEAD -> ci/qzhang/bp_9836_1.6.x)
Author: Qian Zhang 
Date: Wed Aug 21 16:50:06 2019 +0800

Used cached cgroups for updating resources in Docker containerizer.
 
 Review: https://reviews.apache.org/r/71335

> Docker containerizer overwrites `/mesos/slave` cgroups.
> ---
>
> Key: MESOS-9836
> URL: https://issues.apache.org/jira/browse/MESOS-9836
> Project: Mesos
>  Issue Type: Bug
>  Components: containerization
>Reporter: Chun-Hung Hsiao
>Assignee: Qian Zhang
>Priority: Critical
>  Labels: docker, mesosphere
>
> The following bug was observed on our internal testing cluster.
> The docker containerizer launched a container on an agent:
> {noformat}
> I0523 06:00:53.888579 21815 docker.cpp:1195] Starting container 
> 'f69c8a8c-eba4-4494-a305-0956a44a6ad2' for task 
> 'apps_docker-sleep-app.1fda5b8e-7d20-11e9-9717-7aa030269ee1' (and executor 
> 'apps_docker-sleep-app.1fda5b8e-7d20-11e9-9717-7aa030269ee1') of framework 
> 415284b7-2967-407d-b66f-f445e93f064e-0011
> I0523 06:00:54.524171 21815 docker.cpp:783] Checkpointing pid 13716 to 
> '/var/lib/mesos/slave/meta/slaves/60c42ab7-eb1a-4cec-b03d-ea06bff00c3f-S2/frameworks/415284b7-2967-407d-b66f-f445e93f064e-0011/executors/apps_docker-sleep-app.1fda5b8e-7d20-11e9-9717-7aa030269ee1/runs/f69c8a8c-eba4-4494-a305-0956a44a6ad2/pids/forked.pid'
> {noformat}
> After the container was launched, the docker containerizer did a {{docker 
> inspect}} on the container and cached the pid:
>  
> [https://github.com/apache/mesos/blob/0c431dd60ae39138cc7e8b099d41ad794c02c9a9/src/slave/containerizer/docker.cpp#L1764]
>  The pid should be slightly greater than 13716.
> The docker executor sent a {{TASK_FINISHED}} status update around 16 minutes 
> later:
> {noformat}
> I0523 06:16:17.287595 21809 slave.cpp:5566] Handling status update 
> TASK_FINISHED (Status UUID: 4e00b786-b773-46cd-8327-c7deb08f1de9) for task 
> apps_docker-sleep-app.1fda5b8e-7d20-11e9-9717-7aa030269ee1 of framework 
> 415284b7-2967-407d-b66f-f445e93f064e-0011 from executor(1)@172.31.1.7:36244
> {noformat}
> After receiving the terminal status update, the agent asked the docker 
> containerizer to update {{cpu.cfs_period_us}}, {{cpu.cfs_quota_us}} and 
> {{memory.soft_limit_in_bytes}} of the container through the cached pid:
>  
> [https://github.com/apache/mesos/blob/0c431dd60ae39138cc7e8b099d41ad794c02c9a9/src/slave/containerizer/docker.cpp#L1696]
> {noformat}
> I0523 06:16:17.290447 21815 docker.cpp:1868] Updated 'cpu.shares' to 102 at 
> /sys/fs/cgroup/cpu,cpuacct/mesos/slave for container 
> f69c8a8c-eba4-4494-a305-0956a44a6ad2
> I0523 06:16:17.290660 21815 docker.cpp:1895] Updated 'cpu.cfs_period_us' to 
> 100ms and 'cpu.cfs_quota_us' to 10ms (cpus 0.1) for container 
> f69c8a8c-eba4-4494-a305-0956a44a6ad2
> I0523 06:16:17.889816 21815 docker.cpp:1937] Updated 
> 'memory.soft_limit_in_bytes' to 32MB for container 
> f69c8a8c-eba4-4494-a305-0956a44a6ad2
> {noformat}
> Note that the cgroup of {{cpu.shares}} was {{/mesos/slave}}. This was 
> possibly because the pid got reused over the 16 minutes:
> {noformat}
> # zgrep 'systemd.cpp:98\]' /var/log/mesos/archive/mesos-agent.log.12.gz
> ...
> I0523 06:00:54.525178 21815 systemd.cpp:98] Assigned child process '13716' to 
> 'mesos_executors.slice'
> I0523 06:00:55.078546 21808 systemd.cpp:98] Assigned child process '13798' to 
> 'mesos_executors.slice'
> I0523 06:00:55.134096 21808 systemd.cpp:98] Assigned child process '13799' to 
> 'mesos_executors.slice'
> ...
> I0523 06:06:30.997439 21808 systemd.cpp:98] Assigned child process '32689' to 
> 'mesos_executors.slice'
> I0523 06:06:31.050976 21808 systemd.cpp:98] Assigned child process '32690' to 
> 'mesos_executors.slice'
> I0523 06:06:31.110514 21815 systemd.cpp:98] Assigned child process '32692' to 
> 'mesos_executors.slice'
> I0523 06:06:33.143726 21818 systemd.cpp:98] Assigned child process '446' to 
> 'mesos_executors.slice'
> I0523 06:06:33.196251 21818 systemd.cpp:98] Assigned child process '447' to 
> 'mesos_executors.slice'
> I0523 06:06:33.266332 21816 systemd.cpp:98] Assigned child process '449' to 
> 'mesos_executors.slice'
> ...
> I0523 06:09:34.870056 21808 systemd.cpp:98] Assigned child process '13717' to 
> 'mesos_executors.slice'
> I0523 06:09:34.937762 21813 systemd.cpp:98] Assigned child process '13744' to 
> 'mesos_executors.slice'
> I0523 06:09:35.073971 21817 systemd.cpp:98] Assigned child process '13754' to 
> 'mesos_executors.slice'
> ...
> {noformat}
> It was highly likely that the container itself exited around 06:09:35, way 
> before the docker executor detected and reported the terminal status update, 
> and then its pid was reused by 

[jira] [Commented] (MESOS-9836) Docker containerizer overwrites `/mesos/slave` cgroups.

2019-08-21 Thread Qian Zhang (Jira)


[ 
https://issues.apache.org/jira/browse/MESOS-9836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16912257#comment-16912257
 ] 

Qian Zhang commented on MESOS-9836:
---

On second thought, I would prefer Chun's solution: after obtaining the pid 
through `docker inspect`, we look into the proc filesystem, get the cgroup 
right away, and cache it. We do not need to checkpoint the cached cgroup, 
since after the agent restarts we can still recover it through `docker 
inspect` and the proc filesystem.


[jira] [Commented] (MESOS-9836) Docker containerizer overwrites `/mesos/slave` cgroups.

2019-08-20 Thread Andrei Budnik (Jira)


[ 
https://issues.apache.org/jira/browse/MESOS-9836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16911305#comment-16911305
 ] 

Andrei Budnik commented on MESOS-9836:
--

Shall we deprecate the option to run a custom executor in a Docker container? 
If no one responds to our proposal on the dev@ and user@ mailing lists, we can 
safely deprecate this feature.


[jira] [Commented] (MESOS-9836) Docker containerizer overwrites `/mesos/slave` cgroups.

2019-08-20 Thread Qian Zhang (Jira)


[ 
https://issues.apache.org/jira/browse/MESOS-9836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16911293#comment-16911293
 ] 

Qian Zhang commented on MESOS-9836:
---

I think there is a case where we need to update a Docker container's CPU & 
memory directly in the Docker containerizer: a custom executor is launched as 
a Docker container and it launches tasks as processes inside that container. 
In this case, every time a new task is sent to the custom executor or a 
running task terminates, the Docker containerizer needs to update the Docker 
container's CPU and memory in cgroups.
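A minimal sketch of that case, with illustrative names only: the custom
executor's single Docker container must be resized so that its cgroup limits
track the sum of the executor's own resources and those of its live tasks.

```python
class ExecutorContainer:
    """Toy model of a custom executor running in one Docker container."""

    def __init__(self, executor_cpus):
        self.executor_cpus = executor_cpus
        self.tasks = {}  # task_id -> cpus

    def total_cpus(self):
        # This total is what the containerizer would be asked to write
        # into the container's cgroup after each task launch/termination.
        return self.executor_cpus + sum(self.tasks.values())

c = ExecutorContainer(executor_cpus=0.1)
c.tasks["task-1"] = 1.0   # a new task is sent to the executor
c.tasks["task-2"] = 0.5
del c.tasks["task-1"]     # a running task terminates
print(c.total_cpus())
```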


[jira] [Commented] (MESOS-9836) Docker containerizer overwrites `/mesos/slave` cgroups.

2019-08-16 Thread Qian Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-9836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1690#comment-1690
 ] 

Qian Zhang commented on MESOS-9836:
---

{quote}As Mesos provides an option to run a Docker image as a (custom?) 
executor, it might make sense to update the Docker container's resources 
(the executor plus its tasks running in the Docker container) in cgroups
{quote}
[~abudnik] You mean the `--docker_mesos_image` flag, right? I think when this 
flag is specified, the Docker executor will run in a Docker container with the 
Docker image specified via this flag, but the task will be launched by the 
Docker executor in *another* Docker container, because the Docker executor 
always calls `docker run` to launch the task as a Docker container, and in the 
Docker containerizer we mount the Docker socket for the Docker executor so it 
can communicate with the Docker daemon on the agent host (see 
[here|https://github.com/apache/mesos/blob/1.8.1/src/slave/containerizer/docker.cpp#L333:L339]
 for details). Another important point: in any case, what 
`DockerContainerizerProcess::update` tries to update is the task's Docker 
container rather than the executor's Docker container (because 
[container->containerName|https://github.com/apache/mesos/blob/1.8.1/src/slave/containerizer/docker.hpp#L486]
 is always the task container's name), which is what really confuses me, as I 
mentioned in my previous comment.


[jira] [Commented] (MESOS-9836) Docker containerizer overwrites `/mesos/slave` cgroups.

2019-08-15 Thread Andrei Budnik (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-9836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16908250#comment-16908250
 ] 

Andrei Budnik commented on MESOS-9836:
--

{quote}
So what is the purpose of Docker containerizer's update method?
{quote}

As Mesos provides an option to run a Docker image as a (custom?) executor, it 
might make sense to update the Docker container's resources (the executor plus 
its tasks running in the Docker container) in cgroups. If this is the case, we 
should probably deprecate such an option. Ignoring `update` for the Docker 
containerizer sounds like a good idea.


[jira] [Commented] (MESOS-9836) Docker containerizer overwrites `/mesos/slave` cgroups.

2019-08-15 Thread Qian Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-9836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16907981#comment-16907981
 ] 

Qian Zhang commented on MESOS-9836:
---

{quote}A fix would be after obtaining the pid through docker inspect, we look 
into the proc filesystem and get the cgroup right away and cache it, as well as 
other properties we originally retrieve through pid if any.
{quote}
[~chhsia0] Yeah, we could get and cache the cgroup, but we also need to recover 
it after the agent restarts, which might be complicated.
{quote}Probably we should leave out all cgroups not containing "docker" 
substring instead of (or in addition to) filtering [the system root 
cgroup|https://github.com/apache/mesos/blob/0c431dd60ae39138cc7e8b099d41ad794c02c9a9/src/slave/containerizer/docker.cpp#L1783-L1788].
 It's ugly, hacky, and introduces a dependency on Docker's runtime.
{quote}
[~abudnik] I think a typical cgroup path for a Docker container looks like:

{code:java}
/sys/fs/cgroup/<subsystem>/docker/<container_id>
{code}
Are you saying that in `DockerContainerizerProcess::__update` we should not 
update anything if the cgroup does not contain the "docker" substring (i.e., 
it is not a Docker container)? Yeah, that should work, but as you mentioned it 
introduces a dependency on Docker, which is unfortunate.
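A minimal sketch of that guard, assuming the container's cgroup path has
already been resolved from the proc filesystem (the function name is
illustrative, not a Mesos API):

```python
def is_docker_cgroup(cgroup_path):
    # Skip updates for anything not managed by Docker, e.g. the agent's
    # own '/mesos/slave' cgroup. This is the "docker" substring check
    # discussed above; it hard-codes Docker's cgroup naming, which is
    # exactly the runtime dependency being objected to.
    return "docker" in cgroup_path

print(is_docker_cgroup("/docker/f69c8a8c"))  # True
print(is_docker_cgroup("/mesos/slave"))      # False
```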

 

I am actually wondering why we need to update a Docker container's CPU & memory 
in the Docker containerizer at all. For a task launched via the Docker 
containerizer, the method `DockerContainerizerProcess::update` will be called 
twice:
 # When the Docker executor registers with the agent (see 
[here|https://github.com/apache/mesos/blob/1.8.1/src/slave/slave.cpp#L5210] for 
details): the resources passed into `DockerContainerizerProcess::update` are 
the same as the container's resources, so we will not update anything (see 
[here|https://github.com/apache/mesos/blob/1.8.1/src/slave/containerizer/docker.cpp#L1675:L1680]
 for details).
 # When the agent receives a terminal status update for the task (see 
[here|https://github.com/apache/mesos/blob/1.8.1/src/slave/slave.cpp#L5841:L5846]
 for details): since the container has terminated, the output of `docker 
inspect` will not contain a pid, so we will not update anything either (see 
[here|https://github.com/apache/mesos/blob/1.8.1/src/slave/containerizer/docker.cpp#L1754:L1756]
 for details).
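The two call sites above can be summarized in a small sketch. The names 
(`cached_resources`, `inspect_pid`) are hypothetical stand-ins for the 
containerizer's state, not Mesos APIs:

```python
# Hypothetical simplification of why DockerContainerizerProcess::update
# is effectively a no-op on both of its normal invocations.

def update(cached_resources, new_resources, inspect_pid):
    # Call #1: executor registration. The requested resources equal the
    # container's current resources, so nothing is written.
    if new_resources == cached_resources:
        return "skipped: resources unchanged"
    # Call #2: terminal status update. The container has exited, so
    # `docker inspect` no longer reports a pid and nothing is written.
    if inspect_pid is None:
        return "skipped: no pid"
    # Only an update between #1 and #2, issued with a cached (and
    # possibly reused!) pid, reaches this point and touches cgroups.
    return "updated cgroups via pid %d" % inspect_pid

assert update({"cpus": 0.1}, {"cpus": 0.1}, 13716) == "skipped: resources unchanged"
assert update({"cpus": 0.1}, {"cpus": 0.2}, None) == "skipped: no pid"
```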

 

The reason we hit the issue described in this ticket is that between #1 and #2 
(i.e., while the Docker container is still running), 
`DockerContainerizerProcess::usage` may be called (e.g., when the agent's 
/monitor/statistics endpoint is hit). Internally it calls `docker inspect`, 
which returns a valid pid, and we cache that pid (see 
[here|https://github.com/apache/mesos/blob/1.8.1/src/slave/containerizer/docker.cpp#L2025]
 for details). After that, but before #2, the pid may be reused by a process 
that is a child of the agent process, so in #2 we end up overwriting the 
agent's cgroup.

It is the Docker executor that launches the task as a Docker container, so it 
is a bit weird for the Docker containerizer to directly update the Docker 
container's resources in cgroups; that should be the job of the Docker 
executor. And I do not see why we need to support updating a Docker 
container's resources at runtime. So what is the purpose of the Docker 
containerizer's `update` method? To me, it should just do nothing.

 

> Docker containerizer overwrites `/mesos/slave` cgroups.
> ---
>
> Key: MESOS-9836
> URL: https://issues.apache.org/jira/browse/MESOS-9836
> Project: Mesos
>  Issue Type: Bug
>  Components: containerization
>Reporter: Chun-Hung Hsiao
>Priority: Critical
>  Labels: docker, mesosphere
>
> The following bug was observed on our internal testing cluster.
> The docker containerizer launched a container on an agent:
> {noformat}
> I0523 06:00:53.888579 21815 docker.cpp:1195] Starting container 
> 'f69c8a8c-eba4-4494-a305-0956a44a6ad2' for task 
> 'apps_docker-sleep-app.1fda5b8e-7d20-11e9-9717-7aa030269ee1' (and executor 
> 'apps_docker-sleep-app.1fda5b8e-7d20-11e9-9717-7aa030269ee1') of framework 
> 415284b7-2967-407d-b66f-f445e93f064e-0011
> I0523 06:00:54.524171 21815 docker.cpp:783] Checkpointing pid 13716 to 
> '/var/lib/mesos/slave/meta/slaves/60c42ab7-eb1a-4cec-b03d-ea06bff00c3f-S2/frameworks/415284b7-2967-407d-b66f-f445e93f064e-0011/executors/apps_docker-sleep-app.1fda5b8e-7d20-11e9-9717-7aa030269ee1/runs/f69c8a8c-eba4-4494-a305-0956a44a6ad2/pids/forked.pid'
> {noformat}
> After the container was launched, the docker containerizer did a {{docker 
> inspect}} on the container and cached the pid:
>  
> [https://github.com/apache/mesos/blob/0c431dd60ae39138cc7e8b099d41ad794c02c9a9/src/slave/containerizer/docker.cpp#L1764]
>  The pid should be slightly greater than 13716.
> The docker executor sent a {{TASK_FINISHED}} status update around 16 minutes 
> later:
> 

[jira] [Commented] (MESOS-9836) Docker containerizer overwrites `/mesos/slave` cgroups.

2019-07-25 Thread Andrei Budnik (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-9836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16892844#comment-16892844
 ] 

Andrei Budnik commented on MESOS-9836:
--

A typical cgroup for Docker containers looks like:
{code:java}
/system.slice/docker-3a91c29381522918a2f2cad05583b172f415da4010bad672c21a19356aec1d69.scope
{code}
Probably we should leave out all cgroups not containing the "docker" substring 
instead of (or in addition to) filtering [the system root 
cgroup|https://github.com/apache/mesos/blob/0c431dd60ae39138cc7e8b099d41ad794c02c9a9/src/slave/containerizer/docker.cpp#L1783-L1788].
 It's ugly and hacky, and it introduces a dependency on Docker's runtime.


[jira] [Commented] (MESOS-9836) Docker containerizer overwrites `/mesos/slave` cgroups.

2019-07-25 Thread Benjamin Bannier (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-9836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16892622#comment-16892622
 ] 

Benjamin Bannier commented on MESOS-9836:
-

Any update on this [~gilbert]? We saw this again. This is a bad issue, as it 
mainly manifests as degraded agent performance and can only be caught by 
monitoring the agent's cgroups. Can you please bump this up?


[jira] [Commented] (MESOS-9836) Docker containerizer overwrites `/mesos/slave` cgroups.

2019-06-11 Thread Chun-Hung Hsiao (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-9836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16861255#comment-16861255
 ] 

Chun-Hung Hsiao commented on MESOS-9836:


A fix would be: after obtaining the pid through {{docker inspect}}, look into 
the proc filesystem right away to get the cgroup and cache it, along with any 
other properties we originally retrieve through the pid.
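That proposal, which matches the commit message of the eventual fix ("Used 
cached cgroups for updating resources in Docker containerizer"), amounts to 
reading {{/proc/<pid>/cgroup}} immediately after {{docker inspect}} returns the 
pid and using the cached paths for later updates, rather than re-resolving the 
pid each time. The helper below is a hypothetical Python illustration of that 
read, not the Mesos C++ code:

```python
import os

# Hypothetical sketch: right after `docker inspect` yields a pid, read
# /proc/<pid>/cgroup once and cache the per-subsystem cgroup paths.
# Later resource updates then use the cached paths, so a reused pid can
# no longer redirect writes into an agent cgroup like /mesos/slave.

def read_cgroups(pid, proc_root="/proc"):
    """Return {controllers: cgroup-path} for `pid`, or None if the
    process is already gone (the race this fix has to tolerate)."""
    try:
        with open(os.path.join(proc_root, str(pid), "cgroup")) as f:
            lines = f.read().strip().splitlines()
    except FileNotFoundError:
        return None
    # Each line is 'hierarchy-id:controllers:cgroup-path'.
    return {ctl: path for _, ctl, path in (l.split(":", 2) for l in lines)}

# Reading the cgroups of the current process as a stand-in for the
# container pid; on a non-Linux host this simply returns None.
cached = read_cgroups(os.getpid())
```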
