[jira] [Commented] (YARN-8648) Container cgroups are leaked when using docker
[ https://issues.apache.org/jira/browse/YARN-8648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16619730#comment-16619730 ] Hudson commented on YARN-8648: -- SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #14997 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/14997/]) YARN-8648. Container cgroups are leaked when using docker. Contributed (jlowe: rev 2df0a8dcb3dfde15d216481cc1296d97d2cb5d43) * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/container-executor.h * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/test/test-container-executor.c * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/linux/runtime/docker/TestDockerCommandExecutor.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/LinuxContainerExecutor.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/main.c * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/linux/runtime/DockerLinuxContainerRuntime.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/linux/runtime/docker/DockerRmCommand.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/linux/runtime/docker/TestDockerRmCommand.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/linux/resources/ResourceHandlerModule.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/container-executor.c > Container cgroups are leaked when using docker > -- > > Key: YARN-8648 > URL: https://issues.apache.org/jira/browse/YARN-8648 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Jim Brennan >Assignee: Jim Brennan >Priority: Major > Labels: Docker > Fix For: 3.2.0, 3.1.2 > > Attachments: YARN-8648.001.patch, YARN-8648.002.patch, > YARN-8648.003.patch, YARN-8648.004.patch, YARN-8648.005.patch, > YARN-8648.006.patch > > > When you run with docker and enable cgroups for cpu, docker creates cgroups > for all resources on the system, not just for cpu. For instance, if the > {{yarn.nodemanager.linux-container-executor.cgroups.hierarchy=/hadoop-yarn}}, > the nodemanager will create a cgroup for each container under > {{/sys/fs/cgroup/cpu/hadoop-yarn}}. In the docker case, we pass this path > via the {{--cgroup-parent}} command line argument. Docker then creates a > cgroup for the docker container under that, for instance: > {{/sys/fs/cgroup/cpu/hadoop-yarn/container_id/docker_container_id}}. > When the container exits, docker cleans up the {{docker_container_id}} > cgroup, and the nodemanager cleans up the {{container_id}} cgroup, All is > good under {{/sys/fs/cgroup/hadoop-yarn}}. > The problem is that docker also creates that same hierarchy under every > resource under {{/sys/fs/cgroup}}. On the rhel7 system I am using, these > are: blkio, cpuset, devices, freezer, hugetlb, memory, net_cls, net_prio, > perf_event, and systemd.So for instance, docker creates > {{/sys/fs/cgroup/cpuset/hadoop-yarn/container_id/docker_container_id}}, but > it only cleans up the leaf cgroup {{docker_container_id}}. Nobody cleans up > the {{container_id}} cgroups for these other resources. On one of our busy > clusters, we found > 100,000 of these leaked cgroups. > I found this in our 2.8-based version of hadoop, but I have been able to > repro with current hadoop. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8648) Container cgroups are leaked when using docker
[ https://issues.apache.org/jira/browse/YARN-8648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16619670#comment-16619670 ] Jason Lowe commented on YARN-8648: -- Thanks, [~billie.rinaldi]! Looks like this is good to go then. Committing this. > Container cgroups are leaked when using docker > -- > > Key: YARN-8648 > URL: https://issues.apache.org/jira/browse/YARN-8648 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Jim Brennan >Assignee: Jim Brennan >Priority: Major > Labels: Docker > Attachments: YARN-8648.001.patch, YARN-8648.002.patch, > YARN-8648.003.patch, YARN-8648.004.patch, YARN-8648.005.patch, > YARN-8648.006.patch > > > When you run with docker and enable cgroups for cpu, docker creates cgroups > for all resources on the system, not just for cpu. For instance, if the > {{yarn.nodemanager.linux-container-executor.cgroups.hierarchy=/hadoop-yarn}}, > the nodemanager will create a cgroup for each container under > {{/sys/fs/cgroup/cpu/hadoop-yarn}}. In the docker case, we pass this path > via the {{--cgroup-parent}} command line argument. Docker then creates a > cgroup for the docker container under that, for instance: > {{/sys/fs/cgroup/cpu/hadoop-yarn/container_id/docker_container_id}}. > When the container exits, docker cleans up the {{docker_container_id}} > cgroup, and the nodemanager cleans up the {{container_id}} cgroup, All is > good under {{/sys/fs/cgroup/hadoop-yarn}}. > The problem is that docker also creates that same hierarchy under every > resource under {{/sys/fs/cgroup}}. On the rhel7 system I am using, these > are: blkio, cpuset, devices, freezer, hugetlb, memory, net_cls, net_prio, > perf_event, and systemd.So for instance, docker creates > {{/sys/fs/cgroup/cpuset/hadoop-yarn/container_id/docker_container_id}}, but > it only cleans up the leaf cgroup {{docker_container_id}}. Nobody cleans up > the {{container_id}} cgroups for these other resources. On one of our busy > clusters, we found > 100,000 of these leaked cgroups. > I found this in our 2.8-based version of hadoop, but I have been able to > repro with current hadoop. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8648) Container cgroups are leaked when using docker
[ https://issues.apache.org/jira/browse/YARN-8648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16619655#comment-16619655 ] Billie Rinaldi commented on YARN-8648: -- Thanks for checking in, [~Jim_Brennan] and [~jlowe]. I don't have any issues with the new behavior of the remove operation failing when cgroups fail to be cleaned up. > Container cgroups are leaked when using docker > -- > > Key: YARN-8648 > URL: https://issues.apache.org/jira/browse/YARN-8648 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Jim Brennan >Assignee: Jim Brennan >Priority: Major > Labels: Docker > Attachments: YARN-8648.001.patch, YARN-8648.002.patch, > YARN-8648.003.patch, YARN-8648.004.patch, YARN-8648.005.patch, > YARN-8648.006.patch > > > When you run with docker and enable cgroups for cpu, docker creates cgroups > for all resources on the system, not just for cpu. For instance, if the > {{yarn.nodemanager.linux-container-executor.cgroups.hierarchy=/hadoop-yarn}}, > the nodemanager will create a cgroup for each container under > {{/sys/fs/cgroup/cpu/hadoop-yarn}}. In the docker case, we pass this path > via the {{--cgroup-parent}} command line argument. Docker then creates a > cgroup for the docker container under that, for instance: > {{/sys/fs/cgroup/cpu/hadoop-yarn/container_id/docker_container_id}}. > When the container exits, docker cleans up the {{docker_container_id}} > cgroup, and the nodemanager cleans up the {{container_id}} cgroup, All is > good under {{/sys/fs/cgroup/hadoop-yarn}}. > The problem is that docker also creates that same hierarchy under every > resource under {{/sys/fs/cgroup}}. On the rhel7 system I am using, these > are: blkio, cpuset, devices, freezer, hugetlb, memory, net_cls, net_prio, > perf_event, and systemd.So for instance, docker creates > {{/sys/fs/cgroup/cpuset/hadoop-yarn/container_id/docker_container_id}}, but > it only cleans up the leaf cgroup {{docker_container_id}}. Nobody cleans up > the {{container_id}} cgroups for these other resources. On one of our busy > clusters, we found > 100,000 of these leaked cgroups. > I found this in our 2.8-based version of hadoop, but I have been able to > repro with current hadoop. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8648) Container cgroups are leaked when using docker
[ https://issues.apache.org/jira/browse/YARN-8648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16614019#comment-16614019 ] Jason Lowe commented on YARN-8648: -- Thanks for updating the patch! +1 lgtm. Waiting to hear back from [~billie.rinaldi] about the docker-in-docker use case before committing. > Container cgroups are leaked when using docker > -- > > Key: YARN-8648 > URL: https://issues.apache.org/jira/browse/YARN-8648 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Jim Brennan >Assignee: Jim Brennan >Priority: Major > Labels: Docker > Attachments: YARN-8648.001.patch, YARN-8648.002.patch, > YARN-8648.003.patch, YARN-8648.004.patch, YARN-8648.005.patch, > YARN-8648.006.patch > > > When you run with docker and enable cgroups for cpu, docker creates cgroups > for all resources on the system, not just for cpu. For instance, if the > {{yarn.nodemanager.linux-container-executor.cgroups.hierarchy=/hadoop-yarn}}, > the nodemanager will create a cgroup for each container under > {{/sys/fs/cgroup/cpu/hadoop-yarn}}. In the docker case, we pass this path > via the {{--cgroup-parent}} command line argument. Docker then creates a > cgroup for the docker container under that, for instance: > {{/sys/fs/cgroup/cpu/hadoop-yarn/container_id/docker_container_id}}. > When the container exits, docker cleans up the {{docker_container_id}} > cgroup, and the nodemanager cleans up the {{container_id}} cgroup, All is > good under {{/sys/fs/cgroup/hadoop-yarn}}. > The problem is that docker also creates that same hierarchy under every > resource under {{/sys/fs/cgroup}}. On the rhel7 system I am using, these > are: blkio, cpuset, devices, freezer, hugetlb, memory, net_cls, net_prio, > perf_event, and systemd.So for instance, docker creates > {{/sys/fs/cgroup/cpuset/hadoop-yarn/container_id/docker_container_id}}, but > it only cleans up the leaf cgroup {{docker_container_id}}. Nobody cleans up > the {{container_id}} cgroups for these other resources. On one of our busy > clusters, we found > 100,000 of these leaked cgroups. > I found this in our 2.8-based version of hadoop, but I have been able to > repro with current hadoop. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8648) Container cgroups are leaked when using docker
[ https://issues.apache.org/jira/browse/YARN-8648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16613834#comment-16613834 ] Hadoop QA commented on YARN-8648: - | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 25s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 3 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 1s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 58s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 22s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 35s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 17s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 56s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 22s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 36s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 57s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} cc {color} | {color:green} 0m 57s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 57s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 20s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 33s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 28s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 5s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 22s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 19m 4s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 22s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 68m 49s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:4b8c2b1 | | JIRA Issue | YARN-8648 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12939580/YARN-8648.006.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle cc | | uname | Linux 2ec2c6408a2d 4.4.0-133-generic #159-Ubuntu SMP Fri Aug 10 07:31:43 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / e1b242a | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_181 | | findbugs | v3.1.0-RC1 | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/21831/testReport/ | | Max. process+thread count | 465 (vs. ulimit of 1) | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/21831/console | | Powered by | Apache Yetus 0.8.0
[jira] [Commented] (YARN-8648) Container cgroups are leaked when using docker
[ https://issues.apache.org/jira/browse/YARN-8648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16613729#comment-16613729 ] Jim Brennan commented on YARN-8648: --- [~jlowe] thanks for the review. I've made the change you suggested in patch 006. > Container cgroups are leaked when using docker > -- > > Key: YARN-8648 > URL: https://issues.apache.org/jira/browse/YARN-8648 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Jim Brennan >Assignee: Jim Brennan >Priority: Major > Labels: Docker > Attachments: YARN-8648.001.patch, YARN-8648.002.patch, > YARN-8648.003.patch, YARN-8648.004.patch, YARN-8648.005.patch, > YARN-8648.006.patch > > > When you run with docker and enable cgroups for cpu, docker creates cgroups > for all resources on the system, not just for cpu. For instance, if the > {{yarn.nodemanager.linux-container-executor.cgroups.hierarchy=/hadoop-yarn}}, > the nodemanager will create a cgroup for each container under > {{/sys/fs/cgroup/cpu/hadoop-yarn}}. In the docker case, we pass this path > via the {{--cgroup-parent}} command line argument. Docker then creates a > cgroup for the docker container under that, for instance: > {{/sys/fs/cgroup/cpu/hadoop-yarn/container_id/docker_container_id}}. > When the container exits, docker cleans up the {{docker_container_id}} > cgroup, and the nodemanager cleans up the {{container_id}} cgroup, All is > good under {{/sys/fs/cgroup/hadoop-yarn}}. > The problem is that docker also creates that same hierarchy under every > resource under {{/sys/fs/cgroup}}. On the rhel7 system I am using, these > are: blkio, cpuset, devices, freezer, hugetlb, memory, net_cls, net_prio, > perf_event, and systemd.So for instance, docker creates > {{/sys/fs/cgroup/cpuset/hadoop-yarn/container_id/docker_container_id}}, but > it only cleans up the leaf cgroup {{docker_container_id}}. Nobody cleans up > the {{container_id}} cgroups for these other resources. On one of our busy > clusters, we found > 100,000 of these leaked cgroups. > I found this in our 2.8-based version of hadoop, but I have been able to > repro with current hadoop. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8648) Container cgroups are leaked when using docker
[ https://issues.apache.org/jira/browse/YARN-8648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16612788#comment-16612788 ] Jason Lowe commented on YARN-8648: -- Thanks for updating the patch! It seems a bit awkward that remove_docker_container and exec_docker_container receives an array of argument pointers and the number of pointers but then also an offset into that array to start parsing. I think it would be simpler if the functions interpreted their args parameters as-is, and it was up to the caller to pass only relevant arguments. For example, main.c could pass (argv + optind, argc - optind) for argv and argc, respectively. It's cheap, easy, and simplifies the logic in those docker functions since they don't have to deal with skipping unrelated arguments. Otherwise patch looks good to me. > Container cgroups are leaked when using docker > -- > > Key: YARN-8648 > URL: https://issues.apache.org/jira/browse/YARN-8648 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Jim Brennan >Assignee: Jim Brennan >Priority: Major > Labels: Docker > Attachments: YARN-8648.001.patch, YARN-8648.002.patch, > YARN-8648.003.patch, YARN-8648.004.patch, YARN-8648.005.patch > > > When you run with docker and enable cgroups for cpu, docker creates cgroups > for all resources on the system, not just for cpu. For instance, if the > {{yarn.nodemanager.linux-container-executor.cgroups.hierarchy=/hadoop-yarn}}, > the nodemanager will create a cgroup for each container under > {{/sys/fs/cgroup/cpu/hadoop-yarn}}. In the docker case, we pass this path > via the {{--cgroup-parent}} command line argument. Docker then creates a > cgroup for the docker container under that, for instance: > {{/sys/fs/cgroup/cpu/hadoop-yarn/container_id/docker_container_id}}. > When the container exits, docker cleans up the {{docker_container_id}} > cgroup, and the nodemanager cleans up the {{container_id}} cgroup, All is > good under {{/sys/fs/cgroup/hadoop-yarn}}. > The problem is that docker also creates that same hierarchy under every > resource under {{/sys/fs/cgroup}}. On the rhel7 system I am using, these > are: blkio, cpuset, devices, freezer, hugetlb, memory, net_cls, net_prio, > perf_event, and systemd.So for instance, docker creates > {{/sys/fs/cgroup/cpuset/hadoop-yarn/container_id/docker_container_id}}, but > it only cleans up the leaf cgroup {{docker_container_id}}. Nobody cleans up > the {{container_id}} cgroups for these other resources. On one of our busy > clusters, we found > 100,000 of these leaked cgroups. > I found this in our 2.8-based version of hadoop, but I have been able to > repro with current hadoop. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8648) Container cgroups are leaked when using docker
[ https://issues.apache.org/jira/browse/YARN-8648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16612632#comment-16612632 ] Hadoop QA commented on YARN-8648: - | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 24s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 3 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 18m 52s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 59s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 23s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 34s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 35s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 54s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 21s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 34s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 54s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} cc {color} | {color:green} 0m 54s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 54s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 20s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 34s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 47s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 2s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 20s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 19m 0s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 21s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 68m 56s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:4b8c2b1 | | JIRA Issue | YARN-8648 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12939448/YARN-8648.005.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle cc | | uname | Linux ed93838db598 4.4.0-133-generic #159-Ubuntu SMP Fri Aug 10 07:31:43 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 1f6c454 | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_181 | | findbugs | v3.1.0-RC1 | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/21826/testReport/ | | Max. process+thread count | 409 (vs. ulimit of 1) | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/21826/console | | Powered by | Apache Yetus 0.8.0
[jira] [Commented] (YARN-8648) Container cgroups are leaked when using docker
[ https://issues.apache.org/jira/browse/YARN-8648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16612556#comment-16612556 ] Jim Brennan commented on YARN-8648: --- Thanks [~jlowe] for the review! I have addressed all of this issues you raised in patch 005. I did change it so we no longer ignore EBUSY errors. This might cause remove operations for containers in a docker-in-docker situation to be reported as failed even though the container was removed (but some of the cgroups were not). I think the only impact would be log messages to that effect. [~billie.rinaldi], are you ok with this? > Container cgroups are leaked when using docker > -- > > Key: YARN-8648 > URL: https://issues.apache.org/jira/browse/YARN-8648 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Jim Brennan >Assignee: Jim Brennan >Priority: Major > Labels: Docker > Attachments: YARN-8648.001.patch, YARN-8648.002.patch, > YARN-8648.003.patch, YARN-8648.004.patch, YARN-8648.005.patch > > > When you run with docker and enable cgroups for cpu, docker creates cgroups > for all resources on the system, not just for cpu. For instance, if the > {{yarn.nodemanager.linux-container-executor.cgroups.hierarchy=/hadoop-yarn}}, > the nodemanager will create a cgroup for each container under > {{/sys/fs/cgroup/cpu/hadoop-yarn}}. In the docker case, we pass this path > via the {{--cgroup-parent}} command line argument. Docker then creates a > cgroup for the docker container under that, for instance: > {{/sys/fs/cgroup/cpu/hadoop-yarn/container_id/docker_container_id}}. > When the container exits, docker cleans up the {{docker_container_id}} > cgroup, and the nodemanager cleans up the {{container_id}} cgroup, All is > good under {{/sys/fs/cgroup/hadoop-yarn}}. > The problem is that docker also creates that same hierarchy under every > resource under {{/sys/fs/cgroup}}. On the rhel7 system I am using, these > are: blkio, cpuset, devices, freezer, hugetlb, memory, net_cls, net_prio, > perf_event, and systemd.So for instance, docker creates > {{/sys/fs/cgroup/cpuset/hadoop-yarn/container_id/docker_container_id}}, but > it only cleans up the leaf cgroup {{docker_container_id}}. Nobody cleans up > the {{container_id}} cgroups for these other resources. On one of our busy > clusters, we found > 100,000 of these leaked cgroups. > I found this in our 2.8-based version of hadoop, but I have been able to > repro with current hadoop. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8648) Container cgroups are leaked when using docker
[ https://issues.apache.org/jira/browse/YARN-8648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16609735#comment-16609735 ] Jason Lowe commented on YARN-8648: -- Thanks for updating the patch! Should DockerRmCommand take the cgroup hierarchy or null argument in the constructor? It's a bit weird that it requires a container ID in the constructor but not the cgroup hierarchy, yet callers need to check if they need to pass the hierarchy in order to use it properly. Typically it's safer to put such things in the constructor so callers have to think about it. Do we really want to ignore EBUSY errors when trying to remove the cgroup entries? I think this is here for the docker-in-docker use-case that share the same cgroup parent, but it also suppresses useful error messages when the code tries to remove an entry and fails to do so. As written now, the patch will silently fail to remove cgroup entries that are still being used in all cases which seems less than ideal. There is already a {{validate_container_id}} in string-utils.c that the code should leverage to check if the argument is a container ID. The dir_exists check seems extraneous since the code already checks for ENOENT. As you mentioned above, the entry could be deleted after the check anyway. The code makes a system call to avoid a system call, so it won't be much of an optimization in practice. Now that optind is not being passed explicitly to exec_docker_command, do we really want exec_docker_command to examine/modify the global optind variable? What if someone wants to exec multiple docker commands consecutively with very different argc/argv values? Currently the caller would have to be aware of the fact that exec_docker_command is using the global optind to calculate argument offsets in the loop, but not the global argc/argv values, and manually fixup optind between invocations to get it to work properly. Regarding the "Don't increment optind here" comment, it would be good to elaborate a bit more on why an increment would be bad here. Otherwise the comment is simply parroting the code which isn't as helpful. If we fixup the problems with exec_docker_command and optind in the previous issue then I'm hoping there wouldn't be a need for careful optind management and comments here. > Container cgroups are leaked when using docker > -- > > Key: YARN-8648 > URL: https://issues.apache.org/jira/browse/YARN-8648 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Jim Brennan >Assignee: Jim Brennan >Priority: Major > Labels: Docker > Attachments: YARN-8648.001.patch, YARN-8648.002.patch, > YARN-8648.003.patch, YARN-8648.004.patch > > > When you run with docker and enable cgroups for cpu, docker creates cgroups > for all resources on the system, not just for cpu. For instance, if the > {{yarn.nodemanager.linux-container-executor.cgroups.hierarchy=/hadoop-yarn}}, > the nodemanager will create a cgroup for each container under > {{/sys/fs/cgroup/cpu/hadoop-yarn}}. In the docker case, we pass this path > via the {{--cgroup-parent}} command line argument. Docker then creates a > cgroup for the docker container under that, for instance: > {{/sys/fs/cgroup/cpu/hadoop-yarn/container_id/docker_container_id}}. > When the container exits, docker cleans up the {{docker_container_id}} > cgroup, and the nodemanager cleans up the {{container_id}} cgroup, All is > good under {{/sys/fs/cgroup/hadoop-yarn}}. > The problem is that docker also creates that same hierarchy under every > resource under {{/sys/fs/cgroup}}. On the rhel7 system I am using, these > are: blkio, cpuset, devices, freezer, hugetlb, memory, net_cls, net_prio, > perf_event, and systemd.So for instance, docker creates > {{/sys/fs/cgroup/cpuset/hadoop-yarn/container_id/docker_container_id}}, but > it only cleans up the leaf cgroup {{docker_container_id}}. Nobody cleans up > the {{container_id}} cgroups for these other resources. On one of our busy > clusters, we found > 100,000 of these leaked cgroups. > I found this in our 2.8-based version of hadoop, but I have been able to > repro with current hadoop. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8648) Container cgroups are leaked when using docker
[ https://issues.apache.org/jira/browse/YARN-8648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16609206#comment-16609206 ] Jim Brennan commented on YARN-8648: --- This is ready for review. > Container cgroups are leaked when using docker > -- > > Key: YARN-8648 > URL: https://issues.apache.org/jira/browse/YARN-8648 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Jim Brennan >Assignee: Jim Brennan >Priority: Major > Labels: Docker > Attachments: YARN-8648.001.patch, YARN-8648.002.patch, > YARN-8648.003.patch, YARN-8648.004.patch > > > When you run with docker and enable cgroups for cpu, docker creates cgroups > for all resources on the system, not just for cpu. For instance, if the > {{yarn.nodemanager.linux-container-executor.cgroups.hierarchy=/hadoop-yarn}}, > the nodemanager will create a cgroup for each container under > {{/sys/fs/cgroup/cpu/hadoop-yarn}}. In the docker case, we pass this path > via the {{--cgroup-parent}} command line argument. Docker then creates a > cgroup for the docker container under that, for instance: > {{/sys/fs/cgroup/cpu/hadoop-yarn/container_id/docker_container_id}}. > When the container exits, docker cleans up the {{docker_container_id}} > cgroup, and the nodemanager cleans up the {{container_id}} cgroup, All is > good under {{/sys/fs/cgroup/hadoop-yarn}}. > The problem is that docker also creates that same hierarchy under every > resource under {{/sys/fs/cgroup}}. On the rhel7 system I am using, these > are: blkio, cpuset, devices, freezer, hugetlb, memory, net_cls, net_prio, > perf_event, and systemd.So for instance, docker creates > {{/sys/fs/cgroup/cpuset/hadoop-yarn/container_id/docker_container_id}}, but > it only cleans up the leaf cgroup {{docker_container_id}}. Nobody cleans up > the {{container_id}} cgroups for these other resources. On one of our busy > clusters, we found > 100,000 of these leaked cgroups. > I found this in our 2.8-based version of hadoop, but I have been able to > repro with current hadoop. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8648) Container cgroups are leaked when using docker
[ https://issues.apache.org/jira/browse/YARN-8648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16608225#comment-16608225 ] Hadoop QA commented on YARN-8648: - | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 27s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 3 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 29s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 3s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 23s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 35s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 14s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 56s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 21s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 34s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 54s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} cc {color} | {color:green} 0m 54s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 54s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 19s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 32s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 33s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 0s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 21s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 18m 45s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 22s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 68m 50s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:ba1ab08 | | JIRA Issue | YARN-8648 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12938970/YARN-8648.004.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle cc | | uname | Linux 4633a8fb99fe 4.4.0-133-generic #159-Ubuntu SMP Fri Aug 10 07:31:43 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / bf8a175 | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_181 | | findbugs | v3.1.0-RC1 | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/21791/testReport/ | | Max. process+thread count | 440 (vs. ulimit of 1) | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/21791/console | | Powered by | Apache Yetus 0.8.0
[jira] [Commented] (YARN-8648) Container cgroups are leaked when using docker
[ https://issues.apache.org/jira/browse/YARN-8648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16608212#comment-16608212 ] Jim Brennan commented on YARN-8648: --- I have uploaded a patch that adds the cgroup cleanup to the DockerRmCommand. This also includes some fixes for exec_docker_command() in container-executor.c. * No longer passes optind, which was eclipsing the global variable of the same name * Fix stack overrun error - was allocating using sizeof(char) instead of sizeof(char *) * Use optind for indexing into argv instead of assuming args start at 2. Added a wrapper function remove_docker_container() that forks and calls exec_docker_command() in the child so I could add the cgroup cleanup after it finishes. > Container cgroups are leaked when using docker > -- > > Key: YARN-8648 > URL: https://issues.apache.org/jira/browse/YARN-8648 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Jim Brennan >Assignee: Jim Brennan >Priority: Major > Labels: Docker > Attachments: YARN-8648.001.patch, YARN-8648.002.patch, > YARN-8648.003.patch, YARN-8648.004.patch > > > When you run with docker and enable cgroups for cpu, docker creates cgroups > for all resources on the system, not just for cpu. For instance, if the > {{yarn.nodemanager.linux-container-executor.cgroups.hierarchy=/hadoop-yarn}}, > the nodemanager will create a cgroup for each container under > {{/sys/fs/cgroup/cpu/hadoop-yarn}}. In the docker case, we pass this path > via the {{--cgroup-parent}} command line argument. Docker then creates a > cgroup for the docker container under that, for instance: > {{/sys/fs/cgroup/cpu/hadoop-yarn/container_id/docker_container_id}}. > When the container exits, docker cleans up the {{docker_container_id}} > cgroup, and the nodemanager cleans up the {{container_id}} cgroup, All is > good under {{/sys/fs/cgroup/hadoop-yarn}}. > The problem is that docker also creates that same hierarchy under every > resource under {{/sys/fs/cgroup}}. On the rhel7 system I am using, these > are: blkio, cpuset, devices, freezer, hugetlb, memory, net_cls, net_prio, > perf_event, and systemd.So for instance, docker creates > {{/sys/fs/cgroup/cpuset/hadoop-yarn/container_id/docker_container_id}}, but > it only cleans up the leaf cgroup {{docker_container_id}}. Nobody cleans up > the {{container_id}} cgroups for these other resources. On one of our busy > clusters, we found > 100,000 of these leaked cgroups. > I found this in our 2.8-based version of hadoop, but I have been able to > repro with current hadoop. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8648) Container cgroups are leaked when using docker
[ https://issues.apache.org/jira/browse/YARN-8648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16603703#comment-16603703 ] Hadoop QA commented on YARN-8648: - | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 22s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 18m 37s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 4s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 26s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 38s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 16s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 55s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 24s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 35s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 57s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} cc {color} | {color:green} 0m 57s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 57s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 22s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 33s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 22s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 59s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 22s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 18m 24s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 24s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 69m 49s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:ba1ab08 | | JIRA Issue | YARN-8648 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12938362/YARN-8648.003.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle cc | | uname | Linux 4c8465cf7fc9 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 08:52:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 9964e33 | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_181 | | findbugs | v3.1.0-RC1 | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/21760/testReport/ | | Max. process+thread count | 303 (vs. ulimit of 1) | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/21760/console | | Powered by | Apache Yetus 0.8.0
[jira] [Commented] (YARN-8648) Container cgroups are leaked when using docker
[ https://issues.apache.org/jira/browse/YARN-8648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16603606#comment-16603606 ] Jim Brennan commented on YARN-8648: --- Put up another patch to fix the checkstyle issue. > Container cgroups are leaked when using docker > -- > > Key: YARN-8648 > URL: https://issues.apache.org/jira/browse/YARN-8648 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Jim Brennan >Assignee: Jim Brennan >Priority: Major > Labels: Docker > Attachments: YARN-8648.001.patch, YARN-8648.002.patch, > YARN-8648.003.patch > > > When you run with docker and enable cgroups for cpu, docker creates cgroups > for all resources on the system, not just for cpu. For instance, if the > {{yarn.nodemanager.linux-container-executor.cgroups.hierarchy=/hadoop-yarn}}, > the nodemanager will create a cgroup for each container under > {{/sys/fs/cgroup/cpu/hadoop-yarn}}. In the docker case, we pass this path > via the {{--cgroup-parent}} command line argument. Docker then creates a > cgroup for the docker container under that, for instance: > {{/sys/fs/cgroup/cpu/hadoop-yarn/container_id/docker_container_id}}. > When the container exits, docker cleans up the {{docker_container_id}} > cgroup, and the nodemanager cleans up the {{container_id}} cgroup, All is > good under {{/sys/fs/cgroup/hadoop-yarn}}. > The problem is that docker also creates that same hierarchy under every > resource under {{/sys/fs/cgroup}}. On the rhel7 system I am using, these > are: blkio, cpuset, devices, freezer, hugetlb, memory, net_cls, net_prio, > perf_event, and systemd.So for instance, docker creates > {{/sys/fs/cgroup/cpuset/hadoop-yarn/container_id/docker_container_id}}, but > it only cleans up the leaf cgroup {{docker_container_id}}. Nobody cleans up > the {{container_id}} cgroups for these other resources. On one of our busy > clusters, we found > 100,000 of these leaked cgroups. > I found this in our 2.8-based version of hadoop, but I have been able to > repro with current hadoop. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8648) Container cgroups are leaked when using docker
[ https://issues.apache.org/jira/browse/YARN-8648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16603269#comment-16603269 ] Hadoop QA commented on YARN-8648: - | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 19s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 18m 30s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 59s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 25s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 36s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 10s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 54s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 24s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 35s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 56s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} cc {color} | {color:green} 0m 56s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 56s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 21s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager: The patch generated 1 new + 10 unchanged - 0 fixed = 11 total (was 10) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 32s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 22s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 2s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 22s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 18m 20s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 23s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 69m 19s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:ba1ab08 | | JIRA Issue | YARN-8648 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12938265/YARN-8648.002.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle cc | | uname | Linux 5003ee282ea4 3.13.0-143-generic #192-Ubuntu SMP Tue Feb 27 10:45:36 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 6e5ffb7 | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_181 | | findbugs | v3.1.0-RC1 | | checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/21755/artifact/out/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/21755/testReport/ | | Max. process+thread count | 301 (vs.
[jira] [Commented] (YARN-8648) Container cgroups are leaked when using docker
[ https://issues.apache.org/jira/browse/YARN-8648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16603202#comment-16603202 ] Jim Brennan commented on YARN-8648: --- I've uploaded a patch that addresses most of issues raised by [~jlowe] - except for moving the functionality to the Docker RM command - I wanted to put up these other changes before reworking that part. I misspoke in my earlier comment - I don't think any change would be needed to DockerContainerDeletionService, because it ends up calling LinuxContainerExecutor.removeDockerContainer(), which can lookup the yarn-hierarchy. My only reservation to moving this to the DockerRmCommand is that most (if not all) arguments to Docker*Commands are actual command line arguments for the docker command. This would be an exception to that. Not sure how much that matters, because I agree this cleanup does naturally align with removing the container. > Container cgroups are leaked when using docker > -- > > Key: YARN-8648 > URL: https://issues.apache.org/jira/browse/YARN-8648 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Jim Brennan >Assignee: Jim Brennan >Priority: Major > Labels: Docker > Attachments: YARN-8648.001.patch, YARN-8648.002.patch > > > When you run with docker and enable cgroups for cpu, docker creates cgroups > for all resources on the system, not just for cpu. For instance, if the > {{yarn.nodemanager.linux-container-executor.cgroups.hierarchy=/hadoop-yarn}}, > the nodemanager will create a cgroup for each container under > {{/sys/fs/cgroup/cpu/hadoop-yarn}}. In the docker case, we pass this path > via the {{--cgroup-parent}} command line argument. Docker then creates a > cgroup for the docker container under that, for instance: > {{/sys/fs/cgroup/cpu/hadoop-yarn/container_id/docker_container_id}}. > When the container exits, docker cleans up the {{docker_container_id}} > cgroup, and the nodemanager cleans up the {{container_id}} cgroup, All is > good under {{/sys/fs/cgroup/hadoop-yarn}}. > The problem is that docker also creates that same hierarchy under every > resource under {{/sys/fs/cgroup}}. On the rhel7 system I am using, these > are: blkio, cpuset, devices, freezer, hugetlb, memory, net_cls, net_prio, > perf_event, and systemd.So for instance, docker creates > {{/sys/fs/cgroup/cpuset/hadoop-yarn/container_id/docker_container_id}}, but > it only cleans up the leaf cgroup {{docker_container_id}}. Nobody cleans up > the {{container_id}} cgroups for these other resources. On one of our busy > clusters, we found > 100,000 of these leaked cgroups. > I found this in our 2.8-based version of hadoop, but I have been able to > repro with current hadoop. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8648) Container cgroups are leaked when using docker
[ https://issues.apache.org/jira/browse/YARN-8648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16599235#comment-16599235 ] Jim Brennan commented on YARN-8648: --- {quote} I explicitly check that the directory exists before calling rmdir, so I'm not sure this is necessary, but I can add it anyway. {quote} There is a small window where it could be removed between the exist check and the rmdir, so it is necessary. I'm tempted to just remove the dir_exists() check. > Container cgroups are leaked when using docker > -- > > Key: YARN-8648 > URL: https://issues.apache.org/jira/browse/YARN-8648 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Jim Brennan >Assignee: Jim Brennan >Priority: Major > Labels: Docker > Attachments: YARN-8648.001.patch > > > When you run with docker and enable cgroups for cpu, docker creates cgroups > for all resources on the system, not just for cpu. For instance, if the > {{yarn.nodemanager.linux-container-executor.cgroups.hierarchy=/hadoop-yarn}}, > the nodemanager will create a cgroup for each container under > {{/sys/fs/cgroup/cpu/hadoop-yarn}}. In the docker case, we pass this path > via the {{--cgroup-parent}} command line argument. Docker then creates a > cgroup for the docker container under that, for instance: > {{/sys/fs/cgroup/cpu/hadoop-yarn/container_id/docker_container_id}}. > When the container exits, docker cleans up the {{docker_container_id}} > cgroup, and the nodemanager cleans up the {{container_id}} cgroup, All is > good under {{/sys/fs/cgroup/hadoop-yarn}}. > The problem is that docker also creates that same hierarchy under every > resource under {{/sys/fs/cgroup}}. On the rhel7 system I am using, these > are: blkio, cpuset, devices, freezer, hugetlb, memory, net_cls, net_prio, > perf_event, and systemd.So for instance, docker creates > {{/sys/fs/cgroup/cpuset/hadoop-yarn/container_id/docker_container_id}}, but > it only cleans up the leaf cgroup {{docker_container_id}}. Nobody cleans up > the {{container_id}} cgroups for these other resources. On one of our busy > clusters, we found > 100,000 of these leaked cgroups. > I found this in our 2.8-based version of hadoop, but I have been able to > repro with current hadoop. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8648) Container cgroups are leaked when using docker
[ https://issues.apache.org/jira/browse/YARN-8648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16599213#comment-16599213 ] Jim Brennan commented on YARN-8648: --- [~jlowe] thanks for the review! {quote}Why was the postComplete call moved in reapContainer to before the container is removed via docker? Shouldn't docker first remove its cgroups for the container before we remove ours? {quote} I was trying to preserve the order of operations. Normally postComplete is called immediately after the launchContainer() returns, and then reapContainer() is called as part of cleanupContainer() processing. So the resource handlers usually get a chance to clean up cgroups before we cleanup the container. If we do the docker cleanup first, it will delete the cgroups before the resource handler postComplete processing - it doesn't know which ones are handled by resource handlers, so it just deletes them all. Since they both are really just deleting the cgroups, I don't think the order matters that much, so I will move it back if you think that is better. {quote}Is there a reason to separate removing docker cgroups from removing the docker container? This seems like a natural extension to cleaning up after a container run by docker, and that's already covered by the reap command. The patch would remain a docker-only change but without needing to modify the container-executor interface. {quote} It is currently being done as part of the reap processing, but as a separate privileged operation. We definitely could just add this processing to the remove-docker-container processing in container-executor, but it would require adding the yarn-hierarchy as an additional argument for the DockerRmCommand. This would also require changing the DockerContainerDeletionTask() to store the yarn-hierarchy String along with the ContainerId. Despite the additional container-executor interface, I think the current approach is less code/simpler, but I'm definitely willing to rework it if you think it is a better solution. {quote}Nit: PROC_MOUNT_PATH should be a macro (i.e.: #define) or lower-cased. Similar for CGROUP_MOUNT. {quote} I will fix these. {quote}The snprintf result should be checked for truncation in addition to output errors (i.e.: result >= PATH_MAX means it was truncated) otherwise we formulate an incomplete path targeted for deletion if that somehow occurs. Alternatively the code could use make_string or asprintf to allocate an appropriately sized buffer for each entry rather than trying to reuse a manually sized buffer. {quote} I will fix this. I forgot about make_string(). {quote}Is there any point in logging to the error file that a path we want to delete has already been deleted? This seems like it will just be noise, especially if systemd or something else is periodically cleaning some of these empty cgroups. {quote} I'll remove it - was nice while debugging, but not needed. {quote}Related to the previous comment, the rmdir result should be checked for ENOENT and treat that as success. {quote} I explicitly check that the directory exists before calling rmdir, so I'm not sure this is necessary, but I can add it anyway. {quote}Nit: I think lineptr should be freed in the cleanup label in case someone later adds a fatal error that jumps to cleanup. {quote} Will do. Thanks again for the review! > Container cgroups are leaked when using docker > -- > > Key: YARN-8648 > URL: https://issues.apache.org/jira/browse/YARN-8648 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Jim Brennan >Assignee: Jim Brennan >Priority: Major > Labels: Docker > Attachments: YARN-8648.001.patch > > > When you run with docker and enable cgroups for cpu, docker creates cgroups > for all resources on the system, not just for cpu. For instance, if the > {{yarn.nodemanager.linux-container-executor.cgroups.hierarchy=/hadoop-yarn}}, > the nodemanager will create a cgroup for each container under > {{/sys/fs/cgroup/cpu/hadoop-yarn}}. In the docker case, we pass this path > via the {{--cgroup-parent}} command line argument. Docker then creates a > cgroup for the docker container under that, for instance: > {{/sys/fs/cgroup/cpu/hadoop-yarn/container_id/docker_container_id}}. > When the container exits, docker cleans up the {{docker_container_id}} > cgroup, and the nodemanager cleans up the {{container_id}} cgroup, All is > good under {{/sys/fs/cgroup/hadoop-yarn}}. > The problem is that docker also creates that same hierarchy under every > resource under {{/sys/fs/cgroup}}. On the rhel7 system I am using, these > are: blkio, cpuset, devices, freezer, hugetlb, memory, net_cls, net_prio, > perf_event, and systemd.So for instance, docker creates >
[jira] [Commented] (YARN-8648) Container cgroups are leaked when using docker
[ https://issues.apache.org/jira/browse/YARN-8648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16596699#comment-16596699 ] Jason Lowe commented on YARN-8648: -- Thanks for the patch! Why was the postComplete call moved in reapContainer to before the container is removed via docker? Shouldn't docker first remove its cgroups for the container before we remove ours? Is there a reason to separate removing docker cgroups from removing the docker container? This seems like a natural extension to cleaning up after a container run by docker, and that's already covered by the reap command. The patch would remain a docker-only change but without needing to modify the container-executor interface. Nit: PROC_MOUNT_PATH should be a macro (i.e.: #define) or lower-cased. Similar for CGROUP_MOUNT. The snprintf result should be checked for truncation in addition to output errors (i.e.: result >= PATH_MAX means it was truncated) otherwise we formulate an incomplete path targeted for deletion if that somehow occurs. Alternatively the code could use make_string or asprintf to allocate an appropriately sized buffer for each entry rather than trying to reuse a manually sized buffer. Is there any point in logging to the error file that a path we want to delete has already been deleted? This seems like it will just be noise, especially if systemd or something else is periodically cleaning some of these empty cgroups. Related to the previous comment, the rmdir result should be checked for ENOENT and treat that as success. Nit: I think lineptr should be freed in the cleanup label in case someone later adds a fatal error that jumps to cleanup. > Container cgroups are leaked when using docker > -- > > Key: YARN-8648 > URL: https://issues.apache.org/jira/browse/YARN-8648 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Jim Brennan >Assignee: Jim Brennan >Priority: Major > Labels: Docker > Attachments: YARN-8648.001.patch > > > When you run with docker and enable cgroups for cpu, docker creates cgroups > for all resources on the system, not just for cpu. For instance, if the > {{yarn.nodemanager.linux-container-executor.cgroups.hierarchy=/hadoop-yarn}}, > the nodemanager will create a cgroup for each container under > {{/sys/fs/cgroup/cpu/hadoop-yarn}}. In the docker case, we pass this path > via the {{--cgroup-parent}} command line argument. Docker then creates a > cgroup for the docker container under that, for instance: > {{/sys/fs/cgroup/cpu/hadoop-yarn/container_id/docker_container_id}}. > When the container exits, docker cleans up the {{docker_container_id}} > cgroup, and the nodemanager cleans up the {{container_id}} cgroup, All is > good under {{/sys/fs/cgroup/hadoop-yarn}}. > The problem is that docker also creates that same hierarchy under every > resource under {{/sys/fs/cgroup}}. On the rhel7 system I am using, these > are: blkio, cpuset, devices, freezer, hugetlb, memory, net_cls, net_prio, > perf_event, and systemd.So for instance, docker creates > {{/sys/fs/cgroup/cpuset/hadoop-yarn/container_id/docker_container_id}}, but > it only cleans up the leaf cgroup {{docker_container_id}}. Nobody cleans up > the {{container_id}} cgroups for these other resources. On one of our busy > clusters, we found > 100,000 of these leaked cgroups. > I found this in our 2.8-based version of hadoop, but I have been able to > repro with current hadoop. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8648) Container cgroups are leaked when using docker
[ https://issues.apache.org/jira/browse/YARN-8648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16596345#comment-16596345 ] Jim Brennan commented on YARN-8648: --- Looks like this is ready for review. > Container cgroups are leaked when using docker > -- > > Key: YARN-8648 > URL: https://issues.apache.org/jira/browse/YARN-8648 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Jim Brennan >Assignee: Jim Brennan >Priority: Major > Labels: Docker > Attachments: YARN-8648.001.patch > > > When you run with docker and enable cgroups for cpu, docker creates cgroups > for all resources on the system, not just for cpu. For instance, if the > {{yarn.nodemanager.linux-container-executor.cgroups.hierarchy=/hadoop-yarn}}, > the nodemanager will create a cgroup for each container under > {{/sys/fs/cgroup/cpu/hadoop-yarn}}. In the docker case, we pass this path > via the {{--cgroup-parent}} command line argument. Docker then creates a > cgroup for the docker container under that, for instance: > {{/sys/fs/cgroup/cpu/hadoop-yarn/container_id/docker_container_id}}. > When the container exits, docker cleans up the {{docker_container_id}} > cgroup, and the nodemanager cleans up the {{container_id}} cgroup, All is > good under {{/sys/fs/cgroup/hadoop-yarn}}. > The problem is that docker also creates that same hierarchy under every > resource under {{/sys/fs/cgroup}}. On the rhel7 system I am using, these > are: blkio, cpuset, devices, freezer, hugetlb, memory, net_cls, net_prio, > perf_event, and systemd.So for instance, docker creates > {{/sys/fs/cgroup/cpuset/hadoop-yarn/container_id/docker_container_id}}, but > it only cleans up the leaf cgroup {{docker_container_id}}. Nobody cleans up > the {{container_id}} cgroups for these other resources. On one of our busy > clusters, we found > 100,000 of these leaked cgroups. > I found this in our 2.8-based version of hadoop, but I have been able to > repro with current hadoop. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8648) Container cgroups are leaked when using docker
[ https://issues.apache.org/jira/browse/YARN-8648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16595740#comment-16595740 ] genericqa commented on YARN-8648: - | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 24s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 18m 8s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 2s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 28s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 38s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 20s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 4s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 24s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 34s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 54s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} cc {color} | {color:green} 0m 54s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 54s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 20s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager: The patch generated 1 new + 10 unchanged - 0 fixed = 11 total (was 10) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 32s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 21s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 0s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 22s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 19m 11s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 1m 8s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 68m 59s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:ba1ab08 | | JIRA Issue | YARN-8648 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12937524/YARN-8648.001.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle cc | | uname | Linux add37685e48d 4.4.0-133-generic #159-Ubuntu SMP Fri Aug 10 07:31:43 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 3e18b95 | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_181 | | findbugs | v3.1.0-RC1 | | checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/21706/artifact/out/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/21706/testReport/ | | Max. process+thread count | 399 (vs.
[jira] [Commented] (YARN-8648) Container cgroups are leaked when using docker
[ https://issues.apache.org/jira/browse/YARN-8648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16587681#comment-16587681 ] Eric Badger commented on YARN-8648: --- IMO I like the idea of actually dealing with this problem via proposal 1, but it seems like a much bigger effort that has many corner cases and realistically is going to require a whole new resource module controller. Therefore, I think we shouldn't let perfect get in the way of good and move forward with the more simple approach of removing the cgroups via the container-executor. I don't _like_ this solution, but I think it is a stopgap until we are able to really fix the underlying issues. > Container cgroups are leaked when using docker > -- > > Key: YARN-8648 > URL: https://issues.apache.org/jira/browse/YARN-8648 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Jim Brennan >Assignee: Jim Brennan >Priority: Major > Labels: Docker > > When you run with docker and enable cgroups for cpu, docker creates cgroups > for all resources on the system, not just for cpu. For instance, if the > {{yarn.nodemanager.linux-container-executor.cgroups.hierarchy=/hadoop-yarn}}, > the nodemanager will create a cgroup for each container under > {{/sys/fs/cgroup/cpu/hadoop-yarn}}. In the docker case, we pass this path > via the {{--cgroup-parent}} command line argument. Docker then creates a > cgroup for the docker container under that, for instance: > {{/sys/fs/cgroup/cpu/hadoop-yarn/container_id/docker_container_id}}. > When the container exits, docker cleans up the {{docker_container_id}} > cgroup, and the nodemanager cleans up the {{container_id}} cgroup, All is > good under {{/sys/fs/cgroup/hadoop-yarn}}. > The problem is that docker also creates that same hierarchy under every > resource under {{/sys/fs/cgroup}}. On the rhel7 system I am using, these > are: blkio, cpuset, devices, freezer, hugetlb, memory, net_cls, net_prio, > perf_event, and systemd.So for instance, docker creates > {{/sys/fs/cgroup/cpuset/hadoop-yarn/container_id/docker_container_id}}, but > it only cleans up the leaf cgroup {{docker_container_id}}. Nobody cleans up > the {{container_id}} cgroups for these other resources. On one of our busy > clusters, we found > 100,000 of these leaked cgroups. > I found this in our 2.8-based version of hadoop, but I have been able to > repro with current hadoop. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8648) Container cgroups are leaked when using docker
[ https://issues.apache.org/jira/browse/YARN-8648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16586076#comment-16586076 ] Eric Yang commented on YARN-8648: - [~Jim_Brennan] {quote} I think this is mitigated if we use the "cgroup" section of the container-exectutor.cfg to constrain it. This is currently used to enable updating params, but I think it could be used for this as well.It already defines the CGROUPS_ROOT (e.g., /sys/fs/cgroup), and the YARN_HIERARCHY (e.g, hadoop-yarn). We could either add another config parameter to define the list of hierarchies to clean up (e.g, cpuset, freezer, hugetlb, etc...), or we can parse /proc/mounts to determine the full list. I think it's safer to add the config parameter. {quote} The proposal looks good. As long as the default list can match docker's cgroup usage when docker is enabled instead of being empty list. I think this solution can work. > Container cgroups are leaked when using docker > -- > > Key: YARN-8648 > URL: https://issues.apache.org/jira/browse/YARN-8648 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Jim Brennan >Assignee: Jim Brennan >Priority: Major > Labels: Docker > > When you run with docker and enable cgroups for cpu, docker creates cgroups > for all resources on the system, not just for cpu. For instance, if the > {{yarn.nodemanager.linux-container-executor.cgroups.hierarchy=/hadoop-yarn}}, > the nodemanager will create a cgroup for each container under > {{/sys/fs/cgroup/cpu/hadoop-yarn}}. In the docker case, we pass this path > via the {{--cgroup-parent}} command line argument. Docker then creates a > cgroup for the docker container under that, for instance: > {{/sys/fs/cgroup/cpu/hadoop-yarn/container_id/docker_container_id}}. > When the container exits, docker cleans up the {{docker_container_id}} > cgroup, and the nodemanager cleans up the {{container_id}} cgroup, All is > good under {{/sys/fs/cgroup/hadoop-yarn}}. > The problem is that docker also creates that same hierarchy under every > resource under {{/sys/fs/cgroup}}. On the rhel7 system I am using, these > are: blkio, cpuset, devices, freezer, hugetlb, memory, net_cls, net_prio, > perf_event, and systemd.So for instance, docker creates > {{/sys/fs/cgroup/cpuset/hadoop-yarn/container_id/docker_container_id}}, but > it only cleans up the leaf cgroup {{docker_container_id}}. Nobody cleans up > the {{container_id}} cgroups for these other resources. On one of our busy > clusters, we found > 100,000 of these leaked cgroups. > I found this in our 2.8-based version of hadoop, but I have been able to > repro with current hadoop. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8648) Container cgroups are leaked when using docker
[ https://issues.apache.org/jira/browse/YARN-8648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16586060#comment-16586060 ] Jim Brennan commented on YARN-8648: --- Thanks [~eyang]! My main concern about the minimal fix is the security aspect, since we will need to add an option to container-executor to tell it to delete all cgroups with a particular name as root (since docker will create them as root). I think this is mitigated if we use the "cgroup" section of the container-exectutor.cfg to constrain it. This is currently used to enable updating params, but I think it could be used for this as well. It already defines the CGROUPS_ROOT (e.g., /sys/fs/cgroup), and the YARN_HIERARCHY (e.g, hadoop-yarn). We could either add another config parameter to define the list of hierarchies to clean up (e.g, cpuset, freezer, hugetlb, etc...), or we can parse /proc/mounts to determine the full list. I think it's safer to add the config parameter. I will start working on this version unless there are objections? > Container cgroups are leaked when using docker > -- > > Key: YARN-8648 > URL: https://issues.apache.org/jira/browse/YARN-8648 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Jim Brennan >Assignee: Jim Brennan >Priority: Major > Labels: Docker > > When you run with docker and enable cgroups for cpu, docker creates cgroups > for all resources on the system, not just for cpu. For instance, if the > {{yarn.nodemanager.linux-container-executor.cgroups.hierarchy=/hadoop-yarn}}, > the nodemanager will create a cgroup for each container under > {{/sys/fs/cgroup/cpu/hadoop-yarn}}. In the docker case, we pass this path > via the {{--cgroup-parent}} command line argument. Docker then creates a > cgroup for the docker container under that, for instance: > {{/sys/fs/cgroup/cpu/hadoop-yarn/container_id/docker_container_id}}. > When the container exits, docker cleans up the {{docker_container_id}} > cgroup, and the nodemanager cleans up the {{container_id}} cgroup, All is > good under {{/sys/fs/cgroup/hadoop-yarn}}. > The problem is that docker also creates that same hierarchy under every > resource under {{/sys/fs/cgroup}}. On the rhel7 system I am using, these > are: blkio, cpuset, devices, freezer, hugetlb, memory, net_cls, net_prio, > perf_event, and systemd.So for instance, docker creates > {{/sys/fs/cgroup/cpuset/hadoop-yarn/container_id/docker_container_id}}, but > it only cleans up the leaf cgroup {{docker_container_id}}. Nobody cleans up > the {{container_id}} cgroups for these other resources. On one of our busy > clusters, we found > 100,000 of these leaked cgroups. > I found this in our 2.8-based version of hadoop, but I have been able to > repro with current hadoop. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8648) Container cgroups are leaked when using docker
[ https://issues.apache.org/jira/browse/YARN-8648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16584600#comment-16584600 ] Eric Yang commented on YARN-8648: - I am in favor of minimal fix at this time. Let docker be docker seems to have possibility to introduce risk where container-executor usage is either unaccounted, or Kernel underflow occurs. When cgroup is managed separately, there is a chance that YARN and Docker don't agree on actual usage. (i.e. 2GB allocated to container, but container-executor used 5mb, and docker used 2GB.) Miscalculated number can cause Kernel underflow. If YARN uses a flat namespace like docker does, then it will be safe to let docker be docker. In today's reality, there are people out there that don't have cgroup v2 in their kernel. This means it is less likely to be a feasible option to do major change to yarn cgroup, although it would be a great option to have two years down the road to support cgroup v2. > Container cgroups are leaked when using docker > -- > > Key: YARN-8648 > URL: https://issues.apache.org/jira/browse/YARN-8648 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Jim Brennan >Assignee: Jim Brennan >Priority: Major > Labels: Docker > > When you run with docker and enable cgroups for cpu, docker creates cgroups > for all resources on the system, not just for cpu. For instance, if the > {{yarn.nodemanager.linux-container-executor.cgroups.hierarchy=/hadoop-yarn}}, > the nodemanager will create a cgroup for each container under > {{/sys/fs/cgroup/cpu/hadoop-yarn}}. In the docker case, we pass this path > via the {{--cgroup-parent}} command line argument. Docker then creates a > cgroup for the docker container under that, for instance: > {{/sys/fs/cgroup/cpu/hadoop-yarn/container_id/docker_container_id}}. > When the container exits, docker cleans up the {{docker_container_id}} > cgroup, and the nodemanager cleans up the {{container_id}} cgroup, All is > good under {{/sys/fs/cgroup/hadoop-yarn}}. > The problem is that docker also creates that same hierarchy under every > resource under {{/sys/fs/cgroup}}. On the rhel7 system I am using, these > are: blkio, cpuset, devices, freezer, hugetlb, memory, net_cls, net_prio, > perf_event, and systemd.So for instance, docker creates > {{/sys/fs/cgroup/cpuset/hadoop-yarn/container_id/docker_container_id}}, but > it only cleans up the leaf cgroup {{docker_container_id}}. Nobody cleans up > the {{container_id}} cgroups for these other resources. On one of our busy > clusters, we found > 100,000 of these leaked cgroups. > I found this in our 2.8-based version of hadoop, but I have been able to > repro with current hadoop. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8648) Container cgroups are leaked when using docker
[ https://issues.apache.org/jira/browse/YARN-8648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16584479#comment-16584479 ] Jim Brennan commented on YARN-8648: --- I have been experimenting with the following incomplete approach: * CGroupsHandler ** Add missing controllers to the list of supported controllers ** Add initializeAllCGroupControllers() *** Initializes all of the cgroups controllers that were not already initialized by a ResourceHandler - this is mainly creating the hierarchy (hadoop-yarn) cgroup or verifying that it is there and writable. ** Add CreateCGroupAllControllers(containerId) *** Creates the containerId cgroup under all cgroup controllers ** Add DeleteCGroupAllControllers(containerId) *** Deletes the containerId cgroup under all cgroup controllers * ResourceHandlerModule ** Add wrappers to call the above methods. * LinuxContainerExecutor ** Add calls to above methods if the runtime is Docker (would probably be better to move these to the runtime) So far I have been testing with pre-mounted cgroup hierarchies. That is, I manually created the hadoop-yarn cgroup under each controller. I've run into several problems experimenting with this approach on RHEL 7: * The hadoop-yarn cgroup under the following controllers is being deleted by the system (when I let it sit idle for a while): blkio, devices, memory, pids ** I got around this for now by just not adding pids to the list and skipping the others in the new methods. We are not leaking cgroups for these controllers. * I am still leaking cgroups under /sys/fs/cgroup/systemd ** Even if I add "systemd" as one of the supported controllers, our mount-tab parsing code does not find it because it's not really a controller. * This feels pretty hacky - it might be better to just add a new dockerCGroupResourceHandler (as I mentioned above) to do effectively the same thing - we'd have to supply the list of controllers in a config property and deal with systemd. The way things are right now we would still have to add these to the list of supported controllers, because most of the interfaces are based on a controller enum. But even moving it to a separateResourceHandler still seems hacky. * I haven't tested the mount-cgroup path yet, but I believe we would need to configure all of the controllers that we need to mount in container-executor.cfg. The main advantage to something along these lines is that it preserves the existing cgroups hiearchy, and no additional code is needed to deal with cgroup parameters. The other advantage is that we are pre-creating the hadoop-yarn cgroups with the correct owner/permissions - docker creates them as root. At this point, I'm not sure if I should proceed with this approach and I'm looking for opinions. The options I am considering are: # The approach I've been experimenting with, cleaned up # The minimal, just-fix-the-leak approach, which would be to add a cleanupCGroups() method to the runtime. ** We call it after calling the ResourceHandlers.postComplete() in LCE. ** Docker would be the only runtime that implements it. ** We'd need to add a container-executor function to handle it. ** It could search for the containerId cgroup under all mounted cgroups and delete any that it finds *** Would not delete any that still have processes *** Security concerns? # The let-docker-be-docker approach ** This is the change-the-cgroup-parent approach. Instead of passing /hadoop-yarn/containerId, we would just use /hadoop-yarn and let docker create its dockerContainerId cgroups under there. ** Solves the leak by just letting docker handle it - no intermediate containerId cgroups are created, so they don't need to be deleted by NM. ** To do this, I think we'd need to change every Cgroups ResourceHandler to do something different for Docker. The main ones are for blkio and cpu. *** Don't create the containerId cgroups *** Don't modify cgroup params directly. *** Return the /hadoop-yarn/tasks path for the ADD_PID_TO_CGROUP operation so we set the cgroup parent correctly. *** Would likely need to add new PriviledgedOps for each cgroup parameter to pass them through (these are returned by ResourceHandler.preStart()). *** Add code to add each new cgroup parameter to docker run. *** Would need to support updating params via docker update command to support the ResourceHandler.updateContainer() method. *** [~billie.rinaldi], I've thought a bit more about the docker in docker case, which we thought would be a problem with this approach. I think it is solvable though - you can obtain the name of the docker cgroup from /proc/self/cgroup. I don't know if this is workable for your use-case though? Comments? Concerns? Alternatives? cc:[~jlowe], [~ebadger], [~shaneku...@gmail.com], [~billie.rinaldi], [~eyang] > Container cgroups are leaked when using docker >
[jira] [Commented] (YARN-8648) Container cgroups are leaked when using docker
[ https://issues.apache.org/jira/browse/YARN-8648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16580234#comment-16580234 ] Jim Brennan commented on YARN-8648: --- {quote}I am wondering if this approach would break the docker launching docker use case. Currently if you're launching a new docker container from an existing docker container, you can have the new container use the same cgroup as the first container (e.g. /hadoop-yarn/${CONTAINER_ID}), but if there weren't a unique cgroup parent for the container you wouldn't be able to do that. Unless there's a way to find out the docker container id from inside the container? {quote} Thanks [~billie.rinaldi]. Yes I think this use-case would break as you suggest. {quote}One potential issue with a useResourceHandlers() approach is if the NM wants to manipulate cgroup settings on a live container. Having a runtime that says it doesn't use resource handlers implies that can't be done by that runtime, but it can be supported by the docker runtime {quote} Agreed [~jlowe]. I no longer think useResourceHandlers() is a good approach I don't have a full solution in mind yet, but one question is whether we should continue using the per-container cgroup as the cgroup parent for docker. The main advantage to maintaining it is that there is a lot of code that already depends on it. All existing resource handlers just work with this setup. The disadvantage is that it makes fixing the leak harder because docker is creating hierarchies under the unused resource types (cpuset, hugetlb, etc...) and it creates them as root making it harder for the NM to remove them. If we use the top-level (hadoop-yarn) as the cgroup parent, then docker cleans everything up pretty nicely (although it still leaks the top-level hadoop-yarn cgroup for the unused-by-NM resources. But it breaks the case [~billie.rinaldi] mentioned above, and requires that we convert all existing resource handlers to use docker command options in the docker case. One thought I had is adding a dockerCleaupResourceHandler that we tack on to the end of the resourceHandlerChain - it's only job would be to clean up the extra container cgroups that docker creates. > Container cgroups are leaked when using docker > -- > > Key: YARN-8648 > URL: https://issues.apache.org/jira/browse/YARN-8648 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Jim Brennan >Assignee: Jim Brennan >Priority: Major > Labels: Docker > > When you run with docker and enable cgroups for cpu, docker creates cgroups > for all resources on the system, not just for cpu. For instance, if the > {{yarn.nodemanager.linux-container-executor.cgroups.hierarchy=/hadoop-yarn}}, > the nodemanager will create a cgroup for each container under > {{/sys/fs/cgroup/cpu/hadoop-yarn}}. In the docker case, we pass this path > via the {{--cgroup-parent}} command line argument. Docker then creates a > cgroup for the docker container under that, for instance: > {{/sys/fs/cgroup/cpu/hadoop-yarn/container_id/docker_container_id}}. > When the container exits, docker cleans up the {{docker_container_id}} > cgroup, and the nodemanager cleans up the {{container_id}} cgroup, All is > good under {{/sys/fs/cgroup/hadoop-yarn}}. > The problem is that docker also creates that same hierarchy under every > resource under {{/sys/fs/cgroup}}. On the rhel7 system I am using, these > are: blkio, cpuset, devices, freezer, hugetlb, memory, net_cls, net_prio, > perf_event, and systemd.So for instance, docker creates > {{/sys/fs/cgroup/cpuset/hadoop-yarn/container_id/docker_container_id}}, but > it only cleans up the leaf cgroup {{docker_container_id}}. Nobody cleans up > the {{container_id}} cgroups for these other resources. On one of our busy > clusters, we found > 100,000 of these leaked cgroups. > I found this in our 2.8-based version of hadoop, but I have been able to > repro with current hadoop. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8648) Container cgroups are leaked when using docker
[ https://issues.apache.org/jira/browse/YARN-8648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16580149#comment-16580149 ] Billie Rinaldi commented on YARN-8648: -- I am wondering if this approach would break the docker launching docker use case. Currently if you're launching a new docker container from an existing docker container, you can have the new container use the same cgroup as the first container (e.g. /hadoop-yarn/${CONTAINER_ID}), but if there weren't a unique cgroup parent for the container you wouldn't be able to do that. Unless there's a way to find out the docker container id from inside the container? > Container cgroups are leaked when using docker > -- > > Key: YARN-8648 > URL: https://issues.apache.org/jira/browse/YARN-8648 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Jim Brennan >Assignee: Jim Brennan >Priority: Major > Labels: Docker > > When you run with docker and enable cgroups for cpu, docker creates cgroups > for all resources on the system, not just for cpu. For instance, if the > {{yarn.nodemanager.linux-container-executor.cgroups.hierarchy=/hadoop-yarn}}, > the nodemanager will create a cgroup for each container under > {{/sys/fs/cgroup/cpu/hadoop-yarn}}. In the docker case, we pass this path > via the {{--cgroup-parent}} command line argument. Docker then creates a > cgroup for the docker container under that, for instance: > {{/sys/fs/cgroup/cpu/hadoop-yarn/container_id/docker_container_id}}. > When the container exits, docker cleans up the {{docker_container_id}} > cgroup, and the nodemanager cleans up the {{container_id}} cgroup, All is > good under {{/sys/fs/cgroup/hadoop-yarn}}. > The problem is that docker also creates that same hierarchy under every > resource under {{/sys/fs/cgroup}}. On the rhel7 system I am using, these > are: blkio, cpuset, devices, freezer, hugetlb, memory, net_cls, net_prio, > perf_event, and systemd.So for instance, docker creates > {{/sys/fs/cgroup/cpuset/hadoop-yarn/container_id/docker_container_id}}, but > it only cleans up the leaf cgroup {{docker_container_id}}. Nobody cleans up > the {{container_id}} cgroups for these other resources. On one of our busy > clusters, we found > 100,000 of these leaked cgroups. > I found this in our 2.8-based version of hadoop, but I have been able to > repro with current hadoop. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8648) Container cgroups are leaked when using docker
[ https://issues.apache.org/jira/browse/YARN-8648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16580081#comment-16580081 ] Jason Lowe commented on YARN-8648: -- bq. Is it worth breaking cgroups parameters temporarily for docker to fix the leak? No, I was under the impression we would modify the docker runtime to port the settings over for the container launch as we did internally for CPU shares. However you're correct that anything calling ResourceHandler#updateContainer won't have an effect until this is rightfully fixed, so we should probably address this all at once to avoid regressions elsewhere when fixing the leak. > Container cgroups are leaked when using docker > -- > > Key: YARN-8648 > URL: https://issues.apache.org/jira/browse/YARN-8648 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Jim Brennan >Assignee: Jim Brennan >Priority: Major > Labels: Docker > > When you run with docker and enable cgroups for cpu, docker creates cgroups > for all resources on the system, not just for cpu. For instance, if the > {{yarn.nodemanager.linux-container-executor.cgroups.hierarchy=/hadoop-yarn}}, > the nodemanager will create a cgroup for each container under > {{/sys/fs/cgroup/cpu/hadoop-yarn}}. In the docker case, we pass this path > via the {{--cgroup-parent}} command line argument. Docker then creates a > cgroup for the docker container under that, for instance: > {{/sys/fs/cgroup/cpu/hadoop-yarn/container_id/docker_container_id}}. > When the container exits, docker cleans up the {{docker_container_id}} > cgroup, and the nodemanager cleans up the {{container_id}} cgroup, All is > good under {{/sys/fs/cgroup/hadoop-yarn}}. > The problem is that docker also creates that same hierarchy under every > resource under {{/sys/fs/cgroup}}. On the rhel7 system I am using, these > are: blkio, cpuset, devices, freezer, hugetlb, memory, net_cls, net_prio, > perf_event, and systemd.So for instance, docker creates > {{/sys/fs/cgroup/cpuset/hadoop-yarn/container_id/docker_container_id}}, but > it only cleans up the leaf cgroup {{docker_container_id}}. Nobody cleans up > the {{container_id}} cgroups for these other resources. On one of our busy > clusters, we found > 100,000 of these leaked cgroups. > I found this in our 2.8-based version of hadoop, but I have been able to > repro with current hadoop. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8648) Container cgroups are leaked when using docker
[ https://issues.apache.org/jira/browse/YARN-8648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16580067#comment-16580067 ] Jim Brennan commented on YARN-8648: --- [~jlowe] thanks for the comment. {quote}We should consider breaking this up into two JIRAs if it proves difficult to hash through the design. It's a relatively small change to move the docker containers under the top-level YARN cgroup hierarchy to fixes the cgroup leaks, with the side-effect that the NM continues to create and cleanup unused cgroups per docker container launched. We could follow up that change with another JIRA to resolve the new design for the cgroup / container runtime interaction so those empty cgroups are avoided in the docker case. If we can hash it out quickly in one JIRA that's great, but I want to make sure the leak problem doesn't linger while we work through the architecture of cgroups and container runtimes. {quote} The main issue with doing this quick fix for the cgroups leak is that any cgroup parameters written by the various resource handlers will be ignored in the docker case because they will be written to the unused container cgroup. Internally, we added a cpu-shares option to docker to handle the cpu resource because that is the only one we're using, but for community I think we need to address them all. Is it worth breaking cgroups parameters temporarily for docker to fix the leak? > Container cgroups are leaked when using docker > -- > > Key: YARN-8648 > URL: https://issues.apache.org/jira/browse/YARN-8648 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Jim Brennan >Assignee: Jim Brennan >Priority: Major > Labels: Docker > > When you run with docker and enable cgroups for cpu, docker creates cgroups > for all resources on the system, not just for cpu. For instance, if the > {{yarn.nodemanager.linux-container-executor.cgroups.hierarchy=/hadoop-yarn}}, > the nodemanager will create a cgroup for each container under > {{/sys/fs/cgroup/cpu/hadoop-yarn}}. In the docker case, we pass this path > via the {{--cgroup-parent}} command line argument. Docker then creates a > cgroup for the docker container under that, for instance: > {{/sys/fs/cgroup/cpu/hadoop-yarn/container_id/docker_container_id}}. > When the container exits, docker cleans up the {{docker_container_id}} > cgroup, and the nodemanager cleans up the {{container_id}} cgroup, All is > good under {{/sys/fs/cgroup/hadoop-yarn}}. > The problem is that docker also creates that same hierarchy under every > resource under {{/sys/fs/cgroup}}. On the rhel7 system I am using, these > are: blkio, cpuset, devices, freezer, hugetlb, memory, net_cls, net_prio, > perf_event, and systemd.So for instance, docker creates > {{/sys/fs/cgroup/cpuset/hadoop-yarn/container_id/docker_container_id}}, but > it only cleans up the leaf cgroup {{docker_container_id}}. Nobody cleans up > the {{container_id}} cgroups for these other resources. On one of our busy > clusters, we found > 100,000 of these leaked cgroups. > I found this in our 2.8-based version of hadoop, but I have been able to > repro with current hadoop. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8648) Container cgroups are leaked when using docker
[ https://issues.apache.org/jira/browse/YARN-8648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16580025#comment-16580025 ] Jason Lowe commented on YARN-8648: -- +1 for the proposal to fix the cgroup leak by having docker place its cgroup at the same level YARN is creating its containers. That preserves the semantic hierarchy of the hadoop-yarn cgroup in case administrators are explicitly setting controls on the entire YARN container hierarchy so it plays nice with other systems on the node. It also preserves the cgroup sharing semantics between nodes performing a mix of container launches where some are docker and others are native. Now the hard part is figuring out the cleanest way to have the NM create cgroups in the native case and avoid creating the cgroup in the docker case. This implies the container runtime needs some say in how cgroups are handled if it isn't delegated to the container runtime completely. One potential issue with a useResourceHandlers() approach is if the NM wants to manipulate cgroup settings on a live container. Having a runtime that says it doesn't use resource handlers implies that can't be done by that runtime, but it can be supported by the docker runtime. We should consider breaking this up into two JIRAs if it proves difficult to hash through the design. It's a relatively small change to move the docker containers under the top-level YARN cgroup hierarchy to fixes the cgroup leaks, with the side-effect that the NM continues to create and cleanup unused cgroups per docker container launched. We could follow up that change with another JIRA to resolve the new design for the cgroup / container runtime interaction so those empty cgroups are avoided in the docker case. If we can hash it out quickly in one JIRA that's great, but I want to make sure the leak problem doesn't linger while we work through the architecture of cgroups and container runtimes. > Container cgroups are leaked when using docker > -- > > Key: YARN-8648 > URL: https://issues.apache.org/jira/browse/YARN-8648 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Jim Brennan >Assignee: Jim Brennan >Priority: Major > Labels: Docker > > When you run with docker and enable cgroups for cpu, docker creates cgroups > for all resources on the system, not just for cpu. For instance, if the > {{yarn.nodemanager.linux-container-executor.cgroups.hierarchy=/hadoop-yarn}}, > the nodemanager will create a cgroup for each container under > {{/sys/fs/cgroup/cpu/hadoop-yarn}}. In the docker case, we pass this path > via the {{--cgroup-parent}} command line argument. Docker then creates a > cgroup for the docker container under that, for instance: > {{/sys/fs/cgroup/cpu/hadoop-yarn/container_id/docker_container_id}}. > When the container exits, docker cleans up the {{docker_container_id}} > cgroup, and the nodemanager cleans up the {{container_id}} cgroup, All is > good under {{/sys/fs/cgroup/hadoop-yarn}}. > The problem is that docker also creates that same hierarchy under every > resource under {{/sys/fs/cgroup}}. On the rhel7 system I am using, these > are: blkio, cpuset, devices, freezer, hugetlb, memory, net_cls, net_prio, > perf_event, and systemd.So for instance, docker creates > {{/sys/fs/cgroup/cpuset/hadoop-yarn/container_id/docker_container_id}}, but > it only cleans up the leaf cgroup {{docker_container_id}}. Nobody cleans up > the {{container_id}} cgroups for these other resources. On one of our busy > clusters, we found > 100,000 of these leaked cgroups. > I found this in our 2.8-based version of hadoop, but I have been able to > repro with current hadoop. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8648) Container cgroups are leaked when using docker
[ https://issues.apache.org/jira/browse/YARN-8648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16576916#comment-16576916 ] Jim Brennan commented on YARN-8648: --- One proposal to fix the leaking cgroups is to have docker put its containers directly under the {{yarn.nodemanager.linux-container-executor.cgroups.hierarchy}} directory. For example, instead of using {{cgroup-parent=/hadoop-yarn/container_id-}}, we use {{cgroup-parent=/hadoop-yarn}}. This does cause docker to create a {{hadoop-yarn}} cgroup under each resource type, and it does not clean those up, but that is just one unused cgroup per resource type vs hundreds of thousands. This can be done by just passing an empty string to DockerLinuxContainerRuntime.addCGroupParentIfRequired(), or otherwise changing it to ignore the containerIdStr. Doing this and removing the code that cherry-picks the PID in container-executor does work, but the NM still creates the per-container cgroups as well - they're just not used. The other issue with this approach is that the cpu.shares is still updated (to reflect the requested vcores allotment) in the per-container cgroup, so it is ignored. In our code, we addressed this by passing the cpu.shares value in the docker run --cpu-shares command line argument. I'm still thinking about the best way to address this. Currently most of the resourceHandler processing happens at the linuxContainerExecutor level. But there is clearly a difference in how cgroups need to be handled for docker vs linux cases. In the docker case, we should arguably use docker command line arguments instead of directly setting up cgroups. One option would be to provide a runtime interface useResourceHandlers() which for Docker would return false. We could then disable all of the resource handling processing that happens in the container executor, and add the necessary interfaces to handle cgroup parameters to the docker runtime. Another option would be to move the resource handler processing down into the runtime. This is a bigger change, but may be cleaner. The docker runtime may still just ignore those handlers, but that detail would be hidden at the container executor level. cc:, [~ebadger] [~jlowe] [~eyang] [~shaneku...@gmail.com] [~billie.rinaldi] > Container cgroups are leaked when using docker > -- > > Key: YARN-8648 > URL: https://issues.apache.org/jira/browse/YARN-8648 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Jim Brennan >Assignee: Jim Brennan >Priority: Major > Labels: Docker > > When you run with docker and enable cgroups for cpu, docker creates cgroups > for all resources on the system, not just for cpu. For instance, if the > {{yarn.nodemanager.linux-container-executor.cgroups.hierarchy=/hadoop-yarn}}, > the nodemanager will create a cgroup for each container under > {{/sys/fs/cgroup/cpu/hadoop-yarn}}. In the docker case, we pass this path > via the {{--cgroup-parent}} command line argument. Docker then creates a > cgroup for the docker container under that, for instance: > {{/sys/fs/cgroup/cpu/hadoop-yarn/container_id/docker_container_id}}. > When the container exits, docker cleans up the {{docker_container_id}} > cgroup, and the nodemanager cleans up the {{container_id}} cgroup, All is > good under {{/sys/fs/cgroup/hadoop-yarn}}. > The problem is that docker also creates that same hierarchy under every > resource under {{/sys/fs/cgroup}}. On the rhel7 system I am using, these > are: blkio, cpuset, devices, freezer, hugetlb, memory, net_cls, net_prio, > perf_event, and systemd.So for instance, docker creates > {{/sys/fs/cgroup/cpuset/hadoop-yarn/container_id/docker_container_id}}, but > it only cleans up the leaf cgroup {{docker_container_id}}. Nobody cleans up > the {{container_id}} cgroups for these other resources. On one of our busy > clusters, we found > 100,000 of these leaked cgroups. > I found this in our 2.8-based version of hadoop, but I have been able to > repro with current hadoop. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8648) Container cgroups are leaked when using docker
[ https://issues.apache.org/jira/browse/YARN-8648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16576855#comment-16576855 ] Jim Brennan commented on YARN-8648: --- Another problem we have seen is that container-executor still has code that cherry-picks the PID of the launch shell from the docker container and writes that into the {{/sys/fs/cgroup/cpu/hadoop-yarn/container_id/tasks}} file, effectively moving it from {{/sys/fs/cgroup/cpu/hadoop-yarn/container_id/docker_container_id}} to {{/sys/fs/cgroup/cpu/hadoop-yarn/container_id}}. So you end up with one process out of the container in the {{container_id}} cgroup, and the rest in the {{container_id/docker_container_id}} cgroup. > Container cgroups are leaked when using docker > -- > > Key: YARN-8648 > URL: https://issues.apache.org/jira/browse/YARN-8648 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Jim Brennan >Assignee: Jim Brennan >Priority: Major > Labels: Docker > > When you run with docker and enable cgroups for cpu, docker creates cgroups > for all resources on the system, not just for cpu. For instance, if the > {{yarn.nodemanager.linux-container-executor.cgroups.hierarchy=/hadoop-yarn}}, > the nodemanager will create a cgroup for each container under > {{/sys/fs/cgroup/cpu/hadoop-yarn}}. In the docker case, we pass this path > via the {{--cgroup-parent}} command line argument. Docker then creates a > cgroup for the docker container under that, for instance: > {{/sys/fs/cgroup/cpu/hadoop-yarn/container_id/docker_container_id}}. > When the container exits, docker cleans up the {{docker_container_id}} > cgroup, and the nodemanager cleans up the {{container_id}} cgroup, All is > good under {{/sys/fs/cgroup/hadoop-yarn}}. > The problem is that docker also creates that same hierarchy under every > resource under {{/sys/fs/cgroup}}. On the rhel7 system I am using, these > are: blkio, cpuset, devices, freezer, hugetlb, memory, net_cls, net_prio, > perf_event, and systemd.So for instance, docker creates > {{/sys/fs/cgroup/cpuset/hadoop-yarn/container_id/docker_container_id}}, but > it only cleans up the leaf cgroup {{docker_container_id}}. Nobody cleans up > the {{container_id}} cgroups for these other resources. On one of our busy > clusters, we found > 100,000 of these leaked cgroups. > I found this in our 2.8-based version of hadoop, but I have been able to > repro with current hadoop. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org