[jira] [Commented] (YARN-1856) cgroups based memory monitoring for containers
[ https://issues.apache.org/jira/browse/YARN-1856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15196837#comment-15196837 ]

Karthik Kambatla commented on YARN-1856:

[~vinodkv] - I have filed YARN-4599 for the same.

> cgroups based memory monitoring for containers
> --
>
> Key: YARN-1856
> URL: https://issues.apache.org/jira/browse/YARN-1856
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: nodemanager
> Affects Versions: 2.3.0
> Reporter: Karthik Kambatla
> Assignee: Varun Vasudev
> Fix For: 2.9.0
>
> Attachments: YARN-1856.001.patch, YARN-1856.002.patch, YARN-1856.003.patch, YARN-1856.004.patch

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-1856) cgroups based memory monitoring for containers
[ https://issues.apache.org/jira/browse/YARN-1856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15183694#comment-15183694 ]

Vinod Kumar Vavilapalli commented on YARN-1856:

[~vvasudev] / [~kasha], can you file tickets for some of your proposals above so that we don't drop them on the floor? Tx.
[jira] [Commented] (YARN-1856) cgroups based memory monitoring for containers
[ https://issues.apache.org/jira/browse/YARN-1856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15094888#comment-15094888 ]

Vinod Kumar Vavilapalli commented on YARN-1856:

[~vvasudev] / [~kasha], seems like there are a couple of key proposals here, let's fork them off to separate tickets so they get the deserved attention.
[jira] [Commented] (YARN-1856) cgroups based memory monitoring for containers
[ https://issues.apache.org/jira/browse/YARN-1856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15081359#comment-15081359 ]

Karthik Kambatla commented on YARN-1856:

Was catching up on YARN-3 (the JIRA that added cgroups) to see why we decided to not use it for enforcing memory. [~bikassaha] has some [valid points|https://issues.apache.org/jira/browse/YARN-3?focusedCommentId=13414567&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13414567] on not letting the kernel (through cgroups) kill processes that go over their allocated limits.

To get the best of both worlds: I feel we should disable oom_control so the processes are paused but not killed. Thoughts?
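To make the "paused but not killed" proposal concrete: under cgroups v1, writing 1 to a cgroup's memory.oom_control file disables the kernel OOM killer for that cgroup, and tasks that hit the hard limit then wait until memory is freed instead of being killed. The snippet below is only a minimal sketch of that write, not code from any YARN-1856 patch. The class name OomControlSketch, the method disableOomKiller, and the idea of passing the container's cgroup directory as a Path are all assumptions made for illustration; in the NodeManager such a write would more plausibly go through CGroupsHandler.updateCGroupParam.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not part of Hadoop: shows the cgroups v1 file write only.
final class OomControlSketch {
  // cgroups v1 control file in the memory controller hierarchy.
  static final String OOM_CONTROL_FILE = "memory.oom_control";

  /**
   * Writes "1" to memory.oom_control inside the given cgroup directory,
   * which (on cgroups v1) disables the OOM killer for that cgroup.
   * Returns the path that was written.
   */
  static Path disableOomKiller(Path containerCgroupDir) throws IOException {
    Path control = containerCgroupDir.resolve(OOM_CONTROL_FILE);
    Files.write(control, "1".getBytes(StandardCharsets.UTF_8));
    return control;
  }

  public static void main(String[] args) throws IOException {
    // A temp directory stands in for a real cgroup directory (assumption),
    // so the sketch can run without root or a mounted cgroup hierarchy.
    Path fakeCgroup = Files.createTempDirectory("container_sketch");
    Path written = disableOomKiller(fakeCgroup);
    System.out.println("wrote " + written);
  }
}
```

With the OOM killer disabled, something else (for example the NodeManager) still has to decide when a stalled container should be killed or given more memory, which is the policy question raised in this thread.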
[jira] [Commented] (YARN-1856) cgroups based memory monitoring for containers
[ https://issues.apache.org/jira/browse/YARN-1856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15071322#comment-15071322 ]

Karthik Kambatla commented on YARN-1856:

bq. Ideally oom_control, swappiness would be set by the AM/YARN client and should be container specific settings.

If we don't disable oom_control, wouldn't the current implementation kill containers as soon as they spike their usage over the configured hard limit, which appears to be the container size? I feel this is too aggressive, especially considering how easily a delayed GC could cause this. No?

I see your point about an application deciding whether its containers should be paused/killed. I think the default should be paused, i.e., disabled.

bq. In general, we need an API to set container executor specific settings - we've seen a need for this when adding Docker support and now for CGroups settings as well.

Would like to understand this better. Maybe we should take this to another JIRA. I am open to discussing this offline before filing this JIRA and posting our thoughts there.
[jira] [Commented] (YARN-1856) cgroups based memory monitoring for containers
[ https://issues.apache.org/jira/browse/YARN-1856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15067584#comment-15067584 ]

Varun Vasudev commented on YARN-1856:

[~kasha] - we haven't provided a flag for using oom_control, but we did provide a control to set swappiness (which currently is a config setting). Ideally oom_control, swappiness would be set by the AM/YARN client and should be container specific settings. In general, we need an API to set container executor specific settings - we've seen a need for this when adding Docker support and now for CGroups settings as well.

If you'd like to work on it, is it possible to come up with an abstraction that'll solve the Docker issues as well? [~sidharta-s] and I can help provide context on the Docker use case.
[jira] [Commented] (YARN-1856) cgroups based memory monitoring for containers
[ https://issues.apache.org/jira/browse/YARN-1856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15065570#comment-15065570 ]

Karthik Kambatla commented on YARN-1856:

Thanks [~vvasudev] for working on this, [~sidharta-s] and [~vinodkv] for the reviews. Excited to see this land.

Just checking - is there a JIRA for using memory.oom_control? If we don't disable oom_control, using the new cgroups-based monitoring/enforcing would be a lot stricter than the proc-fs based checks and could lead to several task/job failures on existing clusters. OTOH, we might want to enable oom_control for opportunistic containers to be used in YARN-2877 and YARN-1011.

If there is no JIRA yet and you guys are caught up, I am happy to file one and work on it.
[jira] [Commented] (YARN-1856) cgroups based memory monitoring for containers
[ https://issues.apache.org/jira/browse/YARN-1856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15062744#comment-15062744 ]

Hudson commented on YARN-1856:

FAILURE: Integrated in Hadoop-trunk-Commit #8986 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/8986/])
YARN-1856. Added cgroups based memory monitoring for containers as (vinodkv: rev 4e7d32c0db69882cde854ef581056142a997c005)
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/test/java/org/apache/hadoop/yarn/conf/TestYarnConfigurationFields.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/linux/resources/CGroupsMemoryResourceHandlerImpl.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/linux/resources/CGroupsHandler.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/linux/resources/MemoryResourceHandler.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/linux/resources/TestCGroupsMemoryResourceHandlerImpl.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/linux/resources/ResourceHandlerModule.java
* hadoop-yarn-project/CHANGES.txt
[jira] [Commented] (YARN-1856) cgroups based memory monitoring for containers
[ https://issues.apache.org/jira/browse/YARN-1856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15062704#comment-15062704 ]

Vinod Kumar Vavilapalli commented on YARN-1856:

The style warnings are mostly non-fixable, what with long files and package names.

+1 for splitting out the yarn-default.xml changes.

The latest patch looks good to me, +1, checking this in.
[jira] [Commented] (YARN-1856) cgroups based memory monitoring for containers
[ https://issues.apache.org/jira/browse/YARN-1856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15050492#comment-15050492 ]

Hadoop QA commented on YARN-1856:

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 0s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 2 new or modified test files. |
| +1 | mvninstall | 9m 46s | trunk passed |
| +1 | compile | 3m 14s | trunk passed with JDK v1.8.0_66 |
| +1 | compile | 2m 49s | trunk passed with JDK v1.7.0_91 |
| +1 | checkstyle | 0m 38s | trunk passed |
| +1 | mvnsite | 1m 11s | trunk passed |
| +1 | mvneclipse | 0m 29s | trunk passed |
| +1 | findbugs | 2m 54s | trunk passed |
| +1 | javadoc | 1m 17s | trunk passed with JDK v1.8.0_66 |
| +1 | javadoc | 3m 35s | trunk passed with JDK v1.7.0_91 |
| +1 | mvninstall | 1m 4s | the patch passed |
| +1 | compile | 3m 11s | the patch passed with JDK v1.8.0_66 |
| +1 | javac | 3m 11s | the patch passed |
| +1 | compile | 2m 47s | the patch passed with JDK v1.7.0_91 |
| +1 | javac | 2m 47s | the patch passed |
| -1 | checkstyle | 0m 36s | Patch generated 11 new checkstyle issues in hadoop-yarn-project/hadoop-yarn (total was 240, now 248). |
| +1 | mvnsite | 1m 9s | the patch passed |
| +1 | mvneclipse | 0m 29s | the patch passed |
| +1 | whitespace | 0m 0s | Patch has no whitespace issues. |
| +1 | findbugs | 3m 13s | the patch passed |
| +1 | javadoc | 1m 21s | the patch passed with JDK v1.8.0_66 |
| +1 | javadoc | 3m 35s | the patch passed with JDK v1.7.0_91 |
| +1 | unit | 0m 34s | hadoop-yarn-api in the patch passed with JDK v1.8.0_66. |
| +1 | unit | 10m 26s | hadoop-yarn-server-nodemanager in the patch passed with JDK v1.8.0_66. |
| +1 | unit | 0m 33s | hadoop-yarn-api in the patch passed with JDK v1.7.0_91. |
| +1 | unit | 10m 7s | hadoop-yarn-server-nodemanager in the patch passed with JDK v1.7.0_91. |
| +1 | asflicense | 0m 30s | Patch does not generate ASF License warnings. |
| | | 67m 0s | |

|| Subsystem || Report/Notes ||
| Docker | Image:yetus/hadoop:0ca8df7 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12776756/YARN-1856.004.patch |
| JIRA Issue | YARN-1856 |
| Optional Tests |
[jira] [Commented] (YARN-1856) cgroups based memory monitoring for containers
[ https://issues.apache.org/jira/browse/YARN-1856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15047400#comment-15047400 ]

Vinod Kumar Vavilapalli commented on YARN-1856:

Quick comments on the patch:

General
- Should add all the configs to yarn-default.xml, saying they are still early configs?
- Should update the documentation of pmem-check-enabled, vmem-check-enabled configs in code and yarn-default.xml to denote their relation to resource.memory.enabled.
- Actually, given the existing memory monitoring mechanism, NM_MEMORY_RESOURCE_ENABLED is in reality already true when pmem/vmem checks are enabled. We need to reconcile the old and new configs somehow. Maybe memory is always enabled, but if vmem/pmem configs are enabled, use the old handler, otherwise use the new one? Thinking out loud.
- Do the soft and hard limits also somehow logically relate to vmem-pmem-ratio? If so, we should hint at that in the documentation.
- Swappiness seems like a cluster configuration defaulting to zero. So far, this has been an implicit contract with our users, good to document this also in yarn-default.xml.

Code comments
- ResourceHandlerModule
-- Formatting of new code is a little off: the declaration of {{getCgroupsMemoryResourceHandler()}}. There are other occurrences like this in that class before in this patch, you may want to fix those.
-- BUG! getCgroupsMemoryResourceHandler() incorrectly locks DiskResourceHandler instead of MemoryResourceHandler.
- CGroupsMemoryResourceHandlerImpl
-- What is this doing? {{ CGroupsHandler.CGroupController MEMORY = CGroupsHandler.CGroupController.MEMORY; }} Is it forcing a class-load or something? Not sure if this is needed. If this is needed, you may want to add a comment here.
- NM_MEMORY_RESOURCE_CGROUPS_SOFT_LIMIT_PERC -> NM_MEMORY_RESOURCE_CGROUPS_SOFT_LIMIT_PERCENTAGE. Similarly the default constant.
- CGROUP_PARAM_MEMORY_HARD_LIMIT_BYTES / CGROUP_PARAM_MEMORY_SOFT_LIMIT_BYTES / CGROUP_PARAM_MEMORY_SWAPPINESS can all be static and final.
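The locking bug flagged above is easiest to see in a stripped-down form. The sketch below assumes the module lazily creates one handler per resource type using double-checked locking on a class object; that structure is an assumption for illustration, and the names HandlerModuleSketch, SketchMemoryHandler, and getMemoryResourceHandler are hypothetical stand-ins rather than the actual ResourceHandlerModule code. The point is only that the monitor guarding the memory handler should be the memory handler's own class, not the disk handler's.

```java
// Hypothetical stand-in for the handler module; not the Hadoop implementation.
final class HandlerModuleSketch {
  interface MemoryResourceHandler {}
  interface DiskResourceHandler {}

  static final class SketchMemoryHandler implements MemoryResourceHandler {}

  // volatile is required for double-checked locking to be safe.
  private static volatile MemoryResourceHandler memoryHandler;

  /** Returns the shared memory handler, or null when the feature is disabled. */
  static MemoryResourceHandler getMemoryResourceHandler(boolean enabled) {
    if (!enabled) {
      return null;
    }
    if (memoryHandler == null) {
      // Lock on MemoryResourceHandler.class. Locking DiskResourceHandler.class
      // here would still serialize callers, but it would contend with the
      // unrelated disk-handler initialization and mislead future readers.
      synchronized (MemoryResourceHandler.class) {
        if (memoryHandler == null) {
          memoryHandler = new SketchMemoryHandler();
        }
      }
    }
    return memoryHandler;
  }

  public static void main(String[] args) {
    MemoryResourceHandler first = getMemoryResourceHandler(true);
    MemoryResourceHandler second = getMemoryResourceHandler(true);
    System.out.println("same instance: " + (first == second));
  }
}
```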
[jira] [Commented] (YARN-1856) cgroups based memory monitoring for containers
[ https://issues.apache.org/jira/browse/YARN-1856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15047156#comment-15047156 ]

Sidharta Seethana commented on YARN-1856:

Ugh. IDE snafu - somehow ended up looking at an older version of the patch. +1 on the latest version of the patch.
[jira] [Commented] (YARN-1856) cgroups based memory monitoring for containers
[ https://issues.apache.org/jira/browse/YARN-1856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15047082#comment-15047082 ]

Sidharta Seethana commented on YARN-1856:

[~vvasudev], there is still an issue with the handling of the soft limit percentage. Isn't there a divide by 100 missing?

{code}
long softLimit =
    (long) (container.getResource().getMemory() * softLimitPerc);
{code}

The test code below needs to be updated too - instead of specifying the value of the soft limit percentage here in test code, maybe we should use DEFAULT_NM_MEMORY_RESOURCE_CGROUPS_SOFT_LIMIT_PERC? It also looks like the validation of the memory value is not happening correctly below. You could use Mockito's {{eq()}} to verify argument values.

{code}
verify(mockCGroupsHandler, times(1))
    .updateCGroupParam(CGroupsHandler.CGroupController.MEMORY, id,
        CGroupsHandler.CGROUP_PARAM_MEMORY_SOFT_LIMIT_BYTES,
        String.valueOf((int) (memory * 0.9)) + "M");
{code}
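The percentage-versus-fraction question raised in this review can be illustrated with a small standalone calculation. This is a sketch under the assumption that the setting is a true percentage in the range 0 to 100 and that container memory is expressed in MB; SoftLimitSketch and softLimitMb are hypothetical names, and IllegalArgumentException stands in for the ResourceHandlerException used in the patch.

```java
// Hypothetical helper illustrating the "divide by 100" step; not Hadoop code.
final class SoftLimitSketch {
  /**
   * Computes the cgroup soft limit in MB from the container size and a
   * percentage in [0, 100]. Without the division by 100, a value of 90
   * would yield a soft limit 90x the container size.
   */
  static long softLimitMb(long containerMemoryMb, float softLimitPercentage) {
    if (softLimitPercentage < 0.0f || softLimitPercentage > 100.0f) {
      throw new IllegalArgumentException(
          "Illegal value '" + softLimitPercentage
              + "'. Value must be between 0 and 100.");
    }
    return (long) (containerMemoryMb * softLimitPercentage / 100.0f);
  }

  public static void main(String[] args) {
    // 90% of a 1000 MB container.
    System.out.println(softLimitMb(1000, 90.0f) + "M");
  }
}
```

If the setting were instead kept as a fraction (as the 0.9f default suggests), the validation range would need to be 0 to 1 and the division would be dropped; the two conventions should not be mixed.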
[jira] [Commented] (YARN-1856) cgroups based memory monitoring for containers
[ https://issues.apache.org/jira/browse/YARN-1856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15046559#comment-15046559 ]

Varun Vasudev commented on YARN-1856:

The license warnings aren't from the patch. They're coming from resourcemanager test files.
[jira] [Commented] (YARN-1856) cgroups based memory monitoring for containers
[ https://issues.apache.org/jira/browse/YARN-1856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15046551#comment-15046551 ]

Hadoop QA commented on YARN-1856:

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 0s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 2 new or modified test files. |
| +1 | mvninstall | 8m 50s | trunk passed |
| +1 | compile | 2m 21s | trunk passed with JDK v1.8.0_66 |
| +1 | compile | 2m 27s | trunk passed with JDK v1.7.0_85 |
| +1 | checkstyle | 0m 32s | trunk passed |
| +1 | mvnsite | 1m 4s | trunk passed |
| +1 | mvneclipse | 0m 28s | trunk passed |
| +1 | findbugs | 2m 35s | trunk passed |
| +1 | javadoc | 1m 6s | trunk passed with JDK v1.8.0_66 |
| +1 | javadoc | 3m 23s | trunk passed with JDK v1.7.0_85 |
| +1 | mvninstall | 1m 0s | the patch passed |
| +1 | compile | 2m 22s | the patch passed with JDK v1.8.0_66 |
| +1 | javac | 2m 22s | the patch passed |
| +1 | compile | 2m 29s | the patch passed with JDK v1.7.0_85 |
| +1 | javac | 2m 29s | the patch passed |
| -1 | checkstyle | 0m 31s | Patch generated 8 new checkstyle issues in hadoop-yarn-project/hadoop-yarn (total was 240, now 247). |
| +1 | mvnsite | 1m 4s | the patch passed |
| +1 | mvneclipse | 0m 27s | the patch passed |
| +1 | whitespace | 0m 0s | Patch has no whitespace issues. |
| +1 | findbugs | 2m 55s | the patch passed |
| +1 | javadoc | 1m 7s | the patch passed with JDK v1.8.0_66 |
| +1 | javadoc | 3m 19s | the patch passed with JDK v1.7.0_85 |
| +1 | unit | 0m 28s | hadoop-yarn-api in the patch passed with JDK v1.8.0_66. |
| +1 | unit | 9m 14s | hadoop-yarn-server-nodemanager in the patch passed with JDK v1.8.0_66. |
| +1 | unit | 0m 28s | hadoop-yarn-api in the patch passed with JDK v1.7.0_85. |
| +1 | unit | 9m 31s | hadoop-yarn-server-nodemanager in the patch passed with JDK v1.7.0_85. |
| -1 | asflicense | 0m 21s | Patch generated 3 ASF License warnings. |
| | | 59m 27s | |

|| Subsystem || Report/Notes ||
| Docker | Image:yetus/hadoop:0ca8df7 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12776258/YARN-1856.003.patch |
| JIRA Issue | YARN-1856 |
| Optional Tests | asflicense compi
[jira] [Commented] (YARN-1856) cgroups based memory monitoring for containers
[ https://issues.apache.org/jira/browse/YARN-1856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15045966#comment-15045966 ]

Sidharta Seethana commented on YARN-1856:

Hi [~vvasudev], some comments below on the latest version of the patch:

{code}
String CGROUP_PARAM_HARD_LIMIT = "limit_in_bytes";
String CGROUP_PARAM_SOFT_LIMIT = "soft_limit_in_bytes";
String CGROUP_PARAM_SWAPPINESS = "swappiness";
{code}

The constants should have 'MEMORY' in their names. For example, CGROUP_PARAM_HARD_LIMIT is better named as CGROUP_PARAM_MEMORY_HARD_LIMIT in order to avoid future collisions. This is similar to how BLKIO is used in the previous line (classid should be fixed at some point too).

{code}
@Private
public static final String NM_MEMORY_RESOURCE_CGROUPS_SOFT_LIMIT_PERC =
    NM_MEMORY_RESOURCE_PREFIX + "cgroups.soft-limit-percentage";
@Private
public static final float DEFAULT_NM_MEMORY_RESOURCE_CGROUPS_SOFT_LIMIT_PERC =
    0.9f;
{code}

{code}
softLimitPerc = conf.getFloat(
    YarnConfiguration.NM_MEMORY_RESOURCE_CGROUPS_SOFT_LIMIT_PERC,
    YarnConfiguration.DEFAULT_NM_MEMORY_RESOURCE_CGROUPS_SOFT_LIMIT_PERC);
if (softLimitPerc < 0.0f || softLimitPerc > 100.0f) {
  throw new ResourceHandlerException(
      "Illegal value '" + softLimitPerc + "' "
          + YarnConfiguration.NM_MEMORY_RESOURCE_CGROUPS_SOFT_LIMIT_PERC
          + ". Value must be between 0 and 100.");
}
{code}

Is the soft limit conf setting meant to represent a percentage or is it a fraction between 0 and 1? From the default value of 0.9f and the application of the soft limit it appears to be a fraction, but the name of the setting and its validation check seem to indicate that it is meant to be a percentage. This needs to be fixed.
[jira] [Commented] (YARN-1856) cgroups based memory monitoring for containers
[ https://issues.apache.org/jira/browse/YARN-1856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15045515#comment-15045515 ]

Hadoop QA commented on YARN-1856:

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 0s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 2 new or modified test files. |
| +1 | mvninstall | 8m 21s | trunk passed |
| +1 | compile | 2m 7s | trunk passed with JDK v1.8.0_66 |
| +1 | compile | 2m 19s | trunk passed with JDK v1.7.0_85 |
| +1 | checkstyle | 0m 28s | trunk passed |
| +1 | mvnsite | 1m 0s | trunk passed |
| +1 | mvneclipse | 0m 27s | trunk passed |
| +1 | findbugs | 2m 27s | trunk passed |
| +1 | javadoc | 1m 2s | trunk passed with JDK v1.8.0_66 |
| +1 | javadoc | 3m 15s | trunk passed with JDK v1.7.0_85 |
| +1 | mvninstall | 0m 56s | the patch passed |
| +1 | compile | 2m 3s | the patch passed with JDK v1.8.0_66 |
| +1 | javac | 2m 3s | the patch passed |
| +1 | compile | 2m 19s | the patch passed with JDK v1.7.0_85 |
| +1 | javac | 2m 19s | the patch passed |
| -1 | checkstyle | 0m 29s | Patch generated 7 new checkstyle issues in hadoop-yarn-project/hadoop-yarn (total was 240, now 246). |
| +1 | mvnsite | 1m 1s | the patch passed |
| +1 | mvneclipse | 0m 25s | the patch passed |
| +1 | whitespace | 0m 0s | Patch has no whitespace issues. |
| +1 | findbugs | 2m 42s | the patch passed |
| +1 | javadoc | 0m 58s | the patch passed with JDK v1.8.0_66 |
| +1 | javadoc | 3m 18s | the patch passed with JDK v1.7.0_85 |
| +1 | unit | 0m 25s | hadoop-yarn-api in the patch passed with JDK v1.8.0_66. |
| +1 | unit | 9m 6s | hadoop-yarn-server-nodemanager in the patch passed with JDK v1.8.0_66. |
| +1 | unit | 0m 26s | hadoop-yarn-api in the patch passed with JDK v1.7.0_85. |
| +1 | unit | 9m 35s | hadoop-yarn-server-nodemanager in the patch passed with JDK v1.7.0_85. |
| +1 | asflicense | 0m 23s | Patch does not generate ASF License warnings. |
| | | 56m 58s | |

|| Subsystem || Report/Notes ||
| Docker | Image:yetus/hadoop:0ca8df7 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12776140/YARN-1856.002.patch |
| JIRA Issue | YARN-1856 |
| Optional Tests | asfl
[jira] [Commented] (YARN-1856) cgroups based memory monitoring for containers
[ https://issues.apache.org/jira/browse/YARN-1856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15042505#comment-15042505 ] Sidharta Seethana commented on YARN-1856: -
Hi [~vvasudev], Thanks for the patch. Please see below for some comments.
{code}
@VisibleForTesting
static final String HARD_LIMIT = "limit_in_bytes";
@VisibleForTesting
static final String SOFT_LIMIT = "soft_limit_in_bytes";
@VisibleForTesting
static final String SWAPPINESS = "swappiness";
{code}
Could you move these constants to CGroupsHandler? There are already some cgroups parameter constants defined there.
{code}
if (configuration.getBoolean(YarnConfiguration.NM_PMEM_CHECK_ENABLED,
    YarnConfiguration.DEFAULT_NM_PMEM_CHECK_ENABLED)) {
  LOG.warn("You have enabled the default YARN physical memory health"
      + " checker as well as the CGroups memory controller. This could"
      + " lead to unpredictable behaviour");
}
{code}
IMO, if the behavior here is unpredictable, we should simply error out here in case both are enabled.
{code}
long softLimit = (long) (container.getResource().getMemory() * 0.90f);
{code}
We should make the fraction configurable, I think. What are the implications of the soft limit?
{code}
public static final String NM_MEMORY_RESOURCE_PREFIX = NM_PREFIX + "resource.memory.";
public static final String NM_MEMORY_RESOURCE_ENABLED = NM_MEMORY_RESOURCE_PREFIX + "enabled";
public static final boolean DEFAULT_NM_MEMORY_RESOURCE_ENABLED = false;
public static final String NM_MEMORY_RESOURCE_CGROUPS_SWAPPINESS = NM_MEMORY_RESOURCE_PREFIX + "cgroups.swappiness";
public static final int DEFAULT_NM_MEMORY_RESOURCE_CGROUPS_SWAPPINESS = 0;
{code}
Since we are skipping changes to yarn-default.xml (based on changes I see in TestYarnConfigurationFields), these should be marked @Private, similar to how network/disk config settings are annotated?
Thinking aloud here: should we add support in some form for {{memory.oom_control}} and notifications/stats?
> cgroups based memory monitoring for containers > -- > > Key: YARN-1856 > URL: https://issues.apache.org/jira/browse/YARN-1856 > Project: Hadoop YARN > Issue Type: Sub-task > Components: nodemanager >Affects Versions: 2.3.0 >Reporter: Karthik Kambatla >Assignee: Varun Vasudev > Attachments: YARN-1856.001.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
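The two suggestions in the review above (a configurable soft-limit fraction, and failing fast when both the physical memory check and the cgroups memory controller are enabled) could look roughly like the sketch below. It is not taken from any YARN-1856 patch: the class name, method names, and the integer-percentage parameter are hypothetical, and it avoids the Hadoop API so that it stands alone.

```java
/** Sketch only: all names here are hypothetical, not from the YARN-1856 patch. */
final class SoftLimitSketch {

  /**
   * Returns the soft limit, in the same unit as containerMemory, as a
   * configurable percentage of the container's allocation.
   */
  static long softLimit(long containerMemory, int softLimitPercentage) {
    if (softLimitPercentage <= 0 || softLimitPercentage > 100) {
      throw new IllegalArgumentException(
          "soft limit percentage must be in (0, 100], got " + softLimitPercentage);
    }
    // Integer arithmetic avoids float rounding surprises; the result truncates.
    return containerMemory * softLimitPercentage / 100;
  }

  /** Fails fast instead of logging a warning when both mechanisms are on. */
  static void checkNotBothEnabled(boolean pmemCheckEnabled,
      boolean cgroupsMemoryEnabled) {
    if (pmemCheckEnabled && cgroupsMemoryEnabled) {
      throw new IllegalStateException(
          "The physical memory check and the cgroups memory controller"
              + " must not both be enabled");
    }
  }

  public static void main(String[] args) {
    checkNotBothEnabled(false, true);          // passes: only cgroups enabled
    System.out.println(softLimit(1024, 90));   // 90% of a 1024 MB container
  }
}
```

An integer percentage is used here instead of the `0.90f` factor in the quoted patch only to keep the arithmetic exact; either representation could back a configurable setting.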
[jira] [Commented] (YARN-1856) cgroups based memory monitoring for containers
[ https://issues.apache.org/jira/browse/YARN-1856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15031677#comment-15031677 ] Hadoop QA commented on YARN-1856: -
| (x) *{color:red}-1 overall{color}* |
\\ \\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s {color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 2 new or modified test files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 9m 16s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 30s {color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 35s {color} | {color:green} trunk passed with JDK v1.7.0_85 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 32s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 8s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 29s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 44s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 8s {color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 3m 23s {color} | {color:green} trunk passed with JDK v1.7.0_85 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 2s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 31s {color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 2m 31s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 35s {color} | {color:green} the patch passed with JDK v1.7.0_85 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 2m 35s {color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 33s {color} | {color:red} Patch generated 8 new checkstyle issues in hadoop-yarn-project/hadoop-yarn (total was 238, now 245). {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 7s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 29s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 2s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 9s {color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 3m 33s {color} | {color:green} the patch passed with JDK v1.7.0_85 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 29s {color} | {color:green} hadoop-yarn-api in the patch passed with JDK v1.8.0_66. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 9m 36s {color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed with JDK v1.8.0_66. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 28s {color} | {color:green} hadoop-yarn-api in the patch passed with JDK v1.7.0_85. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 9m 46s {color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed with JDK v1.7.0_85. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 26s {color} | {color:green} Patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 62m 1s {color} | {color:black} {color} |
\\ \\
|| Subsystem || Report/Notes ||
| Docker | Image:yetus/hadoop:0ca8df7 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12774801/YARN-1856.001.patch |
| JIRA Issue | YARN-1856 |
| Optional Tests | asfl
[jira] [Commented] (YARN-1856) cgroups based memory monitoring for containers
[ https://issues.apache.org/jira/browse/YARN-1856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15031351#comment-15031351 ] Varun Vasudev commented on YARN-1856: - Moved it out of 2.6.3 and 2.7.3. > cgroups based memory monitoring for containers > -- > > Key: YARN-1856 > URL: https://issues.apache.org/jira/browse/YARN-1856 > Project: Hadoop YARN > Issue Type: Sub-task > Components: nodemanager >Affects Versions: 2.3.0 >Reporter: Karthik Kambatla >Assignee: Varun Vasudev > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-1856) cgroups based memory monitoring for containers
[ https://issues.apache.org/jira/browse/YARN-1856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15030098#comment-15030098 ] Junping Du commented on YARN-1856: -- Hi, can we move this out of 2.6.3? Thanks! > cgroups based memory monitoring for containers > -- > > Key: YARN-1856 > URL: https://issues.apache.org/jira/browse/YARN-1856 > Project: Hadoop YARN > Issue Type: Sub-task > Components: nodemanager >Affects Versions: 2.3.0 >Reporter: Karthik Kambatla >Assignee: Varun Vasudev > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-1856) cgroups based memory monitoring for containers
[ https://issues.apache.org/jira/browse/YARN-1856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14692538#comment-14692538 ] Sangjin Lee commented on YARN-1856: --- Should this be targeted to 2.6.2? We're trying to release 2.6.1 soon. Let me know. > cgroups based memory monitoring for containers > -- > > Key: YARN-1856 > URL: https://issues.apache.org/jira/browse/YARN-1856 > Project: Hadoop YARN > Issue Type: Sub-task > Components: nodemanager >Affects Versions: 2.3.0 >Reporter: Karthik Kambatla >Assignee: Varun Vasudev > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-1856) cgroups based memory monitoring for containers
[ https://issues.apache.org/jira/browse/YARN-1856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14637645#comment-14637645 ] Varun Vasudev commented on YARN-1856: - Assigning to myself to take a crack at it. > cgroups based memory monitoring for containers > -- > > Key: YARN-1856 > URL: https://issues.apache.org/jira/browse/YARN-1856 > Project: Hadoop YARN > Issue Type: Sub-task > Components: nodemanager >Affects Versions: 2.3.0 >Reporter: Karthik Kambatla >Assignee: Varun Vasudev > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-1856) cgroups based memory monitoring for containers
[ https://issues.apache.org/jira/browse/YARN-1856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14259695#comment-14259695 ] Karthik Kambatla commented on YARN-1856: I haven't had a chance to work on this further. [~beckham007] - how did your testing go? Please feel free to take this JIRA over if you want to contribute what you guys have done. > cgroups based memory monitoring for containers > -- > > Key: YARN-1856 > URL: https://issues.apache.org/jira/browse/YARN-1856 > Project: Hadoop YARN > Issue Type: Sub-task > Components: nodemanager >Affects Versions: 2.3.0 >Reporter: Karthik Kambatla >Assignee: Karthik Kambatla > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-1856) cgroups based memory monitoring for containers
[ https://issues.apache.org/jira/browse/YARN-1856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14101971#comment-14101971 ] Beckham007 commented on YARN-1856: -- We have been working on this for a few days. We will validate it in our production environment, which has 4000 nodes. We set memory.limit_in_bytes for /cgroup/memory/hadoop-yarn and set memory.soft_limit_in_bytes for each container. Also, we use cgroup.event_control to handle OOM events. Mesos uses a similar policy for memory isolation. > cgroups based memory monitoring for containers > -- > > Key: YARN-1856 > URL: https://issues.apache.org/jira/browse/YARN-1856 > Project: Hadoop YARN > Issue Type: Sub-task > Components: nodemanager >Affects Versions: 2.3.0 >Reporter: Karthik Kambatla >Assignee: Karthik Kambatla > -- This message was sent by Atlassian JIRA (v6.2#6252)
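As a rough illustration of the layout described in the comment above (one hard limit on the shared hadoop-yarn cgroup, one soft limit per container), here is a minimal sketch. It is an assumption-laden example, not the commenter's implementation: the class and method names are hypothetical, the per-container directory name is assumed to be the container ID, and OOM handling through cgroup.event_control is left out. Only the two file names, `memory.limit_in_bytes` and `memory.soft_limit_in_bytes`, come from the cgroups v1 memory controller.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

/** Sketch only: hypothetical helper, not code from YARN or from the comment. */
final class CgroupMemoryLayoutSketch {
  static final String HARD_LIMIT = "memory.limit_in_bytes";
  static final String SOFT_LIMIT = "memory.soft_limit_in_bytes";

  /** Writes a numeric value into a parameter file of the given cgroup dir. */
  static void writeParam(Path cgroupDir, String param, long value) {
    try {
      Files.createDirectories(cgroupDir);
      Files.write(cgroupDir.resolve(param),
          Long.toString(value).getBytes(StandardCharsets.UTF_8));
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  /** Reads a parameter file back as a trimmed string. */
  static String readParam(Path cgroupDir, String param) {
    try {
      byte[] raw = Files.readAllBytes(cgroupDir.resolve(param));
      return new String(raw, StandardCharsets.UTF_8).trim();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  /** Hard limit on the shared YARN cgroup, soft limit on one container. */
  static void applyLayout(Path yarnRoot, long nodeHardLimitBytes,
      String containerId, long containerSoftLimitBytes) {
    writeParam(yarnRoot, HARD_LIMIT, nodeHardLimitBytes);
    writeParam(yarnRoot.resolve(containerId), SOFT_LIMIT,
        containerSoftLimitBytes);
  }

  public static void main(String[] args) {
    // Uses a scratch directory so the sketch can run without cgroup access.
    Path root = Paths.get(System.getProperty("java.io.tmpdir"),
        "yarn-cgroup-sketch");
    applyLayout(root, 8L << 30, "container_01", 1L << 30);
    System.out.println(readParam(root, HARD_LIMIT));
  }
}
```

Against a real cgroups v1 mount, `yarnRoot` would be the path named in the comment (/cgroup/memory/hadoop-yarn) and the process would need permission to write there; the scratch directory only demonstrates the file layout.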
[jira] [Commented] (YARN-1856) cgroups based memory monitoring for containers
[ https://issues.apache.org/jira/browse/YARN-1856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13941011#comment-13941011 ] Karthik Kambatla commented on YARN-1856: Nope. YARN-3, IIUC, is just for CPU. Also, we don't want to enforce memory through cgroups - this is just for monitoring. > cgroups based memory monitoring for containers > -- > > Key: YARN-1856 > URL: https://issues.apache.org/jira/browse/YARN-1856 > Project: Hadoop YARN > Issue Type: Sub-task > Components: nodemanager >Affects Versions: 2.3.0 >Reporter: Karthik Kambatla >Assignee: Karthik Kambatla > -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1856) cgroups based memory monitoring for containers
[ https://issues.apache.org/jira/browse/YARN-1856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13941018#comment-13941018 ] Vinod Kumar Vavilapalli commented on YARN-1856: --- When we use cgroups, we don't need (or want) explicit monitoring. Cgroups are going to constrain memory usage of the process (and the tree) if the right values are set when creating the group. There were some discussions about this on YARN-3 and related JIRAs. In essence, the ContainersMonitor is really a monitor to be used only when such an OS feature is not available to properly constrain memory usage. > cgroups based memory monitoring for containers > -- > > Key: YARN-1856 > URL: https://issues.apache.org/jira/browse/YARN-1856 > Project: Hadoop YARN > Issue Type: Sub-task > Components: nodemanager >Affects Versions: 2.3.0 >Reporter: Karthik Kambatla >Assignee: Karthik Kambatla > -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1856) cgroups based memory monitoring for containers
[ https://issues.apache.org/jira/browse/YARN-1856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13941024#comment-13941024 ] Karthik Kambatla commented on YARN-1856: bq. When we use cgroups, we don't need (and want) explicit monitoring. If we set the limits much higher than what we want to enforce, we can use them for monitoring instead. The goal, again, is not to enforce. > cgroups based memory monitoring for containers > -- > > Key: YARN-1856 > URL: https://issues.apache.org/jira/browse/YARN-1856 > Project: Hadoop YARN > Issue Type: Sub-task > Components: nodemanager >Affects Versions: 2.3.0 >Reporter: Karthik Kambatla >Assignee: Karthik Kambatla > -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1856) cgroups based memory monitoring for containers
[ https://issues.apache.org/jira/browse/YARN-1856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13941023#comment-13941023 ] Karthik Kambatla commented on YARN-1856: As discussed on YARN-3, using cgroups for memory isolation/enforcement can be problematic as it enforces an upper-bound on the amount of memory tasks can consume and hence doesn't tolerate any momentary spikes. Using it for monitoring, however, would help address YARN-1747. I haven't yet looked at the cgroups-related source closely enough. Can post an update once I do that. > cgroups based memory monitoring for containers > -- > > Key: YARN-1856 > URL: https://issues.apache.org/jira/browse/YARN-1856 > Project: Hadoop YARN > Issue Type: Sub-task > Components: nodemanager >Affects Versions: 2.3.0 >Reporter: Karthik Kambatla >Assignee: Karthik Kambatla > -- This message was sent by Atlassian JIRA (v6.2#6252)
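To make the distinction in the comment above concrete: a hard cgroup limit acts on the first excess byte, whereas a monitor can choose to tolerate a momentary spike. The sketch below shows one possible way to do that, by flagging a container only after several consecutive usage samples exceed the target. The class, its parameters, and the consecutive-sample policy are all hypothetical and are not described anywhere in this thread; in practice the samples might come from a cgroup file such as `memory.usage_in_bytes`.

```java
/** Sketch only: a hypothetical spike-tolerant monitor, not YARN code. */
final class SpikeTolerantMonitorSketch {
  private final long targetBytes;
  private final int maxConsecutiveOver;
  private int consecutiveOver = 0;

  SpikeTolerantMonitorSketch(long targetBytes, int maxConsecutiveOver) {
    if (maxConsecutiveOver < 1) {
      throw new IllegalArgumentException("maxConsecutiveOver must be >= 1");
    }
    this.targetBytes = targetBytes;
    this.maxConsecutiveOver = maxConsecutiveOver;
  }

  /**
   * Records one usage sample. Returns true only when usage has stayed above
   * the target for maxConsecutiveOver samples in a row; any sample at or
   * below the target resets the count, so short spikes are tolerated.
   */
  boolean record(long usageBytes) {
    if (usageBytes > targetBytes) {
      consecutiveOver++;
    } else {
      consecutiveOver = 0;
    }
    return consecutiveOver >= maxConsecutiveOver;
  }

  public static void main(String[] args) {
    SpikeTolerantMonitorSketch monitor = new SpikeTolerantMonitorSketch(100, 3);
    long[] samples = {150, 90, 120, 130, 140};
    for (long sample : samples) {
      System.out.println(sample + " -> flagged=" + monitor.record(sample));
    }
  }
}
```

With a target of 100 and a threshold of 3, the single spike to 150 is forgiven because the next sample drops to 90; only the sustained run 120, 130, 140 causes the container to be flagged.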
[jira] [Commented] (YARN-1856) cgroups based memory monitoring for containers
[ https://issues.apache.org/jira/browse/YARN-1856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13940997#comment-13940997 ] Vinod Kumar Vavilapalli commented on YARN-1856: --- Duplicate of YARN-3? > cgroups based memory monitoring for containers > -- > > Key: YARN-1856 > URL: https://issues.apache.org/jira/browse/YARN-1856 > Project: Hadoop YARN > Issue Type: Sub-task > Components: nodemanager >Affects Versions: 2.3.0 >Reporter: Karthik Kambatla >Assignee: Karthik Kambatla > -- This message was sent by Atlassian JIRA (v6.2#6252)