[jira] [Commented] (YARN-3920) FairScheduler Reserving a node for a container should be configurable to allow it used only for large containers
[ https://issues.apache.org/jira/browse/YARN-3920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14658674#comment-14658674 ]

Matthew Jacobs commented on YARN-3920:
--------------------------------------

[~adhoot]
{quote}
The problem is if you get that too high (such that it exceeds maximum resource allocation) one can accidentally disable reservation.
{quote}
That might be desired in some circumstances, no?

FairScheduler Reserving a node for a container should be configurable to allow it used only for large containers
--------------------------------------

Key: YARN-3920
URL: https://issues.apache.org/jira/browse/YARN-3920
Project: Hadoop YARN
Issue Type: Improvement
Components: fairscheduler
Reporter: Anubhav Dhoot
Assignee: Anubhav Dhoot
Attachments: yARN-3920.001.patch, yARN-3920.002.patch

Reserving a node for a container was designed for preventing large containers from starvation from small requests that keep getting into a node. Today we let this be used even for a small container request. This has a huge impact on scheduling since we block other scheduling requests until that reservation is fulfilled. We should make this configurable so its impact can be minimized by limiting it for large container requests as originally intended.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
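
The quoted concern can be illustrated with a small sketch. It is not taken from the attached patches: the function names, the "at or above the threshold" rule, and the megabyte units are all assumptions chosen for illustration. It only shows why a threshold above the maximum resource allocation would leave no request eligible for a reservation.

```python
def should_reserve(request_mb: int, threshold_mb: int) -> bool:
    """Hypothetical rule: reserve a node only for requests at or above the threshold."""
    return request_mb >= threshold_mb


def reservation_effectively_disabled(threshold_mb: int, max_allocation_mb: int) -> bool:
    """No valid request is larger than the maximum allocation, so a threshold
    above that maximum means no request can ever qualify for a reservation."""
    return threshold_mb > max_allocation_mb


if __name__ == "__main__":
    max_allocation_mb = 8192   # assumed cluster maximum, for illustration only
    threshold_mb = 16384       # assumed threshold, set above the maximum

    # Every request size the scheduler could accept fails the reservation test.
    eligible = [
        size for size in range(1024, max_allocation_mb + 1, 1024)
        if should_reserve(size, threshold_mb)
    ]
    print("requests eligible for reservation:", eligible)
    print("reservation disabled:",
          reservation_effectively_disabled(threshold_mb, max_allocation_mb))
```

Whether that outcome is a misconfiguration or a deliberate way to switch reservation off is exactly the question raised in the comment.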
[jira] [Commented] (YARN-2194) Cgroups cease to work in RHEL7
[ https://issues.apache.org/jira/browse/YARN-2194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14569778#comment-14569778 ]

Matthew Jacobs commented on YARN-2194:
--------------------------------------

I'm confused, does this mean that you'll re-mount the cpu and cpuacct controllers? Do we know that other components in the RHEL7 world don't expect them to be in the default place?

Cgroups cease to work in RHEL7
------------------------------

Key: YARN-2194
URL: https://issues.apache.org/jira/browse/YARN-2194
Project: Hadoop YARN
Issue Type: Bug
Components: nodemanager
Affects Versions: 2.7.0
Reporter: Wei Yan
Assignee: Wei Yan
Priority: Critical
Attachments: YARN-2194-1.patch, YARN-2194-2.patch, YARN-2194-3.patch

In RHEL7, the CPU controller is named cpu,cpuacct. The comma in the controller name leads to container launch failure.
RHEL7 deprecates libcgroup and recommends the use of systemd. However, systemd has certain shortcomings as identified in this JIRA (see comments). This JIRA only fixes the failure, and doesn't try to use systemd.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
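
One plausible way a comma in the controller name could cause trouble is when cgroup paths are passed around as a comma-separated list. The sketch below is an illustration under that assumption, not the actual container-executor code: the helper names, the example path, and the alternative separator are all hypothetical.

```python
from typing import List


def join_paths(paths: List[str], sep: str = ",") -> str:
    """Encode a list of paths as one argument string (hypothetical helper)."""
    return sep.join(paths)


def split_paths(arg: str, sep: str = ",") -> List[str]:
    """Decode the argument string back into a list of paths (hypothetical helper)."""
    return arg.split(sep)


if __name__ == "__main__":
    # Illustrative path only: the mount directory carries the RHEL7-style
    # controller name "cpu,cpuacct", which itself contains a comma.
    paths = ["/sys/fs/cgroup/cpu,cpuacct/hadoop-yarn/container_1/tasks"]

    # With "," as the separator, the single path is split into two bogus entries.
    print(split_paths(join_paths(paths)))

    # With a separator that does not occur in the path, the round trip is lossless.
    print(split_paths(join_paths(paths, sep="%"), sep="%"))
```

Changing how the list is encoded avoids touching the mounts at all, which sidesteps the question of whether other RHEL7 components expect the controllers in their default place.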
[jira] [Commented] (YARN-2194) Cgroups cease to work in RHEL7
[ https://issues.apache.org/jira/browse/YARN-2194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14570040#comment-14570040 ]

Matthew Jacobs commented on YARN-2194:
--------------------------------------

Thanks, [sidharta-s]. So the change would be in how the container-executor accepts lists of paths, not attempting to re-mount the controllers, right? If I understand it correctly, that sounds like a good plan to me.
[jira] [Commented] (YARN-2194) Cgroups cease to work in RHEL7
[ https://issues.apache.org/jira/browse/YARN-2194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14568005#comment-14568005 ]

Matthew Jacobs commented on YARN-2194:
--------------------------------------

While this may work for the default RHEL7 configuration, this will break if someone happens to have mounted the same controllers to /sys/fs/cgroup/cpuacct,cpu, or if the user mounted other controllers at the same path as well. What do you think about creating the symlink from /sys/fs/cgroup/cpu to the mounted path for cpu in all cases (unless it was actually mounted at /sys/fs/cgroup/cpu, of course)?
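
The suggestion above depends on discovering where the cpu controller is actually mounted rather than assuming a fixed directory name. A minimal sketch of that lookup follows, assuming cgroup v1 entries in /proc/mounts format, where the controller names appear among the mount options. The sample mount table and the function names are illustrative and do not come from any of the patches.

```python
from typing import Optional

# Illustrative /proc/mounts-style text; a real system would read /proc/mounts.
SAMPLE_MOUNTS = """\
tmpfs /sys/fs/cgroup tmpfs ro,nosuid,nodev,noexec,mode=755 0 0
cgroup /sys/fs/cgroup/cpu,cpuacct cgroup rw,nosuid,nodev,noexec,relatime,cpuacct,cpu 0 0
cgroup /sys/fs/cgroup/memory cgroup rw,nosuid,nodev,noexec,relatime,memory 0 0
"""


def find_controller_mount(mounts_text: str, controller: str) -> Optional[str]:
    """Return the mount point of the cgroup hierarchy that carries `controller`."""
    for line in mounts_text.splitlines():
        fields = line.split()
        # Fields: device, mount point, filesystem type, options, dump, pass.
        if len(fields) < 4 or fields[2] != "cgroup":
            continue
        if controller in fields[3].split(","):
            return fields[1]
    return None


def symlink_target(mounts_text: str, controller: str = "cpu",
                   expected: str = "/sys/fs/cgroup/cpu") -> Optional[str]:
    """Return the path a symlink at `expected` should point to, or None when the
    controller is not mounted or is already mounted at `expected`."""
    mount_point = find_controller_mount(mounts_text, controller)
    if mount_point is None or mount_point == expected:
        return None
    return mount_point


if __name__ == "__main__":
    print(symlink_target(SAMPLE_MOUNTS))
```

Because the lookup matches on mount options instead of the directory name, it gives the same answer whether the hierarchy is mounted at cpu,cpuacct, at cpuacct,cpu, or alongside other controllers.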