[jira] [Commented] (YARN-3119) Memory limit check need not be enforced unless aggregate usage of all containers is near limit
[ https://issues.apache.org/jira/browse/YARN-3119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14588429#comment-14588429 ] Chris Douglas commented on YARN-3119:

Systems that embrace more forgiving resource enforcement are difficult to tune, particularly when jobs run in multiple environments with different constraints (as is common when moving from research/test to production). If jobs silently and implicitly use more resources than requested, then users only learn that their container is under-provisioned when the cluster workload shifts and their pipelines start to fail. I agree with [~aw]'s [feedback|https://issues.apache.org/jira/browse/YARN-3119?focusedCommentId=14303956&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14303956]. If this workaround is committed, it should be disabled by default and strongly discouraged.

> Memory limit check need not be enforced unless aggregate usage of all
> containers is near limit
>
>                 Key: YARN-3119
>                 URL: https://issues.apache.org/jira/browse/YARN-3119
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: nodemanager
>            Reporter: Anubhav Dhoot
>            Assignee: Anubhav Dhoot
>         Attachments: YARN-3119.prelim.patch
>
> Today we kill any container preemptively even if the total usage of
> containers on that node is well within the limit for YARN. Instead, if we
> enforce the memory limit only when the total usage of all containers is
> close to some configurable ratio of the overall memory assigned to
> containers, we can allow flexibility in container memory usage without
> adverse effects. This is similar in principle to how cgroups uses
> soft_limit_in_bytes.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3119) Memory limit check need not be enforced unless aggregate usage of all containers is near limit
[ https://issues.apache.org/jira/browse/YARN-3119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14304225#comment-14304225 ] Wangda Tan commented on YARN-3119:

[~vinodkv], that makes sense to me. We could even relax the restriction (or make it configurable) of {{kill the over-limit container if it exceeds scheduler.maximum-allocation-mb of the cluster}} if we enforce {{only let containers grow when there is capacity that is not allocated to any container}}. Maybe we need to introduce a "resource tracker" in the NM to track used resource / allocated resource.

[~adhoot], I think what [~vinodkv] suggested is not to make total_mem_usage_check configurable, because it would make it much easier for a container to hit an OOM exception even while its usage is under its allocated resource, which is bad behavior.
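The per-node "resource tracker" floated above might look roughly like this. This is a hypothetical sketch: the class name, methods, and MB-based accounting are invented for illustration, not NodeManager code.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch of a per-node resource tracker the NM could consult
// before letting a container grow beyond its allocation. All figures in MB.
public class NodeResourceTracker {
    private final long nodeCapacityMb;
    private final AtomicLong allocatedMb = new AtomicLong();
    private final AtomicLong usedMb = new AtomicLong();

    public NodeResourceTracker(long nodeCapacityMb) {
        this.nodeCapacityMb = nodeCapacityMb;
    }

    public void allocate(long mb)    { allocatedMb.addAndGet(mb); }   // container started
    public void release(long mb)     { allocatedMb.addAndGet(-mb); }  // container finished
    public void reportUsage(long mb) { usedMb.set(mb); }              // from monitoring

    /** Capacity not allocated to any container: the only headroom growth may use. */
    public long unallocatedMb() {
        return nodeCapacityMb - allocatedMb.get();
    }

    /** May a container grow by extraMb without eating into other allocations? */
    public boolean canGrow(long extraMb) {
        return extraMb <= unallocatedMb();
    }
}
```

Tracking allocated separately from used is the point: growth is judged against unallocated capacity, not merely unused capacity, so other containers' reservations stay safe.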
[jira] [Commented] (YARN-3119) Memory limit check need not be enforced unless aggregate usage of all containers is near limit
[ https://issues.apache.org/jira/browse/YARN-3119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14304170#comment-14304170 ] Anubhav Dhoot commented on YARN-3119:

Thanks for the comments. Just a quick response on configurability: in my prelim patch I did add an option to turn this off, and it is off by default, though I could use a better name (DEFAULT_TOTAL_MEM_USAGE_CHECK_ENABLED is set to false). I also added a configurable threshold that controls how much of the overall YARN container memory may be in use before we trigger enforcement. (Setting it to 0 would be another way to revert to the old behavior.)
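The enforcement decision described above can be sketched as follows. The method and parameter names are hypothetical; the real flag and threshold keys live in the prelim patch.

```java
// Hypothetical sketch of the configurable check: enforce per-container
// limits only when aggregate usage crosses a ratio of the node's total
// container memory. Disabling the check, or setting the ratio to 0,
// restores the old always-enforce behavior.
public class EnforcementPolicy {
    public static boolean shouldEnforce(boolean checkEnabled,
                                        long aggregateUsageMb,
                                        long nodeContainerMemMb,
                                        double thresholdRatio) {
        if (!checkEnabled) {
            return true; // feature off: always enforce, the old behavior
        }
        // Enforce only once aggregate usage nears the configured ratio.
        return aggregateUsageMb >= thresholdRatio * nodeContainerMemMb;
    }
}
```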
[jira] [Commented] (YARN-3119) Memory limit check need not be enforced unless aggregate usage of all containers is near limit
[ https://issues.apache.org/jira/browse/YARN-3119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14304157#comment-14304157 ] Vinod Kumar Vavilapalli commented on YARN-3119:

bq. IMHO, this could be problematic if an under-used container (c1) wants to get more resource, but the resource is over-used by another container (c2). It is possible c1 tries to allocate but fails since memory is exhausted, because the NM needs some time to get the resource back (kill c2).

This is the counter use-case that kills the feature; it leads to a pretty bad experience for apps through no fault of theirs. Though I do remember this being proposed back at Yahoo! by [~rajesh.balamohan] a few years ago: it helps improve throughput, albeit with a loss of predictability.

Let's have a flag to turn this off by default. When it is on, we can do better:
- Kill the over-limit container if it exceeds scheduler.maximum-allocation-mb of the cluster.
- Let a container grow beyond its reservation if and only if there is unreserved capacity on the node. This way, we only let containers grow when there is capacity that is not allocated to any container. The limiting constraint is the following:
{code}
Sum of excess capacity used by all containers beyond their individual limits <= Unreserved capacity on the node
{code}
- When aggregate usage crosses the threshold, only kill as many containers as necessary instead of everything.

Anything more that can help? [~rajesh.balamohan] / [~leftnoteasy]?
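The limiting constraint stated above can be expressed as a small check. This is a hypothetical sketch with invented names, not NodeManager code; memory figures are in MB.

```java
import java.util.List;

// Hypothetical sketch of the limiting constraint: containers may
// collectively exceed their individual limits only by the node's
// unreserved (unallocated) capacity.
public class SoftLimitCheck {
    /** Per-container allocation and observed usage, in MB. */
    public record Container(long limitMb, long usageMb) {}

    /**
     * True iff  sum(max(0, usage - limit))  <=  nodeCapacity - sum(limit),
     * i.e. aggregate excess fits within the node's unreserved capacity.
     */
    public static boolean withinSoftLimit(List<Container> containers, long nodeCapacityMb) {
        long allocated = 0, excess = 0;
        for (Container c : containers) {
            allocated += c.limitMb();
            excess += Math.max(0, c.usageMb() - c.limitMb());
        }
        long unreserved = nodeCapacityMb - allocated;
        return excess <= unreserved;
    }
}
```

Note the constraint is over allocations, not usage: a container running under its limit does not donate headroom, because its reservation must remain available to it.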
[jira] [Commented] (YARN-3119) Memory limit check need not be enforced unless aggregate usage of all containers is near limit
[ https://issues.apache.org/jira/browse/YARN-3119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14303956#comment-14303956 ] Allen Wittenauer commented on YARN-3119:

"Snow leopards already have poachers, so it's OK if I kill this one."

I think the only way this would be acceptable is if two things happen:
a) It is configurable, so this functionality can be turned off.
b) The setting in the yarn-default.xml file comes with a big warning saying that it may lead to system instability and/or unpredictability.
[jira] [Commented] (YARN-3119) Memory limit check need not be enforced unless aggregate usage of all containers is near limit
[ https://issues.apache.org/jira/browse/YARN-3119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14301698#comment-14301698 ] Anubhav Dhoot commented on YARN-3119:

Thanks [~leftnoteasy] and [~aw] for your comments. All of the problems listed are already possible with the current memory management model, which uses monitoring rather than cgroups.
[jira] [Commented] (YARN-3119) Memory limit check need not be enforced unless aggregate usage of all containers is near limit
[ https://issues.apache.org/jira/browse/YARN-3119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14299333#comment-14299333 ] Allen Wittenauer commented on YARN-3119:

It's also worth acknowledging that we already have problems with things like Tez, which can and do grow memory usage so fast that it blows up the kernel before YARN can kill it. I'd rather tighten the restrictions using cgroups than loosen them.
[jira] [Commented] (YARN-3119) Memory limit check need not be enforced unless aggregate usage of all containers is near limit
[ https://issues.apache.org/jira/browse/YARN-3119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14299326#comment-14299326 ] Wangda Tan commented on YARN-3119:

IMHO, this could be problematic if an under-used container (c1) wants to get more resource, but the resource is over-used by another container (c2). It is possible that c1 tries to allocate but fails because memory is exhausted, since the NM needs some time to get the resource back (by killing c2).
[jira] [Commented] (YARN-3119) Memory limit check need not be enforced unless aggregate usage of all containers is near limit
[ https://issues.apache.org/jira/browse/YARN-3119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14299304#comment-14299304 ] Anubhav Dhoot commented on YARN-3119:

Scheduling should continue as before. If a new container causes the ratio to be exceeded, we would kill the offending containers; if the limit is not exceeded, the offending containers get a chance to succeed, which can improve the throughput of jobs with skews like this. If multiple containers are over their limit, they are all killed for now. In the future we could be more sophisticated and kill containers in decreasing order of the amount by which they exceed their limit (or by some other criterion) until we go back below the ratio; that would be a good second improvement over this. In general, this jira attempts to make memory a little more of a flexible resource.
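The kill-ordering refinement mentioned as a second improvement could be sketched as below: sort the over-limit containers by how far they exceed their allocation and kill from the top until aggregate usage drops back under the threshold. Class, record, and method names are hypothetical, not patch code.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch: pick the minimal set of worst offenders to kill.
public class KillSelector {
    public record Container(String id, long limitMb, long usageMb) {
        long excessMb() { return Math.max(0, usageMb - limitMb); }
    }

    public static List<String> containersToKill(List<Container> containers, long thresholdMb) {
        long totalUsage = containers.stream().mapToLong(Container::usageMb).sum();
        List<Container> overLimit = new ArrayList<>();
        for (Container c : containers) {
            if (c.excessMb() > 0) overLimit.add(c);
        }
        // Worst offenders first.
        overLimit.sort(Comparator.comparingLong(Container::excessMb).reversed());
        List<String> victims = new ArrayList<>();
        for (Container c : overLimit) {
            if (totalUsage <= thresholdMb) break; // back under the ratio; stop
            victims.add(c.id());
            totalUsage -= c.usageMb();
        }
        return victims;
    }
}
```

Only over-limit containers are ever candidates, so a well-behaved container is never killed to pay for a neighbor's overage.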
[jira] [Commented] (YARN-3119) Memory limit check need not be enforced unless aggregate usage of all containers is near limit
[ https://issues.apache.org/jira/browse/YARN-3119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14298876#comment-14298876 ] Allen Wittenauer commented on YARN-3119:

How should scheduling behave in this scenario? What happens if multiple containers are over their limit, and in what order are containers killed?