[jira] [Commented] (YARN-2962) ZKRMStateStore: Limit the number of znodes under a znode
[ https://issues.apache.org/jira/browse/YARN-2962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14387091#comment-14387091 ] Arun Suresh commented on YARN-2962:
---
[~varun_saxena], Thanks for the patch. It looks pretty good.. also like the extensive testing. A couple of minor comments:
# The {{getLeafAppIdNodePath(String appId)}} and {{getLeafAppIdNodePath(String appId, boolean createIfExists)}} methods look pretty similar. They should probably be refactored so that the former calls the latter with {{createIfExists = false}}?
# Minor indentation errors (I think you are using tabs) in {{loadRMAppState()}} and {{removeApplicationStateInternal()}}.
# In {{removeApplicationStateInternal()}}, can't you call {{getLeafAppIdNodePath}} instead?
Also, I was wondering: should we hard-code the NO_INDEX_SPLITTING logic to 4? Essentially, is it always guaranteed that the sequence number will be exactly 4 digits? I was thinking we could just allow users to specify a split index; if the split index == size of the seqnum, then don't split, etc. Either way, I am ok with the current implementation, just wondering if it was thought of (and I guess it might reduce some if-then checks).

ZKRMStateStore: Limit the number of znodes under a znode
Key: YARN-2962
URL: https://issues.apache.org/jira/browse/YARN-2962
Project: Hadoop YARN
Issue Type: Improvement
Components: resourcemanager
Affects Versions: 2.6.0
Reporter: Karthik Kambatla
Assignee: Varun Saxena
Priority: Critical
Attachments: YARN-2962.01.patch

We ran into this issue where we were hitting the default ZK server message size configs, primarily because the message had too many znodes, even though individually they were all small.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2962) ZKRMStateStore: Limit the number of znodes under a znode
[ https://issues.apache.org/jira/browse/YARN-2962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14390076#comment-14390076 ] Arun Suresh commented on YARN-2962:
---
.. alternative to starting the index from the front.
[jira] [Commented] (YARN-2962) ZKRMStateStore: Limit the number of znodes under a znode
[ https://issues.apache.org/jira/browse/YARN-2962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14390074#comment-14390074 ] Arun Suresh commented on YARN-2962:
---
Yup.. agreed, start index from the end is a better ..
[jira] [Updated] (YARN-2962) ZKRMStateStore: Limit the number of znodes under a znode
[ https://issues.apache.org/jira/browse/YARN-2962?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Suresh updated YARN-2962:
--
Attachment: YARN-2962.3.patch
Updating with a few minor fixes.
[jira] [Commented] (YARN-3485) FairScheduler headroom calculation doesn't consider maxResources for Fifo and FairShare policies
[ https://issues.apache.org/jira/browse/YARN-3485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14516013#comment-14516013 ] Arun Suresh commented on YARN-3485:
---
+1, LGTM. Thanks for working on this [~kasha]

FairScheduler headroom calculation doesn't consider maxResources for Fifo and FairShare policies
Key: YARN-3485
URL: https://issues.apache.org/jira/browse/YARN-3485
Project: Hadoop YARN
Issue Type: Bug
Components: fairscheduler
Affects Versions: 2.7.0
Reporter: Karthik Kambatla
Assignee: Karthik Kambatla
Priority: Critical
Attachments: yarn-3485-1.patch, yarn-3485-prelim.patch

FairScheduler's headroom calculations consider the fairshare and cluster-available-resources, and the fairshare has maxResources. However, for the Fifo and FairShare policies, the fairshare is used only for memory and not CPU, so the scheduler ends up showing a higher headroom than is actually available. This could lead to applications waiting for resources far longer than they intend to, e.g. MAPREDUCE-6302.
[jira] [Updated] (YARN-2962) ZKRMStateStore: Limit the number of znodes under a znode
[ https://issues.apache.org/jira/browse/YARN-2962?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Suresh updated YARN-2962:
--
Attachment: YARN-2962.2.patch
Updating Patch:
* Rebased with trunk
* As discussed in the comments, the split index is now evaluated from the end of the appId.
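The end-anchored split just described can be sketched as follows. This is a minimal illustration, not the actual YARN-2962 patch: the method name {{splitAppIdPath}} and the exact no-split behavior are assumptions.

```java
// Hypothetical sketch of splitting an appId's sequence number from the END,
// as discussed in this JIRA. With splitIndex = 2, the last 2 digits become
// the leaf znode and the rest the parent znode, so up to 100 apps share one
// parent. Names here are assumptions, not the real patch's identifiers.
class AppIdZnodeSplit {
    /** Returns {parentZnode, leafZnode}, or {appId} when no split applies. */
    static String[] splitAppIdPath(String appId, int splitIndex) {
        int us = appId.lastIndexOf('_');
        String prefix = appId.substring(0, us + 1); // "application_<cluster-ts>_"
        String seq = appId.substring(us + 1);       // e.g. "0001"
        if (splitIndex <= 0 || splitIndex >= seq.length()) {
            return new String[] { appId };          // store flat, no hierarchy
        }
        int cut = seq.length() - splitIndex;        // count digits from the end
        return new String[] { prefix + seq.substring(0, cut), seq.substring(cut) };
    }
}
```

Splitting from the end keeps the scheme valid even when the sequence number grows beyond 4 digits, which is what made the hard-coded front split fragile.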
[jira] [Commented] (YARN-2962) ZKRMStateStore: Limit the number of znodes under a znode
[ https://issues.apache.org/jira/browse/YARN-2962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14526289#comment-14526289 ] Arun Suresh commented on YARN-2962:
---
bq. we can simplify it without special hierarchies by having RM recursively read nodes all the time. As we already specifically look for application_ prefix to read app-data, having application_1234_0456 right next to a 00/ directory is simply going to work without much complexity

Hmmm.. Currently, while reading, the code expects only leaf nodes to have data. We could modify it to continue to child nodes while loading RMState, but updates to an app state would require some thought. Consider updating the state of app id _100: the update code would have to first check both the /.._100 and /.._1/00 znodes. Also, retrieving state during load_all and update_single might be hairy.. there can be ambiguous paths, since a node path might not be unique across the 2 schemes. For e.g., /.._1 will exist in both the new and the old scheme. In the old scheme it can contain data, but in the new scheme it shouldn't (it is an intermediate node for /.._1/\[00-99\])..

Although option 2 can be done, I'd prefer your first suggestion (storing under RM_APP_ROOT/hierarchies). We can have the RM read the old style, but new apps and updates to old apps will go under the new root. We can even delete the old scheme root if no children exist.
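The dual-scheme lookup discussed above can be sketched roughly like this. The path constants and the fixed 2-digit leaf are assumptions for illustration, not the layout the final patch uses; a ZK `exists` check is stood in for by a plain set.

```java
import java.util.Set;

// Hypothetical sketch of resolving an app's znode across the two schemes
// discussed above: prefer the new RM_APP_ROOT/HIERARCHIES root, fall back
// to the old flat layout for apps persisted before the upgrade. The path
// constants and 2-digit split are assumptions, not the actual patch.
class AppZnodePathResolver {
    static final String RM_APP_ROOT = "/rmstore/RMAppRoot";

    /** existingPaths stands in for ZooKeeper exists() checks in this sketch. */
    static String resolve(String appId, Set<String> existingPaths) {
        int us = appId.lastIndexOf('_');
        String prefix = appId.substring(0, us + 1);
        String seq = appId.substring(us + 1);
        int cut = seq.length() - 2; // assume a 2-digit leaf split
        String hierPath = RM_APP_ROOT + "/HIERARCHIES/2/"
            + prefix + seq.substring(0, cut) + "/" + seq.substring(cut);
        String flatPath = RM_APP_ROOT + "/" + appId;
        return existingPaths.contains(hierPath) ? hierPath : flatPath;
    }
}
```

Keeping the hierarchy under a separate root means a path like /.._1 is never ambiguous: under the old root it is always an app znode with data, under HIERARCHIES it is always an intermediate node.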
[jira] [Updated] (YARN-3547) FairScheduler: Apps that have no resource demand should not participate scheduling
[ https://issues.apache.org/jira/browse/YARN-3547?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Suresh updated YARN-3547:
--
Labels: (was: BB2015-05-TBR)

FairScheduler: Apps that have no resource demand should not participate scheduling
Key: YARN-3547
URL: https://issues.apache.org/jira/browse/YARN-3547
Project: Hadoop YARN
Issue Type: Improvement
Components: fairscheduler
Reporter: Xianyin Xin
Assignee: Xianyin Xin
Attachments: YARN-3547.001.patch, YARN-3547.002.patch, YARN-3547.003.patch, YARN-3547.004.patch

At present, all of the 'running' apps participate in the scheduling process; however, most of them may have no resource demand on a production cluster, as an app's status is 'running' rather than waiting for resources for most of its lifetime. It is not wise to sort all of the 'running' apps and try to fulfill them, especially on a large-scale cluster with a heavy scheduling load.
[jira] [Updated] (YARN-3547) FairScheduler: Apps that have no resource demand should not participate scheduling
[ https://issues.apache.org/jira/browse/YARN-3547?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Suresh updated YARN-3547:
--
Labels: BB2015-05-TBR (was: )
[jira] [Updated] (YARN-3395) [Fair Scheduler] Handle the user name correctly when user name is used as default queue name.
[ https://issues.apache.org/jira/browse/YARN-3395?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Suresh updated YARN-3395:
--
Labels: (was: BB2015-05-TBR)

[Fair Scheduler] Handle the user name correctly when user name is used as default queue name.
Key: YARN-3395
URL: https://issues.apache.org/jira/browse/YARN-3395
Project: Hadoop YARN
Issue Type: Bug
Components: fairscheduler
Reporter: zhihai xu
Assignee: zhihai xu
Attachments: YARN-3395.000.patch

Handle the user name correctly when the user name is used as the default queue name in the fair scheduler. It would be better to remove the trailing and leading whitespace of the user name when we use it as the default queue name; otherwise it will be rejected with an InvalidQueueNameException from QueueManager. I think it is reasonable to make this change, because we already do special handling for '.' in the user name.
[jira] [Commented] (YARN-3395) [Fair Scheduler] Handle the user name correctly when user name is used as default queue name.
[ https://issues.apache.org/jira/browse/YARN-3395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14535070#comment-14535070 ] Arun Suresh commented on YARN-3395:
---
Thanks for the patch [~zxu], just one minor nit: since you already do a name.trim() once, we don't need to do it again before returning.
[jira] [Updated] (YARN-2684) FairScheduler should tolerate queue configuration changes across RM restarts
[ https://issues.apache.org/jira/browse/YARN-2684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Suresh updated YARN-2684:
--
Labels: BB2015-05-TBR (was: )

FairScheduler should tolerate queue configuration changes across RM restarts
Key: YARN-2684
URL: https://issues.apache.org/jira/browse/YARN-2684
Project: Hadoop YARN
Issue Type: Bug
Components: fairscheduler, resourcemanager
Affects Versions: 2.5.1
Reporter: Karthik Kambatla
Assignee: Rohith
Priority: Critical
Labels: BB2015-05-TBR
Attachments: 0001-YARN-2684.patch, 0002-YARN-2684.patch

YARN-2308 fixes this issue for CS; this JIRA is to fix it for FS.
[jira] [Updated] (YARN-2684) FairScheduler should if queue is removed during app recovery
[ https://issues.apache.org/jira/browse/YARN-2684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Suresh updated YARN-2684:
--
Summary: FairScheduler should if queue is removed during app recovery (was: FairScheduler should tolerate queue configuration changes across RM restarts)
[jira] [Updated] (YARN-2684) FairScheduler should fast-fail if queue is removed during app recovery
[ https://issues.apache.org/jira/browse/YARN-2684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Suresh updated YARN-2684:
--
Summary: FairScheduler should fast-fail if queue is removed during app recovery (was: FairScheduler should if queue is removed during app recovery)
[jira] [Updated] (YARN-2684) FairScheduler should tolerate queue configuration changes across RM restarts
[ https://issues.apache.org/jira/browse/YARN-2684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Suresh updated YARN-2684:
--
Labels: (was: BB2015-05-TBR)
[jira] [Updated] (YARN-1297) Miscellaneous Fair Scheduler speedups
[ https://issues.apache.org/jira/browse/YARN-1297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Suresh updated YARN-1297:
--
Attachment: YARN-1297.3.patch
Updating patch.. fixing it up so it rebases to trunk

Miscellaneous Fair Scheduler speedups
Key: YARN-1297
URL: https://issues.apache.org/jira/browse/YARN-1297
Project: Hadoop YARN
Issue Type: Improvement
Components: fairscheduler
Reporter: Sandy Ryza
Assignee: Sandy Ryza
Labels: BB2015-05-TBR
Attachments: YARN-1297-1.patch, YARN-1297-2.patch, YARN-1297.3.patch, YARN-1297.patch, YARN-1297.patch

I ran the Fair Scheduler's core scheduling loop through a profiler tool and identified a bunch of minimally invasive changes that can shave off a few milliseconds. The main one is demoting a couple of INFO log messages to DEBUG, which brought my benchmark down from 16000 ms to 6000 ms. A few others (which had way less of an impact) were:
* Most of the time in comparisons was being spent in Math.signum. I switched this to direct ifs and elses, and it halved the percent of time spent in comparisons.
* I removed some unnecessary instantiations of Resource objects.
* I made it so that queues' usage isn't calculated from the applications up each time getResourceUsage is called.
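The Math.signum point above can be illustrated with a simplified pair of comparison helpers (this is not the actual FairShareComparator code, just the shape of the change):

```java
// Illustration of the Math.signum optimization described above. Both
// methods produce the same ordering; the second avoids the double
// subtraction plus signum call on the scheduler's hot comparison path.
class ShareComparators {
    static int viaSignum(double a, double b) {
        return (int) Math.signum(a - b);
    }
    static int viaIfElse(double a, double b) {
        if (a < b) return -1;
        if (a > b) return 1;
        return 0;
    }
}
```

The if/else form also sidesteps signum's NaN and signed-zero handling, which the comparator never needs.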
[jira] [Updated] (YARN-1297) Miscellaneous Fair Scheduler speedups
[ https://issues.apache.org/jira/browse/YARN-1297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Suresh updated YARN-1297:
--
Attachment: YARN-1297.4.patch
Updating patch to fix the test failure
* Had missed accounting for app container recovery during scheduler recovery.
[jira] [Created] (YARN-3676) Disregard 'assignMultiple' directive while scheduling apps with NODE_LOCAL resource requests
Arun Suresh created YARN-3676:
-
Summary: Disregard 'assignMultiple' directive while scheduling apps with NODE_LOCAL resource requests
Key: YARN-3676
URL: https://issues.apache.org/jira/browse/YARN-3676
Project: Hadoop YARN
Issue Type: Bug
Components: fairscheduler
Reporter: Arun Suresh
Assignee: Arun Suresh

assignMultiple is generally set to false to prevent overloading a node (e.g., new NMs that have just joined). A possible scheduling optimization would be to disregard this directive for apps whose allowed locality is NODE_LOCAL.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3655) FairScheduler: potential livelock due to maxAMShare limitation and container reservation
[ https://issues.apache.org/jira/browse/YARN-3655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14549547#comment-14549547 ] Arun Suresh commented on YARN-3655:
---
Thanks for the patch [~zxu], I was just wondering though.. with your approach, assume the following situation (please correct me if I am wrong):
* We have 3 nodes with, say, 4GB capacity.
* Currently, applications are using up 3GB on each node (assume they are all fairly long-running tasks..).
* At time T1, a new app (appX) is added, and requires 2GB.
* At some time T2, the next allocation event happens (after all nodes have sent a heartbeat.. or after a continuousScheduling attempt), and a reservation of 2GB is made on each node for appX.
* At some time T3, during the next allocation event, as per your patch, the reservation for appX will be removed from ALL nodes..
* Thus reservations for appX will flip-flop on all nodes. It is possible that, during a period when there is no reservation for appX, other apps with a 1GB requirement might come in and be scheduled on the cluster... thereby starving appX.

FairScheduler: potential livelock due to maxAMShare limitation and container reservation
Key: YARN-3655
URL: https://issues.apache.org/jira/browse/YARN-3655
Project: Hadoop YARN
Issue Type: Bug
Components: fairscheduler
Affects Versions: 2.7.0
Reporter: zhihai xu
Assignee: zhihai xu
Attachments: YARN-3655.000.patch, YARN-3655.001.patch

FairScheduler: potential livelock due to maxAMShare limitation and container reservation. If a node is reserved by an application, all the other applications don't have any chance to assign a new container on this node, unless the application which reserves the node assigns a new container on this node or releases the reserved container on this node.
The problem is that if an application tries to call assignReservedContainer and fails to get a new container due to the maxAMShare limitation, it will block all other applications from using the nodes it reserves. If all other running applications can't release their AM containers because they are blocked by these reserved containers, a livelock situation can happen. The following is the code in FSAppAttempt#assignContainer which can cause this potential livelock:
{code}
// Check the AM resource usage for the leaf queue
if (!isAmRunning() && !getUnmanagedAM()) {
  List<ResourceRequest> ask = appSchedulingInfo.getAllResourceRequests();
  if (ask.isEmpty() || !getQueue().canRunAppAM(ask.get(0).getCapability())) {
    if (LOG.isDebugEnabled()) {
      LOG.debug("Skipping allocation because maxAMShare limit would "
          + "be exceeded");
    }
    return Resources.none();
  }
}
{code}
To fix this issue, we can unreserve the node if we can't allocate the AM container on the node due to the max AM share limitation and the node is reserved by the application.
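The fix proposed in the description could look roughly like the sketch below. The types and method names are simplified stand-ins, not the real FSAppAttempt/FSSchedulerNode API, and the actual patch may structure this differently.

```java
// Rough sketch of the fix proposed in the description: when the AM container
// cannot be allocated because of the maxAMShare limit AND this app holds the
// reservation on the node, drop the reservation so other apps can use the
// node instead of pinning it (avoiding the livelock). All names here are
// simplified stand-ins for the real FairScheduler internals.
class AmShareCheckSketch {
    interface Node {
        boolean isReservedBy(String appId);
        void unreserve(String appId);
    }

    static class FakeNode implements Node {
        String reservedBy;
        FakeNode(String r) { reservedBy = r; }
        public boolean isReservedBy(String id) { return id.equals(reservedBy); }
        public void unreserve(String id) { reservedBy = null; }
    }

    /** Returns false (skip allocation) when the AM share would be exceeded. */
    static boolean tryAllocateAm(String appId, Node node, boolean amShareExceeded) {
        if (amShareExceeded) {
            if (node.isReservedBy(appId)) {
                node.unreserve(appId); // free the node instead of holding it
            }
            return false;
        }
        return true;
    }
}
```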
[jira] [Commented] (YARN-3633) With Fair Scheduler, cluster can logjam when there are too many queues
[ https://issues.apache.org/jira/browse/YARN-3633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14549949#comment-14549949 ] Arun Suresh commented on YARN-3633:
---
Thanks for the patch [~ragarwal]. Assuming we allow, as per the patch, the first AM to be scheduled, then, as per the example you specified in the description, the AM will take up 3GB in a 5GB queue... Presuming each worker task requires more resources than the AM (I am guessing this should be true for most cases), then no other task can be scheduled on that queue, and the remaining queues are anyway log-jammed since the maxAMShare logic would kick in. Wondering if it's a valid scenario..

With Fair Scheduler, cluster can logjam when there are too many queues
Key: YARN-3633
URL: https://issues.apache.org/jira/browse/YARN-3633
Project: Hadoop YARN
Issue Type: Bug
Components: fairscheduler
Affects Versions: 2.6.0
Reporter: Rohit Agarwal
Assignee: Rohit Agarwal
Priority: Critical
Attachments: YARN-3633.patch

It's possible to logjam a cluster by submitting many applications at once in different queues. For example, let's say there is a cluster with 20GB of total memory, and 4 users submit applications at the same time. The fair share of each queue is 5GB. Let's say that maxAMShare is 0.5, so each queue has at most 2.5GB of memory for AMs. If all the users request AMs of size 3GB, the cluster logjams: nothing gets scheduled even when 20GB of resources are available.
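The arithmetic in the description can be made concrete with a small helper (an illustration only; FairScheduler does not expose methods with these names):

```java
// Worked version of the arithmetic in the YARN-3633 description: a 20GB
// cluster split into 4 equal queues with maxAMShare = 0.5 leaves each queue
// at most 2.5GB for AMs, so a 3GB AM fits nowhere even on an idle cluster.
// Method names are illustrative, not FairScheduler API.
class AmHeadroom {
    static double amHeadroomGb(double clusterGb, int queues, double maxAmShare) {
        double fairShare = clusterGb / queues; // 5GB per queue in the example
        return fairShare * maxAmShare;         // 2.5GB available for AMs
    }
    static boolean amFits(double amGb, double clusterGb, int queues, double maxAmShare) {
        return amGb <= amHeadroomGb(clusterGb, queues, maxAmShare);
    }
}
```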
[jira] [Updated] (YARN-1297) Miscellaneous Fair Scheduler speedups
[ https://issues.apache.org/jira/browse/YARN-1297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Suresh updated YARN-1297:
--
Labels: (was: BB2015-05-TBR)
[jira] [Updated] (YARN-1297) Miscellaneous Fair Scheduler speedups
[ https://issues.apache.org/jira/browse/YARN-1297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Suresh updated YARN-1297:
--
Attachment: YARN-1297.4.patch
attaching again to kick off jenkins
[jira] [Commented] (YARN-2962) ZKRMStateStore: Limit the number of znodes under a znode
[ https://issues.apache.org/jira/browse/YARN-2962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14499939#comment-14499939 ] Arun Suresh commented on YARN-2962:
---
[~varun_saxena], wondering if you need any help with this. Would like to get this in soon.
[jira] [Created] (YARN-3470) Make PermissionStatusFormat public
Arun Suresh created YARN-3470:
-
Summary: Make PermissionStatusFormat public
Key: YARN-3470
URL: https://issues.apache.org/jira/browse/YARN-3470
Project: Hadoop YARN
Issue Type: Bug
Reporter: Arun Suresh
Priority: Minor

Implementations of {{INodeAttributeProvider}} are required to provide an implementation of the {{getPermissionLong()}} method. Unfortunately, the long permission format is an encoding of the user, group and mode, with each field converted to an int using {{SerialNumberManager}}, which is package protected. It would thus be nice to make the {{PermissionStatusFormat}} enum public (and also make the {{toLong()}} static method public) so that user-specified implementations of {{INodeAttributeProvider}} may use it. This would also make it more consistent with {{AclStatusFormat}}, which I guess has been made public for the same reason.
[jira] [Updated] (YARN-3676) Disregard 'assignMultiple' directive while scheduling apps with NODE_LOCAL resource requests
[ https://issues.apache.org/jira/browse/YARN-3676?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Suresh updated YARN-3676:
--
Attachment: YARN-3676.1.patch
Attaching initial patch to validate approach.. will include test-case shortly
[jira] [Created] (YARN-3691) Limit number of reservations for an app
Arun Suresh created YARN-3691:
-
Summary: Limit number of reservations for an app
Key: YARN-3691
URL: https://issues.apache.org/jira/browse/YARN-3691
Project: Hadoop YARN
Issue Type: Bug
Components: fairscheduler
Reporter: Arun Suresh

Currently, it is possible to reserve resources for an app on all nodes. Limiting this to just a number of nodes (or a ratio of the total cluster size) would improve utilization of the cluster and reduce the possibility of starving other apps.
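The ratio-based cap proposed here could be checked as simply as the sketch below (hypothetical names; no such method exists in FairScheduler today — that is what this JIRA proposes):

```java
// Hypothetical sketch of the cap proposed in YARN-3691: allow a new
// reservation for an app only while its reserved-node count stays below a
// configured ratio of the cluster size, with a floor of 1 node so small
// clusters can still reserve. Names are illustrative.
class ReservationCap {
    static boolean mayReserve(int reservedNodes, int clusterNodes, double maxRatio) {
        int cap = Math.max(1, (int) (clusterNodes * maxRatio));
        return reservedNodes < cap;
    }
}
```

With a 5% ratio on a 100-node cluster, an app could hold at most 5 reservations instead of one per node, which bounds the starvation scenario described in YARN-3655.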
[jira] [Assigned] (YARN-3691) Limit number of reservations for an app
[ https://issues.apache.org/jira/browse/YARN-3691?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Suresh reassigned YARN-3691:
-
Assignee: Arun Suresh
[jira] [Commented] (YARN-3762) FairScheduler: CME on FSParentQueue#getQueueUserAclInfo
[ https://issues.apache.org/jira/browse/YARN-3762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14571512#comment-14571512 ] Arun Suresh commented on YARN-3762: --- Makes sense +1, LGTM FairScheduler: CME on FSParentQueue#getQueueUserAclInfo --- Key: YARN-3762 URL: https://issues.apache.org/jira/browse/YARN-3762 Project: Hadoop YARN Issue Type: Bug Components: fairscheduler Affects Versions: 2.7.0 Reporter: Karthik Kambatla Assignee: Karthik Kambatla Priority: Critical Attachments: yarn-3762-1.patch, yarn-3762-1.patch, yarn-3762-2.patch In our testing, we ran into the following ConcurrentModificationException: {noformat} halxg.cloudera.com:8042, nodeRackName/rackvb07, nodeNumContainers0 15/05/22 13:02:22 INFO distributedshell.Client: Queue info, queueName=root.testyarnpool3, queueCurrentCapacity=0.0, queueMaxCapacity=-1.0, queueApplicationCount=0, queueChildQueueCount=0 15/05/22 13:02:22 FATAL distributedshell.Client: Error running Client java.util.ConcurrentModificationException: java.util.ConcurrentModificationException at java.util.ArrayList$Itr.checkForComodification(ArrayList.java:901) at java.util.ArrayList$Itr.next(ArrayList.java:851) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSParentQueue.getQueueUserAclInfo(FSParentQueue.java:155) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.getQueueUserAclInfo(FairScheduler.java:1395) at org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getQueueUserAcls(ClientRMService.java:880) {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
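For context on the stack trace above: the exception comes from ArrayList's fail-fast iterator, which can be reproduced, and avoided by iterating over a snapshot, in a few lines of plain Java. This is a standalone illustration, not YARN code:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.List;

// Minimal illustration of the CME seen in FSParentQueue#getQueueUserAclInfo:
// structurally modifying an ArrayList while an iterator over it is live
// trips the iterator's modCount check on the next next() call.
public class CmeDemo {
    public static List<String> sampleQueues() {
        return new ArrayList<>(Arrays.asList("root.a", "root.b"));
    }

    public static boolean mutateWhileIterating(List<String> queues) {
        try {
            for (String q : queues) {
                if (q.equals("root.a")) {
                    queues.add("root.c"); // structural change mid-iteration
                }
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true; // the fail-fast check fired
        }
    }

    // One common fix: iterate over a snapshot so mutation is harmless.
    public static int countOverSnapshot(List<String> queues) {
        List<String> snapshot = new ArrayList<>(queues);
        int n = 0;
        for (String q : snapshot) {
            queues.add(q + ".copy"); // no longer breaks the loop
            n++;
        }
        return n;
    }
}
```

The actual fix in the patch guards the child-queue list with locking; the snapshot trick above is simply the cheapest way to see why the unguarded loop fails.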
[jira] [Updated] (YARN-3453) Fair Scheduler : Parts of preemption logic uses DefaultResourceCalculator even in DRF mode causing thrashing
[ https://issues.apache.org/jira/browse/YARN-3453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Suresh updated YARN-3453: -- Attachment: YARN-3453.2.patch Agreed. Updating the patch with your suggestion. Fair Scheduler : Parts of preemption logic uses DefaultResourceCalculator even in DRF mode causing thrashing Key: YARN-3453 URL: https://issues.apache.org/jira/browse/YARN-3453 Project: Hadoop YARN Issue Type: Bug Components: fairscheduler Affects Versions: 2.6.0 Reporter: Ashwin Shankar Assignee: Arun Suresh Attachments: YARN-3453.1.patch, YARN-3453.2.patch There are two places in the preemption code flow where DefaultResourceCalculator is used, even in DRF mode, which basically results in more resources getting preempted than needed; those extra preempted containers aren’t even getting to the “starved” queue, since the scheduling logic is based on DRF's Calculator. Following are the two places: 1. {code:title=FSLeafQueue.java|borderStyle=solid} private boolean isStarved(Resource share) {code} A queue shouldn’t be marked as “starved” if the dominant resource usage is >= fair/minshare. 2. {code:title=FairScheduler.java|borderStyle=solid} protected Resource resToPreempt(FSLeafQueue sched, long curTime) {code} -- One more thing that I believe needs to change in DRF mode is: during a preemption round, if preempting a few containers results in satisfying the needs of a resource type, then we should exit that preemption round, since the containers that we just preempted should bring the dominant resource usage to min/fair share. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (YARN-3453) Fair Scheduler : Parts of preemption logic uses DefaultResourceCalculator even in DRF mode causing thrashing
[ https://issues.apache.org/jira/browse/YARN-3453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Suresh reassigned YARN-3453: - Assignee: Arun Suresh Fair Scheduler : Parts of preemption logic uses DefaultResourceCalculator even in DRF mode causing thrashing Key: YARN-3453 URL: https://issues.apache.org/jira/browse/YARN-3453 Project: Hadoop YARN Issue Type: Bug Components: fairscheduler Affects Versions: 2.6.0 Reporter: Ashwin Shankar Assignee: Arun Suresh There are two places in the preemption code flow where DefaultResourceCalculator is used, even in DRF mode, which basically results in more resources getting preempted than needed; those extra preempted containers aren’t even getting to the “starved” queue, since the scheduling logic is based on DRF's Calculator. Following are the two places: 1. {code:title=FSLeafQueue.java|borderStyle=solid} private boolean isStarved(Resource share) {code} A queue shouldn’t be marked as “starved” if the dominant resource usage is >= fair/minshare. 2. {code:title=FairScheduler.java|borderStyle=solid} protected Resource resToPreempt(FSLeafQueue sched, long curTime) {code} -- One more thing that I believe needs to change in DRF mode is: during a preemption round, if preempting a few containers results in satisfying the needs of a resource type, then we should exit that preemption round, since the containers that we just preempted should bring the dominant resource usage to min/fair share. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3453) Fair Scheduler : Parts of preemption logic uses DefaultResourceCalculator even in DRF mode causing thrashing
[ https://issues.apache.org/jira/browse/YARN-3453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Suresh updated YARN-3453: -- Attachment: YARN-3453.1.patch [~peng.zhang], [~ashwinshankar77], Thank you for reporting this, and for the associated discussion. I vote that we: # fix the {{isStarved()}} method to use the correct Calculator # fix the {{resToPreempt()}} method to use componentWiseMin for the target, but defer using the {{targetRatio}}, since it is probably an optimization and can be addressed in a future JIRA. I have attached a preliminary patch that does this; will upload one with test cases shortly. Fair Scheduler : Parts of preemption logic uses DefaultResourceCalculator even in DRF mode causing thrashing Key: YARN-3453 URL: https://issues.apache.org/jira/browse/YARN-3453 Project: Hadoop YARN Issue Type: Bug Components: fairscheduler Affects Versions: 2.6.0 Reporter: Ashwin Shankar Assignee: Arun Suresh Attachments: YARN-3453.1.patch There are two places in the preemption code flow where DefaultResourceCalculator is used, even in DRF mode, which basically results in more resources getting preempted than needed; those extra preempted containers aren’t even getting to the “starved” queue, since the scheduling logic is based on DRF's Calculator. Following are the two places: 1. {code:title=FSLeafQueue.java|borderStyle=solid} private boolean isStarved(Resource share) {code} A queue shouldn’t be marked as “starved” if the dominant resource usage is >= fair/minshare. 2. {code:title=FairScheduler.java|borderStyle=solid} protected Resource resToPreempt(FSLeafQueue sched, long curTime) {code} -- One more thing that I believe needs to change in DRF mode is: during a preemption round, if preempting a few containers results in satisfying the needs of a resource type, then we should exit that preemption round, since the containers that we just preempted should bring the dominant resource usage to min/fair share.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
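The componentWiseMin and dominant-share ideas discussed in YARN-3453 can be sketched as below. The types and method names are illustrative stand-ins for YARN's Resources/DominantResourceCalculator utilities, with resources modeled as (memory, vcores) pairs:

```java
// Illustrative sketch, not YARN's actual API: component-wise minimum of two
// resource vectors, and a DRF-style starvation check that compares the
// *dominant* share against the fair share rather than memory alone.
public class DrfSketch {
    // resources are (memoryMb, vcores) pairs
    public static long[] componentWiseMin(long[] a, long[] b) {
        return new long[] { Math.min(a[0], b[0]), Math.min(a[1], b[1]) };
    }

    // Dominant share: the larger of the per-component usage ratios.
    public static double dominantShare(long[] usage, long[] cluster) {
        return Math.max((double) usage[0] / cluster[0],
                        (double) usage[1] / cluster[1]);
    }

    // A queue is starved only if its dominant share is below its fair share's
    // dominant share -- the check the DefaultResourceCalculator gets wrong
    // by looking at memory only.
    public static boolean isStarved(long[] usage, long[] fairShare, long[] cluster) {
        return dominantShare(usage, cluster) < dominantShare(fairShare, cluster);
    }
}
```

Under this check, a queue holding half the cluster's memory is not "starved" even if its vcore usage is tiny, which is exactly the thrashing scenario the issue describes.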
[jira] [Commented] (YARN-3762) FairScheduler: CME on FSParentQueue#getQueueUserAclInfo
[ https://issues.apache.org/jira/browse/YARN-3762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14571409#comment-14571409 ] Arun Suresh commented on YARN-3762: --- [~kasha], Thanks for the patch. * In the {{assignContainer}} method, you are sorting the collection under a write lock and then assigning the container within the read lock; what happens if the collection is modified in between? Shouldn't we have the write lock encompass both operations? I agree it will not lead to the Exception (which is the point of the JIRA), but I feel it should be done for correctness. * One other possible improvement: instead of using a List and sorting it every time, maybe we could use a Sorted Bag (MultiSet), which keeps the elements in sorted order? FairScheduler: CME on FSParentQueue#getQueueUserAclInfo --- Key: YARN-3762 URL: https://issues.apache.org/jira/browse/YARN-3762 Project: Hadoop YARN Issue Type: Bug Components: fairscheduler Affects Versions: 2.7.0 Reporter: Karthik Kambatla Assignee: Karthik Kambatla Priority: Critical Attachments: yarn-3762-1.patch, yarn-3762-1.patch In our testing, we ran into the following ConcurrentModificationException: {noformat} halxg.cloudera.com:8042, nodeRackName/rackvb07, nodeNumContainers0 15/05/22 13:02:22 INFO distributedshell.Client: Queue info, queueName=root.testyarnpool3, queueCurrentCapacity=0.0, queueMaxCapacity=-1.0, queueApplicationCount=0, queueChildQueueCount=0 15/05/22 13:02:22 FATAL distributedshell.Client: Error running Client java.util.ConcurrentModificationException: java.util.ConcurrentModificationException at java.util.ArrayList$Itr.checkForComodification(ArrayList.java:901) at java.util.ArrayList$Itr.next(ArrayList.java:851) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSParentQueue.getQueueUserAclInfo(FSParentQueue.java:155) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.getQueueUserAclInfo(FairScheduler.java:1395) at org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getQueueUserAcls(ClientRMService.java:880) {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
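The "Sorted Bag (MultiSet)" suggestion in the review comment, keeping apps ordered as they are added and removed instead of re-sorting a List on every scheduling pass, could look roughly like this. Guava's TreeMultiset offers this directly; the sketch below is a dependency-free stand-in keyed by a single integer demand for simplicity:

```java
import java.util.TreeMap;

// Sketch of the sorted-multiset idea: counts kept in a TreeMap stay ordered
// on every add/remove, so reads need no sort. (Guava's TreeMultiset is the
// obvious library choice; this stand-in avoids the dependency.)
public class SortedBag {
    private final TreeMap<Integer, Integer> counts = new TreeMap<>();
    private int size = 0;

    public void add(int demand) {
        counts.merge(demand, 1, Integer::sum);
        size++;
    }

    public void remove(int demand) {
        Integer c = counts.get(demand);
        if (c == null) {
            return; // not present; nothing to do
        }
        if (c == 1) {
            counts.remove(demand);
        } else {
            counts.put(demand, c - 1);
        }
        size--;
    }

    // Highest-demand element, with no sort at read time.
    public int first() {
        return counts.lastKey();
    }

    public int size() {
        return size;
    }
}
```

The trade-off is O(log n) per add/remove instead of an O(n log n) sort per scheduling pass, which only pays off when passes greatly outnumber membership changes.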
[jira] [Commented] (YARN-2154) FairScheduler: Improve preemption to preempt only those containers that would satisfy the incoming request
[ https://issues.apache.org/jira/browse/YARN-2154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14615228#comment-14615228 ] Arun Suresh commented on YARN-2154: --- This looks like it is required for a complete solution to YARN-3453. [~kasha], Thanks for proposing this. If you are not working on this actively, do you mind if I take it up? I have a couple of ideas. FairScheduler: Improve preemption to preempt only those containers that would satisfy the incoming request -- Key: YARN-2154 URL: https://issues.apache.org/jira/browse/YARN-2154 Project: Hadoop YARN Issue Type: Improvement Components: fairscheduler Affects Versions: 2.4.0 Reporter: Karthik Kambatla Assignee: Karthik Kambatla Priority: Critical Today, FairScheduler uses a spray-gun approach to preemption. Instead, it should only preempt resources that would satisfy the incoming request. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3676) Disregard 'assignMultiple' directive while scheduling apps with NODE_LOCAL resource requests
[ https://issues.apache.org/jira/browse/YARN-3676?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Suresh updated YARN-3676: -- Attachment: YARN-3676.5.patch Disregard 'assignMultiple' directive while scheduling apps with NODE_LOCAL resource requests Key: YARN-3676 URL: https://issues.apache.org/jira/browse/YARN-3676 Project: Hadoop YARN Issue Type: Bug Components: fairscheduler Reporter: Arun Suresh Assignee: Arun Suresh Attachments: YARN-3676.1.patch, YARN-3676.2.patch, YARN-3676.3.patch, YARN-3676.4.patch, YARN-3676.5.patch AssignMultiple is generally set to false to prevent overloading a Node (for eg, new NMs that have just joined) A possible scheduling optimization would be to disregard this directive for apps whose allowed locality is NODE_LOCAL -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3655) FairScheduler: potential livelock due to maxAMShare limitation and container reservation
[ https://issues.apache.org/jira/browse/YARN-3655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14554745#comment-14554745 ] Arun Suresh commented on YARN-3655: --- Thanks for the update [~zxu]. I have only one nit: in the {{okToUnreserve}} method, do we also need to do the {{!hasContainerForNode}} check? I assume the patch was meant to unreserve those containers which can fit on a node, but we put a limit on this if it exceeds maxAMShare. FairScheduler: potential livelock due to maxAMShare limitation and container reservation - Key: YARN-3655 URL: https://issues.apache.org/jira/browse/YARN-3655 Project: Hadoop YARN Issue Type: Bug Components: fairscheduler Affects Versions: 2.7.0 Reporter: zhihai xu Assignee: zhihai xu Attachments: YARN-3655.000.patch, YARN-3655.001.patch, YARN-3655.002.patch FairScheduler: potential livelock due to maxAMShare limitation and container reservation. If a node is reserved by an application, all the other applications don't have any chance to assign a new container on this node, unless the application which reserves the node assigns a new container on it or releases the reserved container. The problem is that if an application tries to call assignReservedContainer and fails to get a new container due to the maxAMShare limitation, it will block all other applications from using the nodes it reserves. If all the other running applications can't release their AM containers because they are blocked by these reserved containers, a livelock situation can happen. The following is the code at FSAppAttempt#assignContainer which can cause this potential livelock.
{code}
// Check the AM resource usage for the leaf queue
if (!isAmRunning() && !getUnmanagedAM()) {
  List<ResourceRequest> ask = appSchedulingInfo.getAllResourceRequests();
  if (ask.isEmpty() || !getQueue().canRunAppAM(ask.get(0).getCapability())) {
    if (LOG.isDebugEnabled()) {
      LOG.debug("Skipping allocation because maxAMShare limit would"
          + " be exceeded");
    }
    return Resources.none();
  }
}
{code}
To fix this issue, we can unreserve the node if we can't allocate the AM container on the node due to the maxAMShare limitation and the node is reserved by the application. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
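The proposed fix, dropping the reservation when the reserved app's AM can never start due to maxAMShare, might be sketched as below. The toy methods and parameters stand in for FSAppAttempt/FSLeafQueue logic and are assumptions, not the actual patch:

```java
// Hedged sketch of the YARN-3655 fix idea, with toy parameters standing in
// for FSAppAttempt/FSLeafQueue state: when an AM allocation on a node this
// app has reserved is rejected by the maxAMShare check, release the
// reservation so other applications can use the node.
public class AmShareUnreserve {
    // Queue-level AM admission check: used AM resources plus the new AM must
    // stay within maxAMShare of the queue's fair share (memory only, for the
    // sketch).
    public static boolean canRunAppAM(long amDemandMb, long queueUsedAmMb,
                                      long queueFairShareMb, double maxAMShare) {
        return queueUsedAmMb + amDemandMb <= maxAMShare * queueFairShareMb;
    }

    // True when the app should release its reservation on the node: it holds
    // the reservation, but its AM cannot start there because the queue's AM
    // share is exhausted -- the livelock ingredient described above.
    public static boolean shouldUnreserve(boolean nodeReservedByThisApp,
                                          boolean isAmContainer,
                                          long amDemandMb, long queueUsedAmMb,
                                          long queueFairShareMb, double maxAMShare) {
        return nodeReservedByThisApp && isAmContainer
            && !canRunAppAM(amDemandMb, queueUsedAmMb, queueFairShareMb, maxAMShare);
    }
}
```

The key property is that a reservation the app can never convert into an allocation is released instead of held indefinitely.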
[jira] [Commented] (YARN-3675) FairScheduler: RM quits when node removal races with continousscheduling on the same node
[ https://issues.apache.org/jira/browse/YARN-3675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14554814#comment-14554814 ] Arun Suresh commented on YARN-3675: --- Thanks for the patch [~adhoot], +1, LGTM FairScheduler: RM quits when node removal races with continousscheduling on the same node - Key: YARN-3675 URL: https://issues.apache.org/jira/browse/YARN-3675 Project: Hadoop YARN Issue Type: Bug Components: fairscheduler Reporter: Anubhav Dhoot Assignee: Anubhav Dhoot Attachments: YARN-3675.001.patch, YARN-3675.002.patch, YARN-3675.003.patch With continuous scheduling, scheduling can be done on a node thats just removed causing errors like below. {noformat} 12:28:53.782 AM FATAL org.apache.hadoop.yarn.server.resourcemanager.ResourceManager Error in handling event type APP_ATTEMPT_REMOVED to the scheduler java.lang.NullPointerException at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttempt.unreserve(FSAppAttempt.java:469) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.completedContainer(FairScheduler.java:815) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.removeApplicationAttempt(FairScheduler.java:763) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:1217) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:111) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:684) at java.lang.Thread.run(Thread.java:745) 12:28:53.783 AMINFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager Exiting, bbye.. {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3676) Disregard 'assignMultiple' directive while scheduling apps with NODE_LOCAL resource requests
[ https://issues.apache.org/jira/browse/YARN-3676?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Suresh updated YARN-3676: -- Attachment: YARN-3676.3.patch Updating patch with testcase Disregard 'assignMultiple' directive while scheduling apps with NODE_LOCAL resource requests Key: YARN-3676 URL: https://issues.apache.org/jira/browse/YARN-3676 Project: Hadoop YARN Issue Type: Bug Components: fairscheduler Reporter: Arun Suresh Assignee: Arun Suresh Attachments: YARN-3676.1.patch, YARN-3676.2.patch, YARN-3676.3.patch AssignMultiple is generally set to false to prevent overloading a Node (for eg, new NMs that have just joined) A possible scheduling optimization would be to disregard this directive for apps whose allowed locality is NODE_LOCAL -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3655) FairScheduler: potential livelock due to maxAMShare limitation and container reservation
[ https://issues.apache.org/jira/browse/YARN-3655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1466#comment-1466 ] Arun Suresh commented on YARN-3655: --- Makes sense. +1 from me; will commit unless [~kasha] has any comments. FairScheduler: potential livelock due to maxAMShare limitation and container reservation - Key: YARN-3655 URL: https://issues.apache.org/jira/browse/YARN-3655 Project: Hadoop YARN Issue Type: Bug Components: fairscheduler Affects Versions: 2.7.0 Reporter: zhihai xu Assignee: zhihai xu Attachments: YARN-3655.000.patch, YARN-3655.001.patch, YARN-3655.002.patch FairScheduler: potential livelock due to maxAMShare limitation and container reservation. If a node is reserved by an application, all the other applications don't have any chance to assign a new container on this node, unless the application which reserves the node assigns a new container on it or releases the reserved container. The problem is that if an application tries to call assignReservedContainer and fails to get a new container due to the maxAMShare limitation, it will block all other applications from using the nodes it reserves. If all the other running applications can't release their AM containers because they are blocked by these reserved containers, a livelock situation can happen. The following is the code at FSAppAttempt#assignContainer which can cause this potential livelock.
{code}
// Check the AM resource usage for the leaf queue
if (!isAmRunning() && !getUnmanagedAM()) {
  List<ResourceRequest> ask = appSchedulingInfo.getAllResourceRequests();
  if (ask.isEmpty() || !getQueue().canRunAppAM(ask.get(0).getCapability())) {
    if (LOG.isDebugEnabled()) {
      LOG.debug("Skipping allocation because maxAMShare limit would"
          + " be exceeded");
    }
    return Resources.none();
  }
}
{code}
To fix this issue, we can unreserve the node if we can't allocate the AM container on the node due to the maxAMShare limitation and the node is reserved by the application.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3676) Disregard 'assignMultiple' directive while scheduling apps with NODE_LOCAL resource requests
[ https://issues.apache.org/jira/browse/YARN-3676?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Suresh updated YARN-3676: -- Attachment: YARN-3676.4.patch Updating patch * Changed the accounting logic a bit * Improved test case * Fixed whitespace error Disregard 'assignMultiple' directive while scheduling apps with NODE_LOCAL resource requests Key: YARN-3676 URL: https://issues.apache.org/jira/browse/YARN-3676 Project: Hadoop YARN Issue Type: Bug Components: fairscheduler Reporter: Arun Suresh Assignee: Arun Suresh Attachments: YARN-3676.1.patch, YARN-3676.2.patch, YARN-3676.3.patch, YARN-3676.4.patch AssignMultiple is generally set to false to prevent overloading a Node (for eg, new NMs that have just joined) A possible scheduling optimization would be to disregard this directive for apps whose allowed locality is NODE_LOCAL -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3676) Disregard 'assignMultiple' directive while scheduling apps with NODE_LOCAL resource requests
[ https://issues.apache.org/jira/browse/YARN-3676?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Suresh updated YARN-3676: -- Attachment: YARN-3676.2.patch Reattaching patch. Disregard 'assignMultiple' directive while scheduling apps with NODE_LOCAL resource requests Key: YARN-3676 URL: https://issues.apache.org/jira/browse/YARN-3676 Project: Hadoop YARN Issue Type: Bug Components: fairscheduler Reporter: Arun Suresh Assignee: Arun Suresh Attachments: YARN-3676.1.patch, YARN-3676.2.patch AssignMultiple is generally set to false to prevent overloading a Node (for eg, new NMs that have just joined) A possible scheduling optimization would be to disregard this directive for apps whose allowed locality is NODE_LOCAL -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3453) Fair Scheduler : Parts of preemption logic uses DefaultResourceCalculator even in DRF mode causing thrashing
[ https://issues.apache.org/jira/browse/YARN-3453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14615236#comment-14615236 ] Arun Suresh commented on YARN-3453: --- Apologies for the delay, and thanks for the review [~ashwinshankar77]. bq. Why are we not using componentwisemin here? Yup, this can be changed to componentwisemin. bq. FairScheduler.preemptResources() uses DefaultResourceCalculator and hence would look at only memory. Agreed. And as you mentioned, I think we will have to fix YARN-2154 too. Fair Scheduler : Parts of preemption logic uses DefaultResourceCalculator even in DRF mode causing thrashing Key: YARN-3453 URL: https://issues.apache.org/jira/browse/YARN-3453 Project: Hadoop YARN Issue Type: Bug Components: fairscheduler Affects Versions: 2.6.0 Reporter: Ashwin Shankar Assignee: Arun Suresh Attachments: YARN-3453.1.patch, YARN-3453.2.patch There are two places in the preemption code flow where DefaultResourceCalculator is used, even in DRF mode, which basically results in more resources getting preempted than needed; those extra preempted containers aren’t even getting to the “starved” queue, since the scheduling logic is based on DRF's Calculator. Following are the two places: 1. {code:title=FSLeafQueue.java|borderStyle=solid} private boolean isStarved(Resource share) {code} A queue shouldn’t be marked as “starved” if the dominant resource usage is >= fair/minshare. 2. {code:title=FairScheduler.java|borderStyle=solid} protected Resource resToPreempt(FSLeafQueue sched, long curTime) {code} -- One more thing that I believe needs to change in DRF mode is: during a preemption round, if preempting a few containers results in satisfying the needs of a resource type, then we should exit that preemption round, since the containers that we just preempted should bring the dominant resource usage to min/fair share. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4066) Large number of queues choke fair scheduler
[ https://issues.apache.org/jira/browse/YARN-4066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14704320#comment-14704320 ] Arun Suresh commented on YARN-4066: --- [~johang], Thanks for reporting this. I don't think the patch has been successfully uploaded though. Kindly re-attach. Large number of queues choke fair scheduler --- Key: YARN-4066 URL: https://issues.apache.org/jira/browse/YARN-4066 Project: Hadoop YARN Issue Type: Bug Components: fairscheduler Affects Versions: 2.7.1 Reporter: Johan Gustavsson Due to synchronization and all the loops performed during queue creation, setting up a large number of queues (12000+) will completely choke the scheduler. To deal with this, some optimization of QueueManager.updateAllocationConfiguration(AllocationConfiguration queueConf) should be done to reduce the number of unnecessary loops. The attached patch has been tested to work with at least 96000 queues. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-221) NM should provide a way for AM to tell it not to aggregate logs.
[ https://issues.apache.org/jira/browse/YARN-221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14708318#comment-14708318 ] Arun Suresh commented on YARN-221: -- Looks like trunk does not compile correctly after this.. NM should provide a way for AM to tell it not to aggregate logs. Key: YARN-221 URL: https://issues.apache.org/jira/browse/YARN-221 Project: Hadoop YARN Issue Type: Sub-task Components: log-aggregation, nodemanager Reporter: Robert Joseph Evans Assignee: Ming Ma Fix For: 2.8.0 Attachments: YARN-221-6.patch, YARN-221-7.patch, YARN-221-8.patch, YARN-221-9.patch, YARN-221-trunk-v1.patch, YARN-221-trunk-v2.patch, YARN-221-trunk-v3.patch, YARN-221-trunk-v4.patch, YARN-221-trunk-v5.patch The NodeManager should provide a way for an AM to tell it that either the logs should not be aggregated, that they should be aggregated with a high priority, or that they should be aggregated but with a lower priority. The AM should be able to do this in the ContainerLaunch context to provide a default value, but should also be able to update the value when the container is released. This would allow for the NM to not aggregate logs in some cases, and avoid connection to the NN at all. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4066) Large number of queues choke fair scheduler
[ https://issues.apache.org/jira/browse/YARN-4066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14705094#comment-14705094 ] Arun Suresh commented on YARN-4066: --- +1 pending Jenkins Large number of queues choke fair scheduler --- Key: YARN-4066 URL: https://issues.apache.org/jira/browse/YARN-4066 Project: Hadoop YARN Issue Type: Bug Components: fairscheduler Affects Versions: 2.7.1 Reporter: Johan Gustavsson Attachments: yarn-4066-1.patch Due to synchronization and all the loops performed during queue creation, setting up a large number of queues (12000+) will completely choke the scheduler. To deal with this, some optimization of QueueManager.updateAllocationConfiguration(AllocationConfiguration queueConf) should be done to reduce the number of unnecessary loops. The attached patch has been tested to work with at least 96000 queues. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3633) With Fair Scheduler, cluster can logjam when there are too many queues
[ https://issues.apache.org/jira/browse/YARN-3633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14609119#comment-14609119 ] Arun Suresh commented on YARN-3633: --- [~ragarwal], Was just wondering, w.r.t. the scenario you mentioned [here|https://issues.apache.org/jira/browse/YARN-3633?focusedCommentId=14542895&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14542895]: isn't it possible that AM4 can remain unscheduled (starved) until AM1, AM2, or AM3 completes? Basically, containers started by AM1/2/3 might start and end, but until an application itself completes, AM4 will not be scheduled, right? With Fair Scheduler, cluster can logjam when there are too many queues -- Key: YARN-3633 URL: https://issues.apache.org/jira/browse/YARN-3633 Project: Hadoop YARN Issue Type: Bug Components: fairscheduler Affects Versions: 2.6.0 Reporter: Rohit Agarwal Assignee: Rohit Agarwal Priority: Critical Attachments: YARN-3633-1.patch, YARN-3633.patch It's possible to logjam a cluster by submitting many applications at once in different queues. For example, let's say there is a cluster with 20GB of total memory. Let's say 4 users submit applications at the same time. The fair share of each queue is 5GB. Let's say that maxAMShare is 0.5. So, each queue has at most 2.5GB memory for AMs. If all the users requested AMs of size 3GB - the cluster logjams. Nothing gets scheduled even when 20GB of resources are available. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3633) With Fair Scheduler, cluster can logjam when there are too many queues
[ https://issues.apache.org/jira/browse/YARN-3633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14609278#comment-14609278 ] Arun Suresh commented on YARN-3633: --- I guess the line that was introduced needs to be synchronized (and I guess we need to do the same for {{removeApp}}, where we are subtracting), given that you are adding/subtracting from totalAmResourceUsage defined in the {{FairScheduler}}, and considering that {{Resources#addTo/subtractFrom}} actually performs a get and a set (and the value can change in between if some other AM is added/removed, possibly during a concurrently running continuous scheduling attempt). With Fair Scheduler, cluster can logjam when there are too many queues -- Key: YARN-3633 URL: https://issues.apache.org/jira/browse/YARN-3633 Project: Hadoop YARN Issue Type: Bug Components: fairscheduler Affects Versions: 2.6.0 Reporter: Rohit Agarwal Assignee: Rohit Agarwal Priority: Critical Attachments: YARN-3633-1.patch, YARN-3633.patch It's possible to logjam a cluster by submitting many applications at once in different queues. For example, let's say there is a cluster with 20GB of total memory. Let's say 4 users submit applications at the same time. The fair share of each queue is 5GB. Let's say that maxAMShare is 0.5. So, each queue has at most 2.5GB memory for AMs. If all the users requested AMs of size 3GB - the cluster logjams. Nothing gets scheduled even when 20GB of resources are available. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3633) With Fair Scheduler, cluster can logjam when there are too many queues
[ https://issues.apache.org/jira/browse/YARN-3633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14609175#comment-14609175 ] Arun Suresh commented on YARN-3633: --- Agreed. So, just so that I'm on the same page: the {{clusterMaxAMShare}} is essentially acting as an upper limit, right? Can we have the default be negative, to preserve the current behavior by default? Otherwise the patch looks good, although in {{addAMResourceUsage}} I think we should synchronize the part where we add the {{amResource}} to the scheduler's total AM resource usage field. +1 after that. With Fair Scheduler, cluster can logjam when there are too many queues -- Key: YARN-3633 URL: https://issues.apache.org/jira/browse/YARN-3633 Project: Hadoop YARN Issue Type: Bug Components: fairscheduler Affects Versions: 2.6.0 Reporter: Rohit Agarwal Assignee: Rohit Agarwal Priority: Critical Attachments: YARN-3633-1.patch, YARN-3633.patch It's possible to logjam a cluster by submitting many applications at once in different queues. For example, let's say there is a cluster with 20GB of total memory. Let's say 4 users submit applications at the same time. The fair share of each queue is 5GB. Let's say that maxAMShare is 0.5. So, each queue has at most 2.5GB memory for AMs. If all the users requested AMs of size 3GB - the cluster logjams. Nothing gets scheduled even when 20GB of resources are available. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
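The synchronization concern raised in these comments, namely that Resources#addTo amounts to a non-atomic get-then-set on a shared total, can be illustrated with a guarded accounting class. Field and method names are made up for the sketch:

```java
// Sketch of the synchronization point discussed above: concurrent AM
// add/remove events doing unguarded read-modify-write on a shared total can
// lose updates. Holding one lock across both the limit check and the update
// also keeps the cluster-wide AM cap consistent. Names are illustrative,
// not the FairScheduler's actual fields.
public class AmUsageAccounting {
    private long totalAmResourceMb = 0;  // shared, scheduler-wide total

    // Guarded: the check and the addition see a consistent value.
    public synchronized boolean tryAddAmUsage(long amMb, long clusterMb,
                                              double clusterMaxAMShare) {
        if (totalAmResourceMb + amMb > clusterMaxAMShare * clusterMb) {
            return false;  // would exceed the cluster-wide AM cap
        }
        totalAmResourceMb += amMb;
        return true;
    }

    public synchronized void removeAmUsage(long amMb) {
        totalAmResourceMb -= amMb;
    }

    public synchronized long total() {
        return totalAmResourceMb;
    }
}
```

Without the shared lock, two AM launches racing through the check could both pass it and together overshoot the cap, which is exactly the lost-update hazard the comment points at.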
[jira] [Commented] (YARN-3920) FairScheduler Reserving a node for a container should be configurable to allow it used only for large containers
[ https://issues.apache.org/jira/browse/YARN-3920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14647045#comment-14647045 ] Arun Suresh commented on YARN-3920: --- Thanks for the patch [~adhoot]. The patch looks pretty straightforward to me and the test case looks good. My only minor comment: maybe we can expose this as an absolute value rather than a ratio, and the {{isReservable()}} function would just take min(ReservationThreshold, MaxCapability). I am ok either way though. +1 pending the above decision. FairScheduler Reserving a node for a container should be configurable to allow it used only for large containers Key: YARN-3920 URL: https://issues.apache.org/jira/browse/YARN-3920 Project: Hadoop YARN Issue Type: Improvement Components: fairscheduler Reporter: Anubhav Dhoot Assignee: Anubhav Dhoot Attachments: yARN-3920.001.patch Reserving a node for a container was designed to prevent large containers from being starved by small requests that keep landing on a node. Today we allow this to be used even for small container requests. This has a huge impact on scheduling, since we block other scheduling requests until that reservation is fulfilled. We should make this configurable so its impact can be minimized, by limiting it to large container requests as originally intended. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
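The absolute-threshold alternative floated in the comment could look roughly like this; names are illustrative, not the patch's actual configuration keys:

```java
// Sketch of the min(ReservationThreshold, MaxCapability) idea: express the
// reservation threshold as an absolute size, clamped by the cluster's max
// container capability, so only "large" requests may reserve a node.
public class ReservableCheck {
    public static long effectiveThreshold(long configuredThresholdMb,
                                          long maxCapabilityMb) {
        return Math.min(configuredThresholdMb, maxCapabilityMb);
    }

    // A request may hold a node reservation only if it is at least as large
    // as the effective threshold.
    public static boolean isReservable(long requestMb, long configuredThresholdMb,
                                       long maxCapabilityMb) {
        return requestMb >= effectiveThreshold(configuredThresholdMb, maxCapabilityMb);
    }
}
```

Clamping by the max capability keeps a misconfigured threshold (larger than any container could ever be) from disabling reservations entirely.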
[jira] [Commented] (YARN-2005) Blacklisting support for scheduling AMs
[ https://issues.apache.org/jira/browse/YARN-2005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14648488#comment-14648488 ] Arun Suresh commented on YARN-2005: --- Thanks for the patch [~adhoot], Couple of comments: # noBlacklist in DisabledBlacklistManager can be made static final. # {{getNumClusterHosts()}} in AbstractYarnScheduler : Any reason we are creating a new set ? I think returning this.nodes.size() should suffice, right ? # Wouldn't removing from the shared blacklist cause problems if the shared blacklist already contained the blacklisted node ? Blacklisting support for scheduling AMs --- Key: YARN-2005 URL: https://issues.apache.org/jira/browse/YARN-2005 Project: Hadoop YARN Issue Type: Improvement Components: resourcemanager Affects Versions: 0.23.10, 2.4.0 Reporter: Jason Lowe Assignee: Anubhav Dhoot Attachments: YARN-2005.001.patch, YARN-2005.002.patch, YARN-2005.003.patch, YARN-2005.004.patch It would be nice if the RM supported blacklisting a node for an AM launch after the same node fails a configurable number of AM attempts. This would be similar to the blacklisting support for scheduling task attempts in the MapReduce AM but for scheduling AM attempts on the RM side. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
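One way the shared-blacklist removal problem raised above is typically avoided is reference counting: a node stays blacklisted while any contributor still has it listed. This is an illustrative sketch, not code from the patch.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch (assumption): a shared blacklist that counts contributors, so
// removing one app's entry does not un-blacklist a node some other app
// has also blacklisted.
class RefCountedBlacklist {
  private final Map<String, Integer> counts = new HashMap<>();

  void add(String node) {
    counts.merge(node, 1, Integer::sum);
  }

  // Decrement; drop the entry only when no contributor remains.
  void remove(String node) {
    counts.computeIfPresent(node, (n, c) -> c > 1 ? c - 1 : null);
  }

  boolean isBlacklisted(String node) {
    return counts.containsKey(node);
  }
}
```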
[jira] [Commented] (YARN-3736) Add RMStateStore apis to store and load accepted reservations for failover
[ https://issues.apache.org/jira/browse/YARN-3736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14648831#comment-14648831 ] Arun Suresh commented on YARN-3736: --- The patch looks mostly good [~adhoot], Some minor comments : # Shouldn't we update the versionInfo from (1, 2) since we are actually modifying the state store API ? # What would be the order of objects stored in the Reservation System root ? Should we not handle garbage collection of non-active plans / reservations ? Think we should be careful of not running into situations like YARN-2962 .. which incidentally is not too much of an issue since we can actually limit the number of active/running applications. Add RMStateStore apis to store and load accepted reservations for failover -- Key: YARN-3736 URL: https://issues.apache.org/jira/browse/YARN-3736 Project: Hadoop YARN Issue Type: Sub-task Components: capacityscheduler, fairscheduler, resourcemanager Reporter: Subru Krishnan Assignee: Anubhav Dhoot Attachments: YARN-3736.001.patch, YARN-3736.001.patch, YARN-3736.002.patch, YARN-3736.003.patch We need to persist the current state of the plan, i.e. the accepted ReservationAllocations corresponding RLESpareseResourceAllocations to the RMStateStore so that we can recover them on RM failover. This involves making all the reservation system data structures protobuf friendly. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3736) Add RMStateStore apis to store and load accepted reservations for failover
[ https://issues.apache.org/jira/browse/YARN-3736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14658765#comment-14658765 ] Arun Suresh commented on YARN-3736: --- Looks good. Thanks for the patch [~adhoot] and the reviews [~subru] Will be committing this shortly.. Add RMStateStore apis to store and load accepted reservations for failover -- Key: YARN-3736 URL: https://issues.apache.org/jira/browse/YARN-3736 Project: Hadoop YARN Issue Type: Sub-task Components: capacityscheduler, fairscheduler, resourcemanager Reporter: Subru Krishnan Assignee: Anubhav Dhoot Attachments: YARN-3736.001.patch, YARN-3736.001.patch, YARN-3736.002.patch, YARN-3736.003.patch, YARN-3736.004.patch, YARN-3736.005.patch We need to persist the current state of the plan, i.e. the accepted ReservationAllocations corresponding RLESpareseResourceAllocations to the RMStateStore so that we can recover them on RM failover. This involves making all the reservation system data structures protobuf friendly. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3736) Add RMStateStore apis to store and load accepted reservations for failover
[ https://issues.apache.org/jira/browse/YARN-3736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14658773#comment-14658773 ] Arun Suresh commented on YARN-3736: --- Committed to trunk and branch-2 Add RMStateStore apis to store and load accepted reservations for failover -- Key: YARN-3736 URL: https://issues.apache.org/jira/browse/YARN-3736 Project: Hadoop YARN Issue Type: Sub-task Components: capacityscheduler, fairscheduler, resourcemanager Reporter: Subru Krishnan Assignee: Anubhav Dhoot Fix For: 2.8.0 Attachments: YARN-3736.001.patch, YARN-3736.001.patch, YARN-3736.002.patch, YARN-3736.003.patch, YARN-3736.004.patch, YARN-3736.005.patch We need to persist the current state of the plan, i.e. the accepted ReservationAllocations corresponding RLESpareseResourceAllocations to the RMStateStore so that we can recover them on RM failover. This involves making all the reservation system data structures protobuf friendly. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3961) Expose queue container information (pending, running, reserved) in REST api and yarn top
[ https://issues.apache.org/jira/browse/YARN-3961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Suresh updated YARN-3961: -- Attachment: YARN-3961.001.patch Re-uploading last patch to kick jenkins Expose queue container information (pending, running, reserved) in REST api and yarn top Key: YARN-3961 URL: https://issues.apache.org/jira/browse/YARN-3961 Project: Hadoop YARN Issue Type: Improvement Components: capacityscheduler, fairscheduler, webapp Reporter: Anubhav Dhoot Assignee: Anubhav Dhoot Attachments: Screen Shot 2015-07-22 at 6.17.38 PM.png, Screen Shot 2015-07-22 at 6.19.31 PM.png, Screen Shot 2015-07-22 at 6.28.05 PM.png, YARN-3961.001.patch, YARN-3961.001.patch It would be nice to expose container (allocated, pending, reserved) information in the rest API and in yarn top tool -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (YARN-4012) Add support more multiple NM types in the SLS
Arun Suresh created YARN-4012: - Summary: Add support more multiple NM types in the SLS Key: YARN-4012 URL: https://issues.apache.org/jira/browse/YARN-4012 Project: Hadoop YARN Issue Type: Bug Components: scheduler-load-simulator Reporter: Arun Suresh Currently the SLS allows users to configure only 1 type of NM. This JIRA proposes configuring multiple pools of different NM configurations (wrt memory and vcores). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (YARN-4012) Add support more multiple NM types in the SLS
[ https://issues.apache.org/jira/browse/YARN-4012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Suresh reassigned YARN-4012: - Assignee: Arun Suresh Add support more multiple NM types in the SLS - Key: YARN-4012 URL: https://issues.apache.org/jira/browse/YARN-4012 Project: Hadoop YARN Issue Type: Bug Components: scheduler-load-simulator Reporter: Arun Suresh Assignee: Arun Suresh Currently the SLS allows users to configure only 1 type of NM. This JIRA proposes configuring multiple pools of different NM configurations (wrt memory and vcores). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-4012) Add support for multiple NM types in the SLS
[ https://issues.apache.org/jira/browse/YARN-4012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Suresh updated YARN-4012: -- Summary: Add support for multiple NM types in the SLS (was: Add support more multiple NM types in the SLS) Add support for multiple NM types in the SLS Key: YARN-4012 URL: https://issues.apache.org/jira/browse/YARN-4012 Project: Hadoop YARN Issue Type: Bug Components: scheduler-load-simulator Reporter: Arun Suresh Assignee: Arun Suresh Currently the SLS allows users to configure only 1 type of NM. This JIRA proposes configuring multiple pools of different NM configurations (wrt memory and vcores). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3961) Expose pending, running and reserved containers of a queue in REST api and yarn top
[ https://issues.apache.org/jira/browse/YARN-3961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Suresh updated YARN-3961: -- Affects Version/s: 2.7.2 Expose pending, running and reserved containers of a queue in REST api and yarn top --- Key: YARN-3961 URL: https://issues.apache.org/jira/browse/YARN-3961 Project: Hadoop YARN Issue Type: Improvement Components: capacityscheduler, fairscheduler, webapp Affects Versions: 2.7.2 Reporter: Anubhav Dhoot Assignee: Anubhav Dhoot Attachments: Screen Shot 2015-07-22 at 6.17.38 PM.png, Screen Shot 2015-07-22 at 6.19.31 PM.png, Screen Shot 2015-07-22 at 6.28.05 PM.png, YARN-3961.001.patch, YARN-3961.001.patch, YARN-3961.002.patch It would be nice to expose container (allocated, pending, reserved) information in the rest API and in yarn top tool -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3961) Expose pending, running and reserved containers of a queue in REST api and yarn top
[ https://issues.apache.org/jira/browse/YARN-3961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14659551#comment-14659551 ] Arun Suresh commented on YARN-3961: --- +1, Thanks for the patch [~adhoot] Committing this shortly.. Expose pending, running and reserved containers of a queue in REST api and yarn top --- Key: YARN-3961 URL: https://issues.apache.org/jira/browse/YARN-3961 Project: Hadoop YARN Issue Type: Improvement Components: capacityscheduler, fairscheduler, webapp Affects Versions: 2.7.2 Reporter: Anubhav Dhoot Assignee: Anubhav Dhoot Attachments: Screen Shot 2015-07-22 at 6.17.38 PM.png, Screen Shot 2015-07-22 at 6.19.31 PM.png, Screen Shot 2015-07-22 at 6.28.05 PM.png, YARN-3961.001.patch, YARN-3961.001.patch, YARN-3961.002.patch It would be nice to expose container (allocated, pending, reserved) information in the rest API and in yarn top tool -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3961) Expose pending, running and reserved containers of a queue in REST api and yarn top
[ https://issues.apache.org/jira/browse/YARN-3961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Suresh updated YARN-3961: -- Summary: Expose pending, running and reserved containers of a queue in REST api and yarn top (was: Expose queue container information (pending, running, reserved) in REST api and yarn top) Expose pending, running and reserved containers of a queue in REST api and yarn top --- Key: YARN-3961 URL: https://issues.apache.org/jira/browse/YARN-3961 Project: Hadoop YARN Issue Type: Improvement Components: capacityscheduler, fairscheduler, webapp Reporter: Anubhav Dhoot Assignee: Anubhav Dhoot Attachments: Screen Shot 2015-07-22 at 6.17.38 PM.png, Screen Shot 2015-07-22 at 6.19.31 PM.png, Screen Shot 2015-07-22 at 6.28.05 PM.png, YARN-3961.001.patch, YARN-3961.001.patch, YARN-3961.002.patch It would be nice to expose container (allocated, pending, reserved) information in the rest API and in yarn top tool -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3176) In Fair Scheduler, child queue should inherit maxApp from its parent
[ https://issues.apache.org/jira/browse/YARN-3176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14654130#comment-14654130 ] Arun Suresh commented on YARN-3176: --- Thanks for the patch [~l201514], The patch itself looks fine. But, currently, I see that quite a lot of queue properties do not inherit from the parent, e.g. Min(Max)Resources, Preemption Timeouts etc. Should we broaden the scope of the JIRA to include these as well ? Also, I was thinking: is it right to simply inherit the maxApp ? The queue in question could hog all the apps and not leave its siblings any. Should we use the queue share to determine the max apps ? Thoughts ? In Fair Scheduler, child queue should inherit maxApp from its parent Key: YARN-3176 URL: https://issues.apache.org/jira/browse/YARN-3176 Project: Hadoop YARN Issue Type: Bug Reporter: Siqi Li Assignee: Siqi Li Attachments: YARN-3176.v1.patch, YARN-3176.v2.patch if the child queue does not have a maxRunningApp limit, it will use the queueMaxAppsDefault. This behavior is not quite right, since queueMaxAppsDefault is normally a small number, whereas some parent queues do have maxRunningApp set to more than the default -- This message was sent by Atlassian JIRA (v6.3.4#6332)
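The share-based inheritance floated above could look something like this. Purely a sketch under assumed semantics; {{inheritedMaxApps}} and its share parameters are hypothetical, not from either patch.

```java
// Sketch (assumption): a child with no explicit maxRunningApps inherits the
// parent's limit scaled by the child's fraction of the parent's fair share,
// so one child cannot consume the parent's entire app budget.
class MaxAppsInheritance {
  static int inheritedMaxApps(int parentMaxApps, double childShare,
      double parentShare) {
    if (parentShare <= 0) {
      return parentMaxApps; // no share info: fall back to plain inheritance
    }
    // Floor to stay within the parent's budget, but always allow at least 1.
    return (int) Math.max(1,
        Math.floor(parentMaxApps * childShare / parentShare));
  }
}
```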
[jira] [Commented] (YARN-3736) Add RMStateStore apis to store and load accepted reservations for failover
[ https://issues.apache.org/jira/browse/YARN-3736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14654252#comment-14654252 ] Arun Suresh commented on YARN-3736: --- The FindBugs link seems to be incorrect. The correct link : https://builds.apache.org/job/PreCommit-YARN-Build/8755/artifact/patchprocess/patchFindbugsWarningshadoop-yarn-server-resourcemanager.html +1 pending addressing it.. Add RMStateStore apis to store and load accepted reservations for failover -- Key: YARN-3736 URL: https://issues.apache.org/jira/browse/YARN-3736 Project: Hadoop YARN Issue Type: Sub-task Components: capacityscheduler, fairscheduler, resourcemanager Reporter: Subru Krishnan Assignee: Anubhav Dhoot Attachments: YARN-3736.001.patch, YARN-3736.001.patch, YARN-3736.002.patch, YARN-3736.003.patch, YARN-3736.004.patch We need to persist the current state of the plan, i.e. the accepted ReservationAllocations corresponding RLESpareseResourceAllocations to the RMStateStore so that we can recover them on RM failover. This involves making all the reservation system data structures protobuf friendly. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3961) Expose pending, running and reserved containers of a queue in REST api and yarn top
[ https://issues.apache.org/jira/browse/YARN-3961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Suresh updated YARN-3961: -- Fix Version/s: 2.8.0 Expose pending, running and reserved containers of a queue in REST api and yarn top --- Key: YARN-3961 URL: https://issues.apache.org/jira/browse/YARN-3961 Project: Hadoop YARN Issue Type: Improvement Components: capacityscheduler, fairscheduler, webapp Affects Versions: 2.7.2 Reporter: Anubhav Dhoot Assignee: Anubhav Dhoot Fix For: 2.8.0 Attachments: Screen Shot 2015-07-22 at 6.17.38 PM.png, Screen Shot 2015-07-22 at 6.19.31 PM.png, Screen Shot 2015-07-22 at 6.28.05 PM.png, YARN-3961.001.patch, YARN-3961.001.patch, YARN-3961.002.patch It would be nice to expose container (allocated, pending, reserved) information in the rest API and in yarn top tool -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3926) Extend the YARN resource model for easier resource-type management and profiles
[ https://issues.apache.org/jira/browse/YARN-3926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14641865#comment-14641865 ] Arun Suresh commented on YARN-3926: --- Thanks for the proposal [~vvasudev] !! Interesting stuff.. Couple of comments from my first read of the proposal : # Instead of Resource.newInstance(Map<ResourceTypeInformation, Long>), can we use a builder pattern, something like : {noformat} ResourceBuilder.dimension(ResourceTypeInformation t1).value(Long v1) .dimension(ResourceTypeInformation t2).value(Long v2) ... .create(); {noformat} # The proposal states that if there is a mismatch between what resource-types.xml contains and what the NM reports, it should shut down. My opinion is that node shut-down should happen only if the node reports fewer types / does not have all "enabled" types in resource-types.xml : same rationale as why nodes should not care whether a resource type is "enabled" or not. If the node reports more types, the extra dimensions are just ignored. Also, in the section where you talk about adding/removing types, you mention that the NM should be upgraded first.. in which case it will start reporting a new type of resource.. and it should be accepted by the RM. # Instead of having to explicitly mark a resource as "countable", can't we just assume that's the default and instead require "uncountable" types to be explicitly specified (once we start supporting them) ? # I really like the profiles idea... In the profiles section, do we really need a separate "yarn.scheduler.profile...name" ? Can't we just set "yarn.scheduler.profiles" to "minimum,maximum,default,small,large" etc ? 
Extend the YARN resource model for easier resource-type management and profiles --- Key: YARN-3926 URL: https://issues.apache.org/jira/browse/YARN-3926 Project: Hadoop YARN Issue Type: New Feature Components: nodemanager, resourcemanager Reporter: Varun Vasudev Assignee: Varun Vasudev Attachments: Proposal for modifying resource model and profiles.pdf Currently, there are efforts to add support for various resource-types such as disk(YARN-2139), network(YARN-2140), and HDFS bandwidth(YARN-2681). These efforts all aim to add support for a new resource type and are fairly involved efforts. In addition, once support is added, it becomes harder for users to specify the resources they need. All existing jobs have to be modified, or have to use the minimum allocation. This ticket is a proposal to extend the YARN resource model to a more flexible model which makes it easier to support additional resource-types. It also considers the related aspect of “resource profiles” which allow users to easily specify the various resources they need for any given container. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
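The fluent builder suggested in the comment above could be fleshed out as below. This is an illustrative sketch of the pattern only; {{ResourceTypeInformation}} is reduced to a plain String name, and none of these names exist in YARN.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch (assumption): a fluent builder pairing each resource dimension
// with its value, as an alternative to passing a pre-built map.
class ResourceBuilder {
  private final Map<String, Long> dims = new LinkedHashMap<>();
  private String pending; // dimension awaiting its value

  ResourceBuilder dimension(String type) {
    pending = type;
    return this;
  }

  ResourceBuilder value(long v) {
    dims.put(pending, v);
    pending = null;
    return this;
  }

  Map<String, Long> create() {
    return new LinkedHashMap<>(dims);
  }
}
```

Usage would read close to the {noformat} block in the comment: {{new ResourceBuilder().dimension("memory-mb").value(4096L).dimension("vcores").value(2L).create()}}.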
[jira] [Updated] (YARN-2154) FairScheduler: Improve preemption to preempt only those containers that would satisfy the incoming request
[ https://issues.apache.org/jira/browse/YARN-2154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Suresh updated YARN-2154: -- Attachment: YARN-2154.1.patch Attaching a proof-of-concept patch. The patch introduces an extra stage in the current preemption logic. Salient points to note : # In the first stage, we iterate through starved LeafQueues and obtain an aggregate of the {{ResourceDeficit}}, which has two fields: ## unmarked resources : the deficit the queue is starved for; essentially, no app can be allocated to the queue due to the deficit ## marked resources : app-specific deficits, viz. node-specific resources that an app is waiting on to launch a container. # In the second stage, we try to match the marked resources obtained in the first step with containers owned by apps that are consuming above their fair/min share. If we find such a container, ## we first see if any app is already reserved on the node hosting the container. ## If not, we reserve the app originating the resource request on the node ## we then place the container in the {{warnedContainers}} list ## we return the totalResources that we reclaimed # In the last stage, we call {{preemptResources}} as before.. with the unmarked resources + the resources reclaimed in the previous stage. At which time, the {{warnedContainers}} list will be iterated over and containers will be killed. TODO: # The matching can happen more efficiently. In the current patch, the first matching container that fits a resource request is targeted for preemption. This can probably be modified to a best-fit algorithm # Fixing test cases. 
FairScheduler: Improve preemption to preempt only those containers that would satisfy the incoming request -- Key: YARN-2154 URL: https://issues.apache.org/jira/browse/YARN-2154 Project: Hadoop YARN Issue Type: Improvement Components: fairscheduler Affects Versions: 2.4.0 Reporter: Karthik Kambatla Assignee: Arun Suresh Priority: Critical Attachments: YARN-2154.1.patch Today, FairScheduler uses a spray-gun approach to preemption. Instead, it should only preempt resources that would satisfy the incoming request. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
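The best-fit TODO in the update above boils down to selecting the smallest container that still covers the request, instead of the first one that fits. A minimal sketch over plain memory sizes; the real matching works on {{Resource}} objects and container metadata.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

// Sketch (assumption): best-fit selection of a preemption victim — among
// containers large enough to satisfy the marked deficit, pick the smallest
// to minimize wasted preempted capacity.
class BestFitMatcher {
  static Optional<Long> bestFit(List<Long> containerSizes, long request) {
    return containerSizes.stream()
        .filter(size -> size >= request)
        .min(Comparator.naturalOrder());
  }
}
```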
[jira] [Commented] (YARN-3535) ResourceRequest should be restored back to scheduler when RMContainer is killed at ALLOCATED
[ https://issues.apache.org/jira/browse/YARN-3535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14629253#comment-14629253 ] Arun Suresh commented on YARN-3535: --- The patch looks good !! Thanks for working on this [~peng.zhang] and [~rohithsharma] +1, Pending successful jenkins run with latest patch ResourceRequest should be restored back to scheduler when RMContainer is killed at ALLOCATED - Key: YARN-3535 URL: https://issues.apache.org/jira/browse/YARN-3535 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.6.0 Reporter: Peng Zhang Assignee: Peng Zhang Priority: Critical Attachments: 0003-YARN-3535.patch, 0004-YARN-3535.patch, 0005-YARN-3535.patch, YARN-3535-001.patch, YARN-3535-002.patch, syslog.tgz, yarn-app.log During rolling update of NM, AM start of container on NM failed. And then job hang there. Attach AM logs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3535) ResourceRequest should be restored back to scheduler when RMContainer is killed at ALLOCATED
[ https://issues.apache.org/jira/browse/YARN-3535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14629249#comment-14629249 ] Arun Suresh commented on YARN-3535: --- I meant for the FairScheduler... but looks like your new patch has it... thanks ResourceRequest should be restored back to scheduler when RMContainer is killed at ALLOCATED - Key: YARN-3535 URL: https://issues.apache.org/jira/browse/YARN-3535 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.6.0 Reporter: Peng Zhang Assignee: Peng Zhang Priority: Critical Attachments: 0003-YARN-3535.patch, 0004-YARN-3535.patch, 0005-YARN-3535.patch, YARN-3535-001.patch, YARN-3535-002.patch, syslog.tgz, yarn-app.log During rolling update of NM, AM start of container on NM failed. And then job hang there. Attach AM logs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3535) ResourceRequest should be restored back to scheduler when RMContainer is killed at ALLOCATED
[ https://issues.apache.org/jira/browse/YARN-3535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14629394#comment-14629394 ] Arun Suresh commented on YARN-3535: --- bq. I think recoverResourceRequest will not be affected by whether container finished event is processed faster. Cause recoverResourceRequest only process the ResourceRequest in container and not care its status. I agree with [~peng.zhang] here. IIUC, The {{recoverResourceRequest}} only affects state of the Scheduler and the SchedulerApp. In any case, the fact that the container is killed (the outcome of the {{RMAppAttemptContainerFinishedEvent}} fired by {{FinishedTransition#transition}}) will be notified to the Scheduler.. and that notification will happen only AFTER the recoverResourceRequest has completed.. since it will be handled by the same dispatcher. ResourceRequest should be restored back to scheduler when RMContainer is killed at ALLOCATED - Key: YARN-3535 URL: https://issues.apache.org/jira/browse/YARN-3535 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.6.0 Reporter: Peng Zhang Assignee: Peng Zhang Priority: Critical Attachments: 0003-YARN-3535.patch, 0004-YARN-3535.patch, 0005-YARN-3535.patch, YARN-3535-001.patch, YARN-3535-002.patch, syslog.tgz, yarn-app.log During rolling update of NM, AM start of container on NM failed. And then job hang there. Attach AM logs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3535) ResourceRequest should be restored back to scheduler when RMContainer is killed at ALLOCATED
[ https://issues.apache.org/jira/browse/YARN-3535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14627370#comment-14627370 ] Arun Suresh commented on YARN-3535: --- Apologies for the late suggestion. [~djp], Correct me if I am wrong here.. I was just looking at YARN-2561. It looks like the basic point of it was to ensure that on a reconnecting node, running containers were properly killed. This is achieved by the node removed and node added event. This happens in the {{if (noRunningApps) ..}} clause of the YARN-2561 patch. But I also see that a later patch also handled the issue by introducing the following code inside the {{else ..}} clause of the above-mentioned if. {noformat} for (ApplicationId appId : reconnectEvent.getRunningApplications()) { handleRunningAppOnNode(rmNode, rmNode.context, appId, rmNode.nodeId); } {noformat} This correctly kills only the running containers and does not do anything to the allocated containers (which I guess should be the case). Given the above, do we still need whatever is contained in the if clause ? Wouldn't removing the if clause just solve this ? Thoughts ? ResourceRequest should be restored back to scheduler when RMContainer is killed at ALLOCATED - Key: YARN-3535 URL: https://issues.apache.org/jira/browse/YARN-3535 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.6.0 Reporter: Peng Zhang Assignee: Peng Zhang Priority: Critical Attachments: 0003-YARN-3535.patch, YARN-3535-001.patch, YARN-3535-002.patch, syslog.tgz, yarn-app.log During rolling update of NM, AM start of container on NM failed. And then job hang there. Attach AM logs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3656) LowCost: A Cost-Based Placement Agent for YARN Reservations
[ https://issues.apache.org/jira/browse/YARN-3656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14627669#comment-14627669 ] Arun Suresh commented on YARN-3656: --- [~imenache], Yup.. makes sense. I guess a possible future improvement (probably beyond the scope of this JIRA) would be to allow declarative configuration of the Planner (via some xml etc. to allow users to string together different PAs). With respect to the algorithm and its implementation, it looks good to me. Minor observation: in {{StageAllocatorLowCostAligned}}, if you extract out how you are creating your {{durationIntervalsSortedByCost}} set (lines 93 - 122) into a separate function, you can probably have different implementations of {{StageAllocatorLowCost}} (Exhaustive / Sample / Aligned) and plug them in via some configuration. +1 otherwise LowCost: A Cost-Based Placement Agent for YARN Reservations --- Key: YARN-3656 URL: https://issues.apache.org/jira/browse/YARN-3656 Project: Hadoop YARN Issue Type: Improvement Components: capacityscheduler, resourcemanager Affects Versions: 2.6.0 Reporter: Ishai Menache Assignee: Jonathan Yaniv Labels: capacity-scheduler, resourcemanager Attachments: LowCostRayonExternal.pdf, YARN-3656-v1.1.patch, YARN-3656-v1.2.patch, YARN-3656-v1.patch, lowcostrayonexternal_v2.pdf YARN-1051 enables SLA support by allowing users to reserve cluster capacity ahead of time. YARN-1710 introduced a greedy agent for placing user reservations. The greedy agent makes fast placement decisions but at the cost of ignoring the cluster committed resources, which might result in blocking the cluster resources for certain periods of time, and in turn rejecting some arriving jobs. We propose LowCost – a new cost-based planning algorithm. LowCost "spreads" the demand of the job throughout the allowed time-window according to a global, load-based cost function. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
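The "plug in via some configuration" idea above is essentially a factory keyed by a config value. A hypothetical sketch; the registry, keys, and {{StageAllocator}} shape here are invented for illustration and do not match the actual reservation-system interfaces.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Sketch (assumption): select among Exhaustive / Sample / Aligned stage
// allocator implementations by a configuration key instead of hard-coding one.
class StageAllocatorFactory {
  interface StageAllocator {
    String name();
  }

  private static final Map<String, Supplier<StageAllocator>> REGISTRY =
      new HashMap<>();

  static void register(String key, Supplier<StageAllocator> supplier) {
    REGISTRY.put(key, supplier);
  }

  static StageAllocator create(String configuredKey) {
    Supplier<StageAllocator> s = REGISTRY.get(configuredKey);
    if (s == null) {
      throw new IllegalArgumentException("Unknown allocator: " + configuredKey);
    }
    return s.get();
  }
}
```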
[jira] [Commented] (YARN-3535) ResourceRequest should be restored back to scheduler when RMContainer is killed at ALLOCATED
[ https://issues.apache.org/jira/browse/YARN-3535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14627581#comment-14627581 ] Arun Suresh commented on YARN-3535: --- bq. .. then on NM restart, running containers should be killed which is currently achieved by if-clause. I am probably missing something... but it looks like this is in fact being done in the else clause. (the code snippet I pasted in my comment [above|https://issues.apache.org/jira/browse/YARN-3535?focusedCommentId=14627370page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14627370]. lines 658 - 660 of RMNodeImpl in trunk). ResourceRequest should be restored back to scheduler when RMContainer is killed at ALLOCATED - Key: YARN-3535 URL: https://issues.apache.org/jira/browse/YARN-3535 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.6.0 Reporter: Peng Zhang Assignee: Peng Zhang Priority: Critical Attachments: 0003-YARN-3535.patch, YARN-3535-001.patch, YARN-3535-002.patch, syslog.tgz, yarn-app.log During rolling update of NM, AM start of container on NM failed. And then job hang there. Attach AM logs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3535) ResourceRequest should be restored back to scheduler when RMContainer is killed at ALLOCATED
[ https://issues.apache.org/jira/browse/YARN-3535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14627602#comment-14627602 ] Arun Suresh commented on YARN-3535: --- makes sense... thanks for clarifying.. ResourceRequest should be restored back to scheduler when RMContainer is killed at ALLOCATED - Key: YARN-3535 URL: https://issues.apache.org/jira/browse/YARN-3535 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.6.0 Reporter: Peng Zhang Assignee: Peng Zhang Priority: Critical Attachments: 0003-YARN-3535.patch, YARN-3535-001.patch, YARN-3535-002.patch, syslog.tgz, yarn-app.log During rolling update of NM, AM start of container on NM failed. And then job hang there. Attach AM logs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3535) ResourceRequest should be restored back to scheduler when RMContainer is killed at ALLOCATED
[ https://issues.apache.org/jira/browse/YARN-3535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14628688#comment-14628688 ] Arun Suresh commented on YARN-3535: --- bq. This jira fix 2. Kill Container event in CS. So removing recoverResourceRequestForContainer(cont); is make sense to me.. Any reason why we don't remove {{recoverResourceRequestForContainer}} from the {{warnOrKillContainer}} method in the FairScheduler ? Won't the above situation happen in the FS as well ? ResourceRequest should be restored back to scheduler when RMContainer is killed at ALLOCATED - Key: YARN-3535 URL: https://issues.apache.org/jira/browse/YARN-3535 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.6.0 Reporter: Peng Zhang Assignee: Peng Zhang Priority: Critical Attachments: 0003-YARN-3535.patch, 0004-YARN-3535.patch, YARN-3535-001.patch, YARN-3535-002.patch, syslog.tgz, yarn-app.log During rolling update of NM, AM start of container on NM failed. And then job hang there. Attach AM logs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3535) ResourceRequest should be restored back to scheduler when RMContainer is killed at ALLOCATED
[ https://issues.apache.org/jira/browse/YARN-3535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14628722#comment-14628722 ] Arun Suresh commented on YARN-3535: --- Also... Is it possible to simulate the 2 cases in the testcase ? ResourceRequest should be restored back to scheduler when RMContainer is killed at ALLOCATED - Key: YARN-3535 URL: https://issues.apache.org/jira/browse/YARN-3535 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.6.0 Reporter: Peng Zhang Assignee: Peng Zhang Priority: Critical Attachments: 0003-YARN-3535.patch, 0004-YARN-3535.patch, YARN-3535-001.patch, YARN-3535-002.patch, syslog.tgz, yarn-app.log During rolling update of NM, AM start of container on NM failed. And then job hang there. Attach AM logs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3535) ResourceRequest should be restored back to scheduler when RMContainer is killed at ALLOCATED
[ https://issues.apache.org/jira/browse/YARN-3535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Suresh updated YARN-3535: -- Component/s: resourcemanager fairscheduler capacityscheduler ResourceRequest should be restored back to scheduler when RMContainer is killed at ALLOCATED - Key: YARN-3535 URL: https://issues.apache.org/jira/browse/YARN-3535 Project: Hadoop YARN Issue Type: Bug Components: capacityscheduler, fairscheduler, resourcemanager Affects Versions: 2.6.0 Reporter: Peng Zhang Assignee: Peng Zhang Priority: Critical Fix For: 2.8.0 Attachments: 0003-YARN-3535.patch, 0004-YARN-3535.patch, 0005-YARN-3535.patch, 0006-YARN-3535.patch, YARN-3535-001.patch, YARN-3535-002.patch, syslog.tgz, yarn-app.log During rolling update of NM, AM start of container on NM failed. And then job hang there. Attach AM logs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3535) ResourceRequest should be restored back to scheduler when RMContainer is killed at ALLOCATED
[ https://issues.apache.org/jira/browse/YARN-3535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Suresh updated YARN-3535: -- Fix Version/s: 2.8.0 ResourceRequest should be restored back to scheduler when RMContainer is killed at ALLOCATED - Key: YARN-3535 URL: https://issues.apache.org/jira/browse/YARN-3535 Project: Hadoop YARN Issue Type: Bug Components: capacityscheduler, fairscheduler, resourcemanager Affects Versions: 2.6.0 Reporter: Peng Zhang Assignee: Peng Zhang Priority: Critical Fix For: 2.8.0 Attachments: 0003-YARN-3535.patch, 0004-YARN-3535.patch, 0005-YARN-3535.patch, 0006-YARN-3535.patch, YARN-3535-001.patch, YARN-3535-002.patch, syslog.tgz, yarn-app.log During rolling update of NM, AM start of container on NM failed. And then job hang there. Attach AM logs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3535) ResourceRequest should be restored back to scheduler when RMContainer is killed at ALLOCATED
[ https://issues.apache.org/jira/browse/YARN-3535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14631205#comment-14631205 ] Arun Suresh commented on YARN-3535: --- +1, Committing this shortly. Thanks to everyone involved. ResourceRequest should be restored back to scheduler when RMContainer is killed at ALLOCATED - Key: YARN-3535 URL: https://issues.apache.org/jira/browse/YARN-3535 Project: Hadoop YARN Issue Type: Bug Components: capacityscheduler, fairscheduler, resourcemanager Affects Versions: 2.6.0 Reporter: Peng Zhang Assignee: Peng Zhang Priority: Critical Fix For: 2.8.0 Attachments: 0003-YARN-3535.patch, 0004-YARN-3535.patch, 0005-YARN-3535.patch, 0006-YARN-3535.patch, YARN-3535-001.patch, YARN-3535-002.patch, syslog.tgz, yarn-app.log During rolling update of NM, AM start of container on NM failed. And then job hang there. Attach AM logs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3535) Scheduler must re-request container resources when RMContainer transitions from ALLOCATED to KILLED
[ https://issues.apache.org/jira/browse/YARN-3535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Suresh updated YARN-3535: -- Summary: Scheduler must re-request container resources when RMContainer transitions from ALLOCATED to KILLED (was: ResourceRequest should be restored back to scheduler when RMContainer is killed at ALLOCATED) Scheduler must re-request container resources when RMContainer transitions from ALLOCATED to KILLED --- Key: YARN-3535 URL: https://issues.apache.org/jira/browse/YARN-3535 Project: Hadoop YARN Issue Type: Bug Components: capacityscheduler, fairscheduler, resourcemanager Affects Versions: 2.6.0 Reporter: Peng Zhang Assignee: Peng Zhang Priority: Critical Fix For: 2.8.0 Attachments: 0003-YARN-3535.patch, 0004-YARN-3535.patch, 0005-YARN-3535.patch, 0006-YARN-3535.patch, YARN-3535-001.patch, YARN-3535-002.patch, syslog.tgz, yarn-app.log During rolling update of NM, AM start of container on NM failed. And then job hang there. Attach AM logs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3453) Fair Scheduler: Parts of preemption logic uses DefaultResourceCalculator even in DRF mode causing thrashing
[ https://issues.apache.org/jira/browse/YARN-3453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Suresh updated YARN-3453: -- Fix Version/s: 2.8.0 Fair Scheduler: Parts of preemption logic uses DefaultResourceCalculator even in DRF mode causing thrashing --- Key: YARN-3453 URL: https://issues.apache.org/jira/browse/YARN-3453 Project: Hadoop YARN Issue Type: Bug Components: fairscheduler Affects Versions: 2.6.0 Reporter: Ashwin Shankar Assignee: Arun Suresh Fix For: 2.8.0 Attachments: YARN-3453.1.patch, YARN-3453.2.patch, YARN-3453.3.patch, YARN-3453.4.patch, YARN-3453.5.patch There are two places in preemption code flow where DefaultResourceCalculator is used, even in DRF mode, which basically results in more resources getting preempted than needed, and those extra preempted containers aren’t even getting to the “starved” queue since scheduling logic is based on DRF's Calculator. Following are the two places : 1. {code:title=FSLeafQueue.java|borderStyle=solid} private boolean isStarved(Resource share) {code} A queue shouldn’t be marked as “starved” if the dominant resource usage is >= fair/minshare. 2. {code:title=FairScheduler.java|borderStyle=solid} protected Resource resToPreempt(FSLeafQueue sched, long curTime) {code} -- One more thing that I believe needs to change in DRF mode is : during a preemption round, if preempting a few containers results in satisfying needs of a resource type, then we should exit that preemption round, since the containers that we just preempted should bring the dominant resource usage to min/fair share. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
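The core of the report is that {{FSLeafQueue.isStarved()}} compares only memory (DefaultResourceCalculator behaviour) even when DRF is configured; under DRF the comparison should be between dominant shares. The following standalone sketch (simplified two-resource model with illustrative names, not the actual FairScheduler code) contrasts the two checks:

```java
// Sketch of the DRF-style check the description argues for: a queue is starved
// only if its *dominant* share of (memory, vcores) is below the dominant share
// of its fair/min share, not merely when its memory usage is below share.
public class DrfStarvationSketch {
    // Dominant share of a usage vector relative to the cluster capacity.
    static double dominantShare(long usedMem, long usedVcores,
                                long clusterMem, long clusterVcores) {
        return Math.max((double) usedMem / clusterMem,
                        (double) usedVcores / clusterVcores);
    }

    // DefaultResourceCalculator behaviour: memory-only comparison.
    static boolean isStarvedMemoryOnly(long usedMem, long shareMem) {
        return usedMem < shareMem;
    }

    // DRF behaviour: compare dominant shares instead.
    static boolean isStarvedDrf(long usedMem, long usedVcores,
                                long shareMem, long shareVcores,
                                long clusterMem, long clusterVcores) {
        return dominantShare(usedMem, usedVcores, clusterMem, clusterVcores)
             < dominantShare(shareMem, shareVcores, clusterMem, clusterVcores);
    }
}
```

For example, on a (100 GB, 100 vcore) cluster, a queue using (40 GB, 60 vcores) with a fair share of (50 GB, 50 vcores) looks starved under the memory-only check (40 < 50) but not under DRF (dominant share 0.6 is already above 0.5) — exactly the over-preemption the description complains about.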
[jira] [Updated] (YARN-3535) Scheduler must re-request container resources when RMContainer transitions from ALLOCATED to KILLED
[ https://issues.apache.org/jira/browse/YARN-3535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Suresh updated YARN-3535: -- Fix Version/s: (was: 2.8.0) 2.7.2 Scheduler must re-request container resources when RMContainer transitions from ALLOCATED to KILLED --- Key: YARN-3535 URL: https://issues.apache.org/jira/browse/YARN-3535 Project: Hadoop YARN Issue Type: Bug Components: capacityscheduler, fairscheduler, resourcemanager Affects Versions: 2.6.0 Reporter: Peng Zhang Assignee: Peng Zhang Priority: Critical Fix For: 2.7.2 Attachments: 0003-YARN-3535.patch, 0004-YARN-3535.patch, 0005-YARN-3535.patch, 0006-YARN-3535.patch, YARN-3535-001.patch, YARN-3535-002.patch, syslog.tgz, yarn-app.log During rolling update of NM, AM start of container on NM failed. And then job hang there. Attach AM logs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3535) Scheduler must re-request container resources when RMContainer transitions from ALLOCATED to KILLED
[ https://issues.apache.org/jira/browse/YARN-3535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Suresh updated YARN-3535: -- Target Version/s: 2.7.2 (was: 2.8.0) Scheduler must re-request container resources when RMContainer transitions from ALLOCATED to KILLED --- Key: YARN-3535 URL: https://issues.apache.org/jira/browse/YARN-3535 Project: Hadoop YARN Issue Type: Bug Components: capacityscheduler, fairscheduler, resourcemanager Affects Versions: 2.6.0 Reporter: Peng Zhang Assignee: Peng Zhang Priority: Critical Fix For: 2.7.2 Attachments: 0003-YARN-3535.patch, 0004-YARN-3535.patch, 0005-YARN-3535.patch, 0006-YARN-3535.patch, YARN-3535-001.patch, YARN-3535-002.patch, syslog.tgz, yarn-app.log During rolling update of NM, AM start of container on NM failed. And then job hang there. Attach AM logs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3535) Scheduler must re-request container resources when RMContainer transitions from ALLOCATED to KILLED
[ https://issues.apache.org/jira/browse/YARN-3535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14632518#comment-14632518 ] Arun Suresh commented on YARN-3535: --- [~jlowe], yup.. I'll check it into the 2.7 branch as well... Scheduler must re-request container resources when RMContainer transitions from ALLOCATED to KILLED --- Key: YARN-3535 URL: https://issues.apache.org/jira/browse/YARN-3535 Project: Hadoop YARN Issue Type: Bug Components: capacityscheduler, fairscheduler, resourcemanager Affects Versions: 2.6.0 Reporter: Peng Zhang Assignee: Peng Zhang Priority: Critical Fix For: 2.8.0 Attachments: 0003-YARN-3535.patch, 0004-YARN-3535.patch, 0005-YARN-3535.patch, 0006-YARN-3535.patch, YARN-3535-001.patch, YARN-3535-002.patch, syslog.tgz, yarn-app.log During rolling update of NM, AM start of container on NM failed. And then job hang there. Attach AM logs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3453) Fair Scheduler: Parts of preemption logic uses DefaultResourceCalculator even in DRF mode causing thrashing
[ https://issues.apache.org/jira/browse/YARN-3453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14625019#comment-14625019 ] Arun Suresh commented on YARN-3453: --- Thanks for the reviews [~kasha], [~ashwinshankar77] and [~peng.zhang] Will be committing this shortly.. Fair Scheduler: Parts of preemption logic uses DefaultResourceCalculator even in DRF mode causing thrashing --- Key: YARN-3453 URL: https://issues.apache.org/jira/browse/YARN-3453 Project: Hadoop YARN Issue Type: Bug Components: fairscheduler Affects Versions: 2.6.0 Reporter: Ashwin Shankar Assignee: Arun Suresh Attachments: YARN-3453.1.patch, YARN-3453.2.patch, YARN-3453.3.patch, YARN-3453.4.patch, YARN-3453.5.patch There are two places in preemption code flow where DefaultResourceCalculator is used, even in DRF mode, which basically results in more resources getting preempted than needed, and those extra preempted containers aren’t even getting to the “starved” queue since scheduling logic is based on DRF's Calculator. Following are the two places : 1. {code:title=FSLeafQueue.java|borderStyle=solid} private boolean isStarved(Resource share) {code} A queue shouldn’t be marked as “starved” if the dominant resource usage is >= fair/minshare. 2. {code:title=FairScheduler.java|borderStyle=solid} protected Resource resToPreempt(FSLeafQueue sched, long curTime) {code} -- One more thing that I believe needs to change in DRF mode is : during a preemption round, if preempting a few containers results in satisfying needs of a resource type, then we should exit that preemption round, since the containers that we just preempted should bring the dominant resource usage to min/fair share. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3535) ResourceRequest should be restored back to scheduler when RMContainer is killed at ALLOCATED
[ https://issues.apache.org/jira/browse/YARN-3535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14625238#comment-14625238 ] Arun Suresh commented on YARN-3535: --- Thanks for working on this [~peng.zhang]. We seem to be hitting this on our scale clusters as well.. so would be good to get this in soon. In our case the NM re-registration was caused by YARN-3842 The Patch looks good to me. Any idea why the tests failed ? ResourceRequest should be restored back to scheduler when RMContainer is killed at ALLOCATED - Key: YARN-3535 URL: https://issues.apache.org/jira/browse/YARN-3535 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.6.0 Reporter: Peng Zhang Assignee: Peng Zhang Priority: Critical Attachments: 0003-YARN-3535.patch, YARN-3535-001.patch, YARN-3535-002.patch, syslog.tgz, yarn-app.log During rolling update of NM, AM start of container on NM failed. And then job hang there. Attach AM logs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2154) FairScheduler: Improve preemption to preempt only those containers that would satisfy the incoming request
[ https://issues.apache.org/jira/browse/YARN-2154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14710324#comment-14710324 ] Arun Suresh commented on YARN-2154: --- Thanks for going through the patch [~adhoot], [~kasha] and [~bpodgursky], bq. ..The previous ordering is better since if you happen to choose something just above its fairShare, after preemption it may go below and cause additional preemption, causing excessive thrashing of resources. This will not happen, as the current patch has a check to preempt a container from an app only if the app stays above its fair/min share. Am still working on the unit tests.. FairScheduler: Improve preemption to preempt only those containers that would satisfy the incoming request -- Key: YARN-2154 URL: https://issues.apache.org/jira/browse/YARN-2154 Project: Hadoop YARN Issue Type: Improvement Components: fairscheduler Affects Versions: 2.4.0 Reporter: Karthik Kambatla Assignee: Arun Suresh Priority: Critical Attachments: YARN-2154.1.patch Today, FairScheduler uses a spray-gun approach to preemption. Instead, it should only preempt resources that would satisfy the incoming request. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
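The guard the comment describes can be sketched as follows: a container is a preemption candidate only if taking it away leaves the owning application at or above its fair share, so a preemption round cannot push an app below fair share and trigger counter-preemption. This is a simplified memory-only model with illustrative names, not the actual FairScheduler patch:

```java
import java.util.List;

// Sketch of "only preempt a container from an app if the app stays above its
// fair/min share" plus "stop once the incoming request is satisfied".
public class PreemptionGuardSketch {
    static class App {
        long usedMem;
        final long fairShareMem;
        App(long usedMem, long fairShareMem) {
            this.usedMem = usedMem;
            this.fairShareMem = fairShareMem;
        }
    }

    // The anti-thrashing check: preempting this container must not drop the
    // app below its fair share.
    static boolean canPreempt(App app, long containerMem) {
        return app.usedMem - containerMem >= app.fairShareMem;
    }

    // Preempt containers (by size) until 'needed' memory is freed, respecting
    // the guard; returns how much was actually freed.
    static long preemptUpTo(App app, List<Long> containerSizes, long needed) {
        long freed = 0;
        for (long size : containerSizes) {
            if (freed >= needed) break;       // stop once the ask is met
            if (canPreempt(app, size)) {
                app.usedMem -= size;
                freed += size;
            }
        }
        return freed;
    }
}
```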
[jira] [Commented] (YARN-3738) Add support for recovery of reserved apps (running under dynamic queues) to Capacity Scheduler
[ https://issues.apache.org/jira/browse/YARN-3738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14973046#comment-14973046 ] Arun Suresh commented on YARN-3738: --- I agree that not much can be done about the checkstyle warning and the tests seem to run fine locally for me as well. Committing this shortly.. > Add support for recovery of reserved apps (running under dynamic queues) to > Capacity Scheduler > -- > > Key: YARN-3738 > URL: https://issues.apache.org/jira/browse/YARN-3738 > Project: Hadoop YARN > Issue Type: Sub-task > Components: capacityscheduler, resourcemanager >Reporter: Subru Krishnan >Assignee: Subru Krishnan > Attachments: YARN-3738-v2.patch, YARN-3738-v3.patch, > YARN-3738-v3.patch, YARN-3738-v4.patch, YARN-3738-v4.patch, YARN-3738.patch > > > YARN-3736 persists the current state of the Plan to the RMStateStore. This > JIRA covers recovery of the Plan, i.e. dynamic reservation queues with > associated apps as part Capacity Scheduler failover mechanism. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3738) Add support for recovery of reserved apps running under dynamic queues
[ https://issues.apache.org/jira/browse/YARN-3738?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Suresh updated YARN-3738: -- Summary: Add support for recovery of reserved apps running under dynamic queues (was: Add support for recovery of reserved apps (running under dynamic queues) to Capacity Scheduler) > Add support for recovery of reserved apps running under dynamic queues > -- > > Key: YARN-3738 > URL: https://issues.apache.org/jira/browse/YARN-3738 > Project: Hadoop YARN > Issue Type: Sub-task > Components: capacityscheduler, resourcemanager >Reporter: Subru Krishnan >Assignee: Subru Krishnan > Attachments: YARN-3738-v2.patch, YARN-3738-v3.patch, > YARN-3738-v3.patch, YARN-3738-v4.patch, YARN-3738-v4.patch, YARN-3738.patch > > > YARN-3736 persists the current state of the Plan to the RMStateStore. This > JIRA covers recovery of the Plan, i.e. dynamic reservation queues with > associated apps as part Capacity Scheduler failover mechanism. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (YARN-3737) Add support for recovery of reserved apps (running under dynamic queues) to Fair Scheduler
[ https://issues.apache.org/jira/browse/YARN-3737?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Suresh resolved YARN-3737. --- Resolution: Done Fix Version/s: 2.8.0 > Add support for recovery of reserved apps (running under dynamic queues) to > Fair Scheduler > -- > > Key: YARN-3737 > URL: https://issues.apache.org/jira/browse/YARN-3737 > Project: Hadoop YARN > Issue Type: Sub-task > Components: fairscheduler, resourcemanager >Reporter: Subru Krishnan >Assignee: Subru Krishnan > Fix For: 2.8.0 > > > YARN-3736 persists the current state of the Plan to the RMStateStore. This > JIRA covers recovery of the Plan, i.e. dynamic reservation queues with > associated apps as part Fair Scheduler failover mechanism. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3738) Add support for recovery of reserved apps running under dynamic queues
[ https://issues.apache.org/jira/browse/YARN-3738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14973052#comment-14973052 ] Arun Suresh commented on YARN-3738: --- Committed this to trunk and branch-2, Thanks for the patch [~subru] and for the review [~adhoot] !! > Add support for recovery of reserved apps running under dynamic queues > -- > > Key: YARN-3738 > URL: https://issues.apache.org/jira/browse/YARN-3738 > Project: Hadoop YARN > Issue Type: Sub-task > Components: capacityscheduler, resourcemanager >Reporter: Subru Krishnan >Assignee: Subru Krishnan > Fix For: 2.8.0 > > Attachments: YARN-3738-v2.patch, YARN-3738-v3.patch, > YARN-3738-v3.patch, YARN-3738-v4.patch, YARN-3738-v4.patch, YARN-3738.patch > > > YARN-3736 persists the current state of the Plan to the RMStateStore. This > JIRA covers recovery of the Plan, i.e. dynamic reservation queues with > associated apps as part Capacity Scheduler failover mechanism. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-4310) FairScheduler: Log skipping reservation messages at DEBUG level
[ https://issues.apache.org/jira/browse/YARN-4310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Suresh updated YARN-4310: -- Fix Version/s: 2.8.0 > FairScheduler: Log skipping reservation messages at DEBUG level > --- > > Key: YARN-4310 > URL: https://issues.apache.org/jira/browse/YARN-4310 > Project: Hadoop YARN > Issue Type: Improvement > Components: fairscheduler >Reporter: Arun Suresh >Assignee: Arun Suresh >Priority: Minor > Fix For: 2.8.0 > > Attachments: YARN-4310.1.patch, YARN-4310.2.patch > > > YARN-4270 introduced an additional log message that is currently at INFO > level : > {noformat} > 2015-10-28 11:13:21,692 INFO > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttemt: > Reservation Exceeds Allowed number of nodes: > app_id=application_1446045371769_0001 existingReservations=1 > totalAvailableNodes=4 reservableNodesRatio=0.05 > numAllowedReservations=1 > {noformat} > It has been observed that the log message can totally swamp the RM log file. > This needs to be reduced to DEBUG level. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (YARN-4310) Reduce log level for certain messages to prevent overrunning log file
Arun Suresh created YARN-4310: - Summary: Reduce log level for certain messages to prevent overrunning log file Key: YARN-4310 URL: https://issues.apache.org/jira/browse/YARN-4310 Project: Hadoop YARN Issue Type: Improvement Components: fairscheduler Reporter: Arun Suresh Assignee: Arun Suresh Priority: Minor YARN-4270 introduced an additional log message that is currently at INFO level : {noformat} 2015-10-28 11:13:21,692 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttemt: Reservation Exceeds Allowed number of nodes: app_id=application_1446045371769_0001 existingReservations=1 totalAvailableNodes=4 reservableNodesRatio=0.05 numAllowedReservations=1 {noformat} It has been observed that the log message can totally swamp the RM log file. This needs to be reduced to DEBUG level. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-4310) Reduce log level for certain messages to prevent overrunning log file
[ https://issues.apache.org/jira/browse/YARN-4310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Suresh updated YARN-4310: -- Attachment: YARN-4310.1.patch Attaching trivial patch > Reduce log level for certain messages to prevent overrunning log file > - > > Key: YARN-4310 > URL: https://issues.apache.org/jira/browse/YARN-4310 > Project: Hadoop YARN > Issue Type: Improvement > Components: fairscheduler >Reporter: Arun Suresh >Assignee: Arun Suresh >Priority: Minor > Attachments: YARN-4310.1.patch > > > YARN-4270 introduced an additional log message that is currently at INFO > level : > {noformat} > 2015-10-28 11:13:21,692 INFO > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttemt: > Reservation Exceeds Allowed number of nodes: > app_id=application_1446045371769_0001 existingReservations=1 > totalAvailableNodes=4 reservableNodesRatio=0.05 > numAllowedReservations=1 > {noformat} > It has been observed that the log message can totally swamp the RM log file. > This needs to be reduced to DEBUG level. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-4310) FairScheduler: Log skipping reservation messages at DEBUG level
[ https://issues.apache.org/jira/browse/YARN-4310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Suresh updated YARN-4310: -- Attachment: YARN-4310.2.patch [~kasha] thanks for the review... updating patch > FairScheduler: Log skipping reservation messages at DEBUG level > --- > > Key: YARN-4310 > URL: https://issues.apache.org/jira/browse/YARN-4310 > Project: Hadoop YARN > Issue Type: Improvement > Components: fairscheduler >Reporter: Arun Suresh >Assignee: Arun Suresh >Priority: Minor > Attachments: YARN-4310.1.patch, YARN-4310.2.patch > > > YARN-4270 introduced an additional log message that is currently at INFO > level : > {noformat} > 2015-10-28 11:13:21,692 INFO > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttemt: > Reservation Exceeds Allowed number of nodes: > app_id=application_1446045371769_0001 existingReservations=1 > totalAvailableNodes=4 reservableNodesRatio=0.05 > numAllowedReservations=1 > {noformat} > It has been observed that the log message can totally swamp the RM log file. > This needs to be reduced to DEBUG level. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
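The change itself is just demoting the per-heartbeat message from INFO to DEBUG. A self-contained sketch of the usual guarded-logging pattern (written against java.util.logging so it runs standalone; the actual YARN code uses Commons Logging with {{LOG.isDebugEnabled()}} / {{LOG.debug(...)}}):

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Sketch: demote a hot-path scheduler message from INFO to a guarded
// DEBUG-level (FINE in java.util.logging) log so it cannot flood the RM log.
public class LogLevelSketch {
    private static final Logger LOG =
            Logger.getLogger(LogLevelSketch.class.getName());

    static void logSkippedReservation(String appId, int existing, int allowed) {
        // The guard avoids building the message string at all when the
        // level is disabled, which matters on a per-node-heartbeat path.
        if (LOG.isLoggable(Level.FINE)) {
            LOG.fine("Reservation exceeds allowed number of nodes: app_id=" + appId
                   + " existingReservations=" + existing
                   + " numAllowedReservations=" + allowed);
        }
    }
}
```

With the default root level of INFO, the FINE message is neither formatted nor emitted, so the hot path costs only the level check.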
[jira] [Commented] (YARN-4184) Remove update reservation state api from state store as its not used by ReservationSystem
[ https://issues.apache.org/jira/browse/YARN-4184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15002881#comment-15002881 ] Arun Suresh commented on YARN-4184: --- Thanks for the patch [~seanpo03]. it looks fine.. +1 Will commit by EOD > Remove update reservation state api from state store as its not used by > ReservationSystem > - > > Key: YARN-4184 > URL: https://issues.apache.org/jira/browse/YARN-4184 > Project: Hadoop YARN > Issue Type: Sub-task > Components: capacityscheduler, fairscheduler, resourcemanager >Reporter: Anubhav Dhoot >Assignee: Sean Po > Attachments: YARN-4184.v1.patch > > > ReservationSystem uses remove/add for updates and thus update api in state > store is not needed -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (YARN-2885) LocalRM: distributed scheduling decisions for queueable containers
[ https://issues.apache.org/jira/browse/YARN-2885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Suresh reassigned YARN-2885: - Assignee: Arun Suresh > LocalRM: distributed scheduling decisions for queueable containers > -- > > Key: YARN-2885 > URL: https://issues.apache.org/jira/browse/YARN-2885 > Project: Hadoop YARN > Issue Type: Sub-task > Components: nodemanager, resourcemanager >Reporter: Konstantinos Karanasos >Assignee: Arun Suresh > > We propose to add a Local ResourceManager (LocalRM) to the NM in order to > support distributed scheduling decisions. > Architecturally we leverage the RMProxy, introduced in YARN-2884. > The LocalRM makes distributed decisions for queuable containers requests. > Guaranteed-start requests are still handled by the central RM. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2882) Introducing container types
[ https://issues.apache.org/jira/browse/YARN-2882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14990881#comment-14990881 ] Arun Suresh commented on YARN-2882: --- Thanks for the patch [~kkaranasos]. The patch looks mostly good. Few minor nits : # I feel that instead of adding another *newInstance* method in *ResourceRequest* class, maybe we replace this with some sort of builder pattern, e.g. something like: {noformat} ResourceRequest req = new ResourceRequestBuilder().setPriority(pri).setHostName(hostname).setContainerType(QUEUEABLE)...build(); {noformat} (I understand.. this might impact other parts of the code, but I believe it would make it more extensible in the future.) # in the *yarn_protos.proto* file, can we add *container_type* after the *node_label_expression* field (I feel newer fields should come later) Also, looks like the patch does not apply correctly anymore, can you please rebase ? > Introducing container types > --- > > Key: YARN-2882 > URL: https://issues.apache.org/jira/browse/YARN-2882 > Project: Hadoop YARN > Issue Type: Sub-task > Components: nodemanager, resourcemanager >Reporter: Konstantinos Karanasos >Assignee: Konstantinos Karanasos > Attachments: yarn-2882.patch > > > This JIRA introduces the notion of container types. > We propose two initial types of containers: guaranteed-start and queueable > containers. > Guaranteed-start are the existing containers, which are allocated by the > central RM and are instantaneously started, once allocated. > Queueable is a new type of container, which allows containers to be queued in > the NM, thus their execution may be arbitrarily delayed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
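The builder suggested in the review could look roughly like the sketch below. The class and setter names follow the snippet in the comment but are purely illustrative — this is not the actual YARN ResourceRequest API:

```java
// Sketch of a builder replacing a growing family of newInstance(...) overloads.
public class ResourceRequestSketch {
    private int priority;
    private String hostName;
    private String containerType; // e.g. "QUEUEABLE" or "GUARANTEED"

    public static class Builder {
        private final ResourceRequestSketch req = new ResourceRequestSketch();
        public Builder setPriority(int p)         { req.priority = p;      return this; }
        public Builder setHostName(String h)      { req.hostName = h;      return this; }
        public Builder setContainerType(String t) { req.containerType = t; return this; }
        public ResourceRequestSketch build()      { return req; }
    }

    public int getPriority()         { return priority; }
    public String getHostName()      { return hostName; }
    public String getContainerType() { return containerType; }
}
```

The extensibility argument: a later optional field (such as the new container type) adds one setter, instead of yet another positional newInstance overload that every caller must be migrated to.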
[jira] [Updated] (YARN-4270) Limit application resource reservation on nodes for non-node/rack specific requests
[ https://issues.apache.org/jira/browse/YARN-4270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Suresh updated YARN-4270: -- Attachment: YARN-4270.5.patch Thanks for the review [~kasha] !! Updating patch with your suggestions. Will file a follow up for delaying reservations until locality relaxation has maxed out. > Limit application resource reservation on nodes for non-node/rack specific > requests > --- > > Key: YARN-4270 > URL: https://issues.apache.org/jira/browse/YARN-4270 > Project: Hadoop YARN > Issue Type: Bug > Components: fairscheduler >Reporter: Arun Suresh >Assignee: Arun Suresh > Attachments: YARN-4270.1.patch, YARN-4270.2.patch, YARN-4270.3.patch, > YARN-4270.4.patch, YARN-4270.5.patch > > > It has been noticed that for off-switch requests, the FairScheduler reserves > resources on all nodes. This could lead to the entire cluster being > unavailable for all other applications. > Ideally, the reservations should be on a configurable number of nodes, > default 1. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
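The cap being added can be sketched as: for an off-switch request, an application may hold reservations on at most a configurable fraction of the available nodes, with a floor of one. The names follow the {{reservableNodesRatio}} / {{numAllowedReservations}} values quoted in the related YARN-4310 log message; the exact arithmetic here is an assumption for illustration, not the actual patch:

```java
// Sketch of a per-app reservation cap for off-switch requests (assumed
// arithmetic: ceil(ratio * nodes), minimum 1).
public class ReservationLimitSketch {
    static int numAllowedReservations(int totalAvailableNodes,
                                      double reservableNodesRatio) {
        return Math.max(1,
                (int) Math.ceil(totalAvailableNodes * reservableNodesRatio));
    }

    // Scheduler-side check: skip creating another reservation once the app
    // already holds the allowed number.
    static boolean mayReserve(int existingReservations, int totalAvailableNodes,
                              double reservableNodesRatio) {
        return existingReservations
             < numAllowedReservations(totalAvailableNodes, reservableNodesRatio);
    }
}
```

With totalAvailableNodes=4 and reservableNodesRatio=0.05 this allows a single reservation, matching the numAllowedReservations=1 in the YARN-4310 log line, instead of letting one app reserve on every node in the cluster.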
[jira] [Commented] (YARN-3985) Make ReservationSystem persist state using RMStateStore reservation APIs
[ https://issues.apache.org/jira/browse/YARN-3985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14964162#comment-14964162 ] Arun Suresh commented on YARN-3985: --- +1, pending jenkins > Make ReservationSystem persist state using RMStateStore reservation APIs > - > > Key: YARN-3985 > URL: https://issues.apache.org/jira/browse/YARN-3985 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Reporter: Anubhav Dhoot >Assignee: Anubhav Dhoot > Attachments: YARN-3985.001.patch, YARN-3985.002.patch, > YARN-3985.002.patch, YARN-3985.002.patch, YARN-3985.003.patch, > YARN-3985.004.patch > > > YARN-3736 adds the RMStateStore apis to store and load reservation state. > This jira adds the actual storing of state from ReservationSystem. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-4270) Limit application resource reservation on nodes for non-node/rack specific requests
[ https://issues.apache.org/jira/browse/YARN-4270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Suresh updated YARN-4270: -- Issue Type: Bug (was: Improvement) > Limit application resource reservation on nodes for non-node/rack specific > requests > --- > > Key: YARN-4270 > URL: https://issues.apache.org/jira/browse/YARN-4270 > Project: Hadoop YARN > Issue Type: Bug > Components: fairscheduler >Reporter: Arun Suresh >Assignee: Arun Suresh > Fix For: 2.8.0 > > Attachments: YARN-4270.1.patch, YARN-4270.2.patch, YARN-4270.3.patch, > YARN-4270.4.patch, YARN-4270.5.patch > > > It has been noticed that for off-switch requests, the FairScheduler reserves > resources on all nodes. This could lead to the entire cluster being > unavailable for all other applications. > Ideally, the reservations should be on a configurable number of nodes, > default 1. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-4270) Limit application resource reservation on nodes for non-node/rack specific requests
[ https://issues.apache.org/jira/browse/YARN-4270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Suresh updated YARN-4270: -- Issue Type: Improvement (was: Bug) > Limit application resource reservation on nodes for non-node/rack specific > requests > --- > > Key: YARN-4270 > URL: https://issues.apache.org/jira/browse/YARN-4270 > Project: Hadoop YARN > Issue Type: Improvement > Components: fairscheduler >Reporter: Arun Suresh >Assignee: Arun Suresh > Fix For: 2.8.0 > > Attachments: YARN-4270.1.patch, YARN-4270.2.patch, YARN-4270.3.patch, > YARN-4270.4.patch, YARN-4270.5.patch > > > It has been noticed that for off-switch requests, the FairScheduler reserves > resources on all nodes. This could lead to the entire cluster being > unavailable for all other applications. > Ideally, the reservations should be on a configurable number of nodes, > default 1. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-4270) Limit application resource reservation on nodes for non-node/rack specific requests
[ https://issues.apache.org/jira/browse/YARN-4270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Suresh updated YARN-4270: -- Fix Version/s: 2.8.0 > Limit application resource reservation on nodes for non-node/rack specific > requests > --- > > Key: YARN-4270 > URL: https://issues.apache.org/jira/browse/YARN-4270 > Project: Hadoop YARN > Issue Type: Bug > Components: fairscheduler >Reporter: Arun Suresh >Assignee: Arun Suresh > Fix For: 2.8.0 > > Attachments: YARN-4270.1.patch, YARN-4270.2.patch, YARN-4270.3.patch, > YARN-4270.4.patch, YARN-4270.5.patch > > > It has been noticed that for off-switch requests, the FairScheduler reserves > resources on all nodes. This could lead to the entire cluster being > unavailable for all other applications. > Ideally, the reservations should be on a configurable number of nodes, > default 1. -- This message was sent by Atlassian JIRA (v6.3.4#6332)