[jira] [Commented] (YARN-365) Each NM heartbeat should not generate an event for the Scheduler
[ https://issues.apache.org/jira/browse/YARN-365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13588818#comment-13588818 ]

Bikas Saha commented on YARN-365:
---------------------------------

Do we need to worry about overlap between the two lists, i.e. a newlyLaunchedContainer that has also completed by the time a slow RM handles the NM updates?

{code}
+  private synchronized void nodeUpdate(RMNode nm) {
     if (LOG.isDebugEnabled()) {
       LOG.debug("nodeUpdate: " + nm + " clusterResources: " + clusterResource);
     }
-
-    FiCaSchedulerNode node = getNode(nm.getNodeID());
+    FiCaSchedulerNode node = getNode(nm.getNodeID());
+    List<UpdatedContainerInfo> containerInfoList = nm.pullContainerUpdates();
+    List<ContainerStatus> newlyLaunchedContainers = new ArrayList<ContainerStatus>();
+    List<ContainerStatus> completedContainers = new ArrayList<ContainerStatus>();
+    for (UpdatedContainerInfo containerInfo : containerInfoList) {
+      newlyLaunchedContainers.addAll(containerInfo.getNewlyLaunchedContainers());
+      completedContainers.addAll(containerInfo.getCompletedContainers());
+    }
+
{code}

Note that this problem (if it is a problem) exists regardless of this change, because a container may start and complete within the NM heartbeat interval. However, the chances of hitting it are low before this change: the heartbeat interval is short, so the RM rarely sees a node update in which the same container both launches and completes. After this change, with a slow RM, this can easily happen, especially because we are simply concatenating both sub-lists.
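To make the overlap concern concrete, here is a minimal sketch in plain Java. It is not the YARN code: `HeartbeatUpdate` and `OverlapSketch` are hypothetical stand-ins for `UpdatedContainerInfo` and the scheduler, and container IDs are plain strings. It flattens several queued heartbeats the way the patch does and reports the IDs that end up in both lists.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for UpdatedContainerInfo: one entry per NM heartbeat.
final class HeartbeatUpdate {
  final List<String> launched;   // container IDs reported as newly launched
  final List<String> completed;  // container IDs reported as completed

  HeartbeatUpdate(List<String> launched, List<String> completed) {
    this.launched = launched;
    this.completed = completed;
  }
}

final class OverlapSketch {
  // Concatenate the per-heartbeat sub-lists, then return the container IDs
  // that appear in both flattened lists.
  static Set<String> overlap(List<HeartbeatUpdate> queued) {
    List<String> newlyLaunched = new ArrayList<String>();
    List<String> completed = new ArrayList<String>();
    for (HeartbeatUpdate update : queued) {
      newlyLaunched.addAll(update.launched);
      completed.addAll(update.completed);
    }
    Set<String> both = new LinkedHashSet<String>(newlyLaunched);
    both.retainAll(completed);
    return both;
  }

  public static void main(String[] args) {
    // Heartbeat 1 launches c1 and c2; heartbeat 2 completes c1.
    List<HeartbeatUpdate> queued = Arrays.asList(
        new HeartbeatUpdate(Arrays.asList("c1", "c2"), new ArrayList<String>()),
        new HeartbeatUpdate(new ArrayList<String>(), Arrays.asList("c1")));
    System.out.println(overlap(queued)); // prints [c1]
  }
}
```

With a single heartbeat per node update the overlap is usually empty; once several heartbeats are batched, a container such as `c1` can show up in both lists within the same update.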
Each NM heartbeat should not generate an event for the Scheduler

Key: YARN-365
URL: https://issues.apache.org/jira/browse/YARN-365
Project: Hadoop YARN
Issue Type: Sub-task
Components: resourcemanager, scheduler
Affects Versions: 0.23.5
Reporter: Siddharth Seth
Assignee: Xuan Gong
Fix For: 2.0.4-beta
Attachments: Prototype2.txt, Prototype3.txt, YARN-365.10.patch, YARN-365.1.patch, YARN-365.2.patch, YARN-365.3.patch, YARN-365.4.patch, YARN-365.5.patch, YARN-365.6.patch, YARN-365.7.patch, YARN-365.8.patch, YARN-365.9.patch

Follow up from YARN-275
https://issues.apache.org/jira/secure/attachment/12567075/Prototype.txt

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-365) Each NM heartbeat should not generate an event for the Scheduler
[ https://issues.apache.org/jira/browse/YARN-365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13588861#comment-13588861 ]

Xuan Gong commented on YARN-365:
--------------------------------

bq. Do we need to worry about there being overlap between the 2 lists. i.e. a newlyLaunchedContainer also got completed by the time the slow RM handled the NM updates?

Thanks for the comments. I think we are fine here. A newlyLaunchedContainer is handled by submitting a LAUNCHED event to RMContainerImpl, which then unregisters (removes) the container from the containerAllocationExpirer list. Handling the event does not actually launch the container; it just tells the RM that the container is in use right now.
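The bookkeeping described in this reply can be sketched as follows. This is a simplified, hypothetical model (`ExpirerSketch` and its methods are invented for illustration; the real RMContainerImpl and containerAllocationExpirer are more involved). It only shows that handling LAUNCHED amounts to removing the container from a tracking set, and that repeating the removal is harmless.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical, simplified model of allocation-expiry tracking.
final class ExpirerSketch {
  // Containers the RM has allocated but that no NM has reported as running.
  private final Set<String> awaitingLaunch = new HashSet<String>();

  void allocated(String containerId) {
    awaitingLaunch.add(containerId);
  }

  // LAUNCHED only stops expiry tracking; it does not start anything.
  // Returns true if the container was still being tracked.
  boolean launched(String containerId) {
    return awaitingLaunch.remove(containerId);
  }

  boolean isTracked(String containerId) {
    return awaitingLaunch.contains(containerId);
  }

  public static void main(String[] args) {
    ExpirerSketch expirer = new ExpirerSketch();
    expirer.allocated("c1");
    System.out.println(expirer.launched("c1")); // true: tracking stopped
    System.out.println(expirer.launched("c1")); // false: repeating it is a no-op
  }
}
```

Whether the later completion event is also handled correctly for such a container is a separate question that this sketch does not model.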
[jira] [Commented] (YARN-365) Each NM heartbeat should not generate an event for the Scheduler
[ https://issues.apache.org/jira/browse/YARN-365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13587000#comment-13587000 ]

Hudson commented on YARN-365:
-----------------------------

Integrated in Hadoop-Yarn-trunk #139 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/139/])
YARN-365. Change NM heartbeat handling to not generate a scheduler event on each heartbeat. (Contributed by Xuan Gong) (Revision 1450007)

Result = SUCCESS
sseth : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1450007
Files :
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmnode/RMNode.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmnode/RMNodeImpl.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmnode/UpdatedContainerInfo.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/event/NodeUpdateSchedulerEvent.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fifo/FifoScheduler.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/MockNodes.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestFifoScheduler.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestRMNodeTransitions.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestFairScheduler.java
[jira] [Commented] (YARN-365) Each NM heartbeat should not generate an event for the Scheduler
[ https://issues.apache.org/jira/browse/YARN-365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13587081#comment-13587081 ]

Hudson commented on YARN-365:
-----------------------------

Integrated in Hadoop-Hdfs-trunk #1328 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1328/])
YARN-365. Change NM heartbeat handling to not generate a scheduler event on each heartbeat. (Contributed by Xuan Gong) (Revision 1450007)

Result = FAILURE
sseth : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1450007
[jira] [Commented] (YARN-365) Each NM heartbeat should not generate an event for the Scheduler
[ https://issues.apache.org/jira/browse/YARN-365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13587139#comment-13587139 ]

Hudson commented on YARN-365:
-----------------------------

Integrated in Hadoop-Mapreduce-trunk #1356 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1356/])
YARN-365. Change NM heartbeat handling to not generate a scheduler event on each heartbeat. (Contributed by Xuan Gong) (Revision 1450007)

Result = SUCCESS
sseth : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1450007
[jira] [Commented] (YARN-365) Each NM heartbeat should not generate an event for the Scheduler
[ https://issues.apache.org/jira/browse/YARN-365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13586278#comment-13586278 ]

Siddharth Seth commented on YARN-365:
-------------------------------------

Almost there. A couple more fixes are needed though.
- nextHeartBeat is accessed by multiple threads (the scheduler and the main dispatcher). It should be volatile.
- There's a race in handling nextHeartBeat. In StatusUpdateWhenHealthyTransition, nextHeartBeat should be set to false before generating the event for the scheduler. Otherwise, there's a race between the scheduler event being processed (which sets the value to true) and the value being set to false.
- Are the updates to MockNodes.java required? They don't seem to be used anywhere.
- testStatusChanged can be modified to check the queue size before the EXPIRE event, i.e. to validate that events are being queued correctly.
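The first two review points can be illustrated with a small sketch. Only the field name `nextHeartBeat` comes from the review; the class `HeartbeatFlagSketch`, its methods, and the use of a `Runnable` to stand in for dispatching a scheduler event are assumptions made for illustration, not the actual RMNodeImpl code.

```java
// Hypothetical, simplified model of the heartbeat flag.
final class HeartbeatFlagSketch {
  // Written by both the dispatcher thread and the scheduler thread, so it is
  // declared volatile to make each write visible to the other thread.
  private volatile boolean nextHeartBeat = true;

  // Dispatcher thread: called once per NM heartbeat. Returns true if a
  // scheduler event was generated for this heartbeat.
  boolean onHeartbeat(Runnable sendSchedulerEvent) {
    if (!nextHeartBeat) {
      return false; // scheduler has not consumed the previous event yet
    }
    nextHeartBeat = false;     // clear the flag BEFORE handing off the event
    sendSchedulerEvent.run();  // the scheduler may set it back to true at once
    return true;
  }

  // Scheduler thread: called after the queued updates have been consumed.
  void updatesConsumed() {
    nextHeartBeat = true;
  }

  public static void main(String[] args) {
    final HeartbeatFlagSketch node = new HeartbeatFlagSketch();
    // Simulate a scheduler that handles the event immediately.
    Runnable fastScheduler = new Runnable() {
      @Override public void run() { node.updatesConsumed(); }
    };
    System.out.println(node.onHeartbeat(fastScheduler)); // true
    System.out.println(node.onHeartbeat(fastScheduler)); // true: flag not clobbered
  }
}
```

If the two statements in `onHeartbeat` were swapped, a fast scheduler would set the flag to true first and the dispatcher would then overwrite it with false, so no later heartbeat would ever generate an event. That is the race the second bullet describes.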
[jira] [Commented] (YARN-365) Each NM heartbeat should not generate an event for the Scheduler
[ https://issues.apache.org/jira/browse/YARN-365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13586571#comment-13586571 ]

Hadoop QA commented on YARN-365:
--------------------------------

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12570885/YARN-365.9.patch
against trunk revision .

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 4 new or modified test files.
{color:red}-1 one of tests included doesn't have a timeout.{color}
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages.
{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.
{color:green}+1 contrib tests{color}. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-YARN-Build/429//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/429//console

This message is automatically generated.
[jira] [Commented] (YARN-365) Each NM heartbeat should not generate an event for the Scheduler
[ https://issues.apache.org/jira/browse/YARN-365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13586601#comment-13586601 ]

Siddharth Seth commented on YARN-365:
-------------------------------------

+1. Committing this, and ignoring the Jenkins -1 since the patch adds a timeout to each of the tests it has modified. Thanks Xuan!
[jira] [Commented] (YARN-365) Each NM heartbeat should not generate an event for the Scheduler
[ https://issues.apache.org/jira/browse/YARN-365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13586603#comment-13586603 ]

Siddharth Seth commented on YARN-365:
-------------------------------------

One minor refactor, for which I'm uploading a patch. Renaming getContainerUpdates to pullContainerUpdates to be consistent with similar interfaces elsewhere.
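As a rough illustration of what a "pull" style accessor typically implies, the sketch below drains a queue on each call. This is an assumption based on the method name, not a copy of the committed code: `UpdateQueueSketch` is hypothetical, and `String` stands in for `UpdatedContainerInfo`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical stand-in for the node-side queue of heartbeat updates.
final class UpdateQueueSketch {
  private final ConcurrentLinkedQueue<String> queue =
      new ConcurrentLinkedQueue<String>();

  void add(String update) {
    queue.add(update);
  }

  // "pull" rather than "get": the call removes what it returns, so each
  // queued update is handed to the caller exactly once.
  List<String> pullContainerUpdates() {
    List<String> drained = new ArrayList<String>();
    String update;
    while ((update = queue.poll()) != null) {
      drained.add(update);
    }
    return drained;
  }

  public static void main(String[] args) {
    UpdateQueueSketch node = new UpdateQueueSketch();
    node.add("heartbeat-1");
    node.add("heartbeat-2");
    System.out.println(node.pullContainerUpdates()); // [heartbeat-1, heartbeat-2]
    System.out.println(node.pullContainerUpdates()); // []
  }
}
```

A "get" prefix would suggest a side-effect-free read, which is presumably why a name that signals the draining behavior reads better here.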
[jira] [Commented] (YARN-365) Each NM heartbeat should not generate an event for the Scheduler
[ https://issues.apache.org/jira/browse/YARN-365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13586635#comment-13586635 ]

Hadoop QA commented on YARN-365:
--------------------------------

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12570908/YARN-365.10.patch
against trunk revision .

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 4 new or modified test files.
{color:red}-1 one of tests included doesn't have a timeout.{color}
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages.
{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.
{color:green}+1 contrib tests{color}. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-YARN-Build/430//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/430//console

This message is automatically generated.
[jira] [Commented] (YARN-365) Each NM heartbeat should not generate an event for the Scheduler
[ https://issues.apache.org/jira/browse/YARN-365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13586637#comment-13586637 ]

Hadoop QA commented on YARN-365:
--------------------------------

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12570908/YARN-365.10.patch
against trunk revision .

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 4 new or modified test files.
{color:red}-1 one of tests included doesn't have a timeout.{color}
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages.
{color:red}-1 eclipse:eclipse{color}. The patch failed to build with eclipse:eclipse.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.
{color:green}+1 contrib tests{color}. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-YARN-Build/431//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/431//console

This message is automatically generated.
[jira] [Commented] (YARN-365) Each NM heartbeat should not generate an event for the Scheduler
[ https://issues.apache.org/jira/browse/YARN-365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13586650#comment-13586650 ]

Hadoop QA commented on YARN-365:
--------------------------------

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12570908/YARN-365.10.patch
against trunk revision .

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 4 new or modified test files.
{color:red}-1 one of tests included doesn't have a timeout.{color}
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages.
{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.
{color:green}+1 contrib tests{color}. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-YARN-Build/432//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/432//console

This message is automatically generated.
[jira] [Commented] (YARN-365) Each NM heartbeat should not generate an event for the Scheduler
[ https://issues.apache.org/jira/browse/YARN-365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13586700#comment-13586700 ]

Hudson commented on YARN-365:
-----------------------------

Integrated in Hadoop-trunk-Commit #3384 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/3384/])
YARN-365. Change NM heartbeat handling to not generate a scheduler event on each heartbeat. (Contributed by Xuan Gong) (Revision 1450007)

Result = SUCCESS
sseth : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1450007
[jira] [Commented] (YARN-365) Each NM heartbeat should not generate an event for the Scheduler
[ https://issues.apache.org/jira/browse/YARN-365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13584896#comment-13584896 ]

Hadoop QA commented on YARN-365:

-1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12570570/YARN-365.8.patch against trunk revision .

+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 4 new or modified test files.
-1 one of tests included doesn't have a timeout.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 eclipse:eclipse. The patch built with eclipse:eclipse.
+1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
+1 core tests. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.
+1 contrib tests. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-YARN-Build/422//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/422//console

This message is automatically generated.
[jira] [Commented] (YARN-365) Each NM heartbeat should not generate an event for the Scheduler
[ https://issues.apache.org/jira/browse/YARN-365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13584906#comment-13584906 ]

Xuan Gong commented on YARN-365:

The one test file that does not include a timeout is MockNodes.java, which is under the test directory.
[jira] [Commented] (YARN-365) Each NM heartbeat should not generate an event for the Scheduler
[ https://issues.apache.org/jira/browse/YARN-365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13581701#comment-13581701 ]

Siddharth Seth commented on YARN-365:

Xuan, thanks for updating the patch. Comments on the latest patch:

There are some formatting issues: lines exceeding the 80-column width limit, spaces after commas, etc. Also, there are some formatting changes to code that is unrelated to the patch, which should be avoided.

- I don't think the RMNode internal counter for the number of queued events should be exposed. In fact, it can be implemented as a boolean instead of an integer for now, which gets reset whenever the scheduler tries fetching the list of container updates. Additional interfaces can be introduced when this behaviour is changed in the future.
- The nodeUpdateQueue should be cleared early, before sending out the NodeRemovedEvent. This applies to StatusUpdateWhenHealthyTransition, DeactivateNodeTransition and ReconnectNodeTransition.
- getContainerInfoList can be renamed to getContainerUpdates.
- In the unit tests, TestRMNodeTransitions.setup() can be simplified; I don't think the 'first' flag is required. Also, testExpiredContainer and testStatusChange need to be updated after the latest change to the patch.
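The boolean suggestion in the comment above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code from any attached patch: HeartbeatGate, notifyOnNextHeartbeat and schedulerEventsSent are hypothetical names, container updates are reduced to strings, and a counter stands in for dispatching a real scheduler event.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of the flag idea; not the real RMNodeImpl. */
final class HeartbeatGate {
  private final List<String> pendingUpdates = new ArrayList<String>();
  // true means the next heartbeat should notify the scheduler.
  private boolean notifyOnNextHeartbeat = true;
  // Stand-in for events dispatched to the scheduler (assumption).
  private int schedulerEventsSent = 0;

  /** Heartbeat path: always store the update, notify at most once. */
  synchronized void onHeartbeat(String update) {
    pendingUpdates.add(update);
    if (notifyOnNextHeartbeat) {
      notifyOnNextHeartbeat = false;
      schedulerEventsSent++;
    }
  }

  /** Scheduler path: fetch everything and reset the flag. */
  synchronized List<String> pullContainerUpdates() {
    List<String> drained = new ArrayList<String>(pendingUpdates);
    pendingUpdates.clear();
    notifyOnNextHeartbeat = true;
    return drained;
  }

  synchronized int schedulerEventsSent() {
    return schedulerEventsSent;
  }
}

final class HeartbeatGateDemo {
  public static void main(String[] args) {
    HeartbeatGate node = new HeartbeatGate();
    node.onHeartbeat("launched c1");
    node.onHeartbeat("launched c2");
    node.onHeartbeat("completed c1");
    System.out.println(node.schedulerEventsSent());         // 1
    System.out.println(node.pullContainerUpdates().size()); // 3
    node.onHeartbeat("completed c2");
    System.out.println(node.schedulerEventsSent());         // 2
  }
}
```

With a flag rather than a counter, nothing about the queue depth leaks out of the node: the scheduler only learns that there is something to pull, and the act of pulling re-arms the notification.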
[jira] [Commented] (YARN-365) Each NM heartbeat should not generate an event for the Scheduler
[ https://issues.apache.org/jira/browse/YARN-365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13576208#comment-13576208 ]

Siddharth Seth commented on YARN-365:

Xuan, I took a quick look at the patch. I'm not sure why the NodeStatusUpdate event needs to be generated in all the additional cases. It may be sufficient to just drop the stored node updates, or even let them be processed by the event which is already in the queue.

Also, within the schedulers, the events can be aggregated into a single list and processed, instead of being processed per heartbeat.
[jira] [Commented] (YARN-365) Each NM heartbeat should not generate an event for the Scheduler
[ https://issues.apache.org/jira/browse/YARN-365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13573592#comment-13573592 ]

Thomas Graves commented on YARN-365:

Sorry Sid. I missed your comments and a few important points yesterday upon quick review.

By aggregating I meant the information in the heartbeat being aggregated with all previous heartbeats for that single node and then handled all at once, in a single pass, by the scheduler before it tries to do any allocations. Really it's the same as your comment (which I missed yesterday) that the "scheduler should really be pulling everything available in the node being processed". I was originally thinking of something along the lines of it having a single list each for completed and launched containers that it would just add to, rather than having a queue of the individual completed and launched lists (one per heartbeat). But as long as the scheduler handles all the updates in the queue before it tries to schedule, you get the same effect.

I'll review the current patch in more detail. A few comments on the current patch:

- We don't need to add an update to the queue if there were no changes.
- I don't think the current patch is handling all the updates in a single scheduler pass.
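The two points in the comment above (skip heartbeats that carry no changes, and merge everything queued for a node before scheduling) can be sketched as below. This is only an illustration under assumptions: HeartbeatReport and UpdateAggregator are hypothetical names, container statuses are reduced to string ids, and the real patch may structure this differently.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical per-heartbeat report: ids of launched and completed containers. */
final class HeartbeatReport {
  final List<String> launched;
  final List<String> completed;

  HeartbeatReport(List<String> launched, List<String> completed) {
    this.launched = launched;
    this.completed = completed;
  }

  boolean isEmpty() {
    return launched.isEmpty() && completed.isEmpty();
  }
}

/** Hypothetical per-node buffer; not a class from the patch. */
final class UpdateAggregator {
  private final List<HeartbeatReport> queue = new ArrayList<HeartbeatReport>();

  /** Queues a report only if it carries a change; returns whether it was queued. */
  synchronized boolean offer(HeartbeatReport report) {
    if (report.isEmpty()) {
      return false;
    }
    queue.add(report);
    return true;
  }

  /** Merges every queued report into one, in arrival order, and empties the queue. */
  synchronized HeartbeatReport drainMerged() {
    List<String> launched = new ArrayList<String>();
    List<String> completed = new ArrayList<String>();
    for (HeartbeatReport report : queue) {
      launched.addAll(report.launched);
      completed.addAll(report.completed);
    }
    queue.clear();
    return new HeartbeatReport(launched, completed);
  }
}
```

A scheduler pass would call drainMerged() once per node before attempting any allocation, so it sees one launched list and one completed list regardless of how many heartbeats arrived in between.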
[jira] [Commented] (YARN-365) Each NM heartbeat should not generate an event for the Scheduler
[ https://issues.apache.org/jira/browse/YARN-365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13573727#comment-13573727 ]

Xuan Gong commented on YARN-365:

Thanks for the comment. I will add the aggregation part in the next patch.
[jira] [Commented] (YARN-365) Each NM heartbeat should not generate an event for the Scheduler
[ https://issues.apache.org/jira/browse/YARN-365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13573244#comment-13573244 ]

Siddharth Seth commented on YARN-365:

This isn't very different from configuring all nodes to have a higher heartbeat interval. With a high heartbeat interval, the NM would send a batch of updates over to the RM, and this heartbeat would trigger a scheduling pass.

This change de-links RM scheduling passes from NM heartbeats. The NM can continue to provide node updates at a smaller interval, and the RM handles these, along with a scheduling pass, as and when it chooses to. In this particular case, the scheduler queue ends up with a single scheduling event per node, but will attempt a scheduling run only on the next heartbeat from that node. At a later point, the scheduling could be changed to be triggered by the arrival of a new application, or to just run in a tight loop.

If the scheduler cannot keep up, it ends up scheduling as fast as it can, without node heartbeats affecting the queue size. Also, completed container information from heartbeats is processed earlier (instead of waiting for the event in the queue to be processed), making each scheduler pass more efficient.

bq. I can see cases where the all at once is actually worse as it will spend more time on a single heartbeat and potentially not get to other things in the queue like apps added as fast.

The event should not be delayed more than the time required to complete one scheduling pass across all nodes. I don't think this will be much better in the case of a growing scheduler queue.

bq. The only way I can see this being beneficial is if we can aggregate the heartbeats and have the scheduler process less.

Do you mean somehow aggregating heartbeats across nodes? This approach does aggregate heartbeats for a single node.
[jira] [Commented] (YARN-365) Each NM heartbeat should not generate an event for the Scheduler
[ https://issues.apache.org/jira/browse/YARN-365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13572249#comment-13572249 ]

Siddharth Seth commented on YARN-365:

Xuan, I took a look at the patch. Some comments.

The scheduler should really be pulling everything available in the node being processed. Pulling only a single element doesn't change things too much from what they are at the moment. The other schedulers, i.e. the FifoScheduler and FairScheduler, will also need to be updated, since the heartbeat path is common for all of them. Also, some thought needs to be given to handling of cases where the node may have gone unhealthy etc.

Digging into the patch:

- I don't think RMNode should expose its internal data structure via {{getNodeUpdateQueue}}. Instead, it should expose a method that gives back a List of ContainerUpdates.
- Do we need an explicit setNextHeartBeat? Instead, the call to get container updates could be used for now.
- NodeUpdateSchedulerEvent should be changed to remove the container information, instead of sending nulls.
- Similarly for nodeUpdate in the CapacityScheduler.
- Rename UpdateContainerInfo to UpdatedContainerInfo.

The code does have some formatting issues. Please take a look at http://wiki.apache.org/hadoop/HowToContribute for code formatting guidelines and other useful info. Also, could you please upload another doc with the latest approach, to stay in sync with the patch.

Thanks!
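The encapsulation point in the review above (hand back a List instead of exposing the queue, and let the scheduler pull everything available for the node) might look roughly like the sketch below. It is written under assumptions: ContainerUpdate and NodeUpdateBuffer are hypothetical stand-ins, not RMNode or UpdatedContainerInfo, and the method name simply mirrors the pullContainerUpdates call that appears in a later review comment in this thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentLinkedQueue;

/** Hypothetical stand-in for one heartbeat's container report. */
final class ContainerUpdate {
  final List<String> launched;
  final List<String> completed;

  ContainerUpdate(List<String> launched, List<String> completed) {
    this.launched = launched;
    this.completed = completed;
  }
}

/** Hypothetical stand-in for the node-side state; not the real RMNodeImpl. */
final class NodeUpdateBuffer {
  // The queue stays private: callers never see the data structure itself.
  private final ConcurrentLinkedQueue<ContainerUpdate> queue =
      new ConcurrentLinkedQueue<ContainerUpdate>();

  /** Called on the heartbeat path. */
  void onHeartbeat(ContainerUpdate update) {
    queue.add(update);
  }

  /** Drains everything queued so far and returns it as a plain List. */
  List<ContainerUpdate> pullContainerUpdates() {
    List<ContainerUpdate> drained = new ArrayList<ContainerUpdate>();
    ContainerUpdate next;
    while ((next = queue.poll()) != null) {
      drained.add(next);
    }
    return drained;
  }
}
```

Because the caller receives a fresh list, the node is free to change its internal storage later without touching any scheduler, which is presumably why the review asks for a method rather than a getter on the queue.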