[jira] [Commented] (YARN-3990) AsyncDispatcher may overloaded with RMAppNodeUpdateEvent when Node is connected/disconnected
[ https://issues.apache.org/jira/browse/YARN-3990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14650344#comment-14650344 ]

Hudson commented on YARN-3990:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk #2201 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/2201/])
YARN-3990. AsyncDispatcher may overloaded with RMAppNodeUpdateEvent when Node is connected/disconnected. Contributed by Bibin A Chundatt (jlowe: rev 32e490b6c035487e99df30ce80366446fe09bd6c)
* hadoop-yarn-project/CHANGES.txt
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/NodesListManager.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/TestNodesListManager.java

AsyncDispatcher may overloaded with RMAppNodeUpdateEvent when Node is connected/disconnected
Key: YARN-3990
URL: https://issues.apache.org/jira/browse/YARN-3990
Project: Hadoop YARN
Issue Type: Bug
Components: resourcemanager
Reporter: Rohith Sharma K S
Assignee: Bibin A Chundatt
Priority: Critical
Fix For: 2.7.2
Attachments: 0001-YARN-3990.patch, 0002-YARN-3990.patch, 0003-YARN-3990.patch

Whenever a node is added or removed, NodesListManager sends an RMAppNodeUpdateEvent to all the applications that are in the RMContext. But for finished/killed/failed applications it is not required to send these events.
An additional check for whether the app is finished/killed/failed would minimize the unnecessary events:
{code}
public void handle(NodesListManagerEvent event) {
  RMNode eventNode = event.getNode();
  switch (event.getType()) {
  case NODE_UNUSABLE:
    LOG.debug(eventNode + " reported unusable");
    unusableRMNodesConcurrentSet.add(eventNode);
    for (RMApp app : rmContext.getRMApps().values()) {
      this.rmContext
          .getDispatcher()
          .getEventHandler()
          .handle(
              new RMAppNodeUpdateEvent(app.getApplicationId(), eventNode,
                  RMAppNodeUpdateType.NODE_UNUSABLE));
    }
    break;
  case NODE_USABLE:
    if (unusableRMNodesConcurrentSet.contains(eventNode)) {
      LOG.debug(eventNode + " reported usable");
      unusableRMNodesConcurrentSet.remove(eventNode);
    }
    for (RMApp app : rmContext.getRMApps().values()) {
      this.rmContext
          .getDispatcher()
          .getEventHandler()
          .handle(
              new RMAppNodeUpdateEvent(app.getApplicationId(), eventNode,
                  RMAppNodeUpdateType.NODE_USABLE));
    }
    break;
  default:
    LOG.error("Ignoring invalid eventtype " + event.getType());
  }
}
{code}

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
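The guard proposed in the description can be sketched as a terminal-state check applied before dispatching each RMAppNodeUpdateEvent. Below is a minimal, self-contained illustration; the class name `NodeUpdateFilter`, the method `shouldForward`, and the simplified `RMAppState` enum are hypothetical stand-ins, not the actual patch:

```java
import java.util.EnumSet;
import java.util.Set;

/**
 * Sketch of the proposed fix: skip node-update events for applications
 * that are already in a terminal state. All names here are illustrative.
 */
public class NodeUpdateFilter {

  // Simplified stand-in for the RMAppState enum in the ResourceManager.
  enum RMAppState { NEW, SUBMITTED, RUNNING, FINISHED, FAILED, KILLED }

  private static final Set<RMAppState> TERMINAL =
      EnumSet.of(RMAppState.FINISHED, RMAppState.FAILED, RMAppState.KILLED);

  /** Forward an RMAppNodeUpdateEvent only for applications still live. */
  static boolean shouldForward(RMAppState state) {
    return !TERMINAL.contains(state);
  }

  public static void main(String[] args) {
    System.out.println(shouldForward(RMAppState.RUNNING));   // true
    System.out.println(shouldForward(RMAppState.FINISHED));  // false
  }
}
```

With such a check inside the loop over `rmContext.getRMApps().values()`, finished/killed/failed applications no longer flood the AsyncDispatcher queue on every node transition.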
[jira] [Commented] (YARN-3990) AsyncDispatcher may overloaded with RMAppNodeUpdateEvent when Node is connected/disconnected
[ https://issues.apache.org/jira/browse/YARN-3990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14650381#comment-14650381 ]

Hudson commented on YARN-3990:
--

SUCCESS: Integrated in Hadoop-Mapreduce-trunk-Java8 #271 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/271/])
YARN-3990. AsyncDispatcher may overloaded with RMAppNodeUpdateEvent when Node is connected/disconnected. Contributed by Bibin A Chundatt (jlowe: rev 32e490b6c035487e99df30ce80366446fe09bd6c)

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3990) AsyncDispatcher may overloaded with RMAppNodeUpdateEvent when Node is connected/disconnected
[ https://issues.apache.org/jira/browse/YARN-3990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14650354#comment-14650354 ]

Hudson commented on YARN-3990:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #263 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/263/])
YARN-3990. AsyncDispatcher may overloaded with RMAppNodeUpdateEvent when Node is connected/disconnected. Contributed by Bibin A Chundatt (jlowe: rev 32e490b6c035487e99df30ce80366446fe09bd6c)

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3814) REST API implementation for getting raw entities in TimelineReader
[ https://issues.apache.org/jira/browse/YARN-3814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Varun Saxena updated YARN-3814:
---
Attachment: YARN-3814-YARN-2928.03.patch

REST API implementation for getting raw entities in TimelineReader
Key: YARN-3814
URL: https://issues.apache.org/jira/browse/YARN-3814
Project: Hadoop YARN
Issue Type: Sub-task
Components: timelineserver
Affects Versions: YARN-2928
Reporter: Varun Saxena
Assignee: Varun Saxena
Attachments: YARN-3814-YARN-2928.01.patch, YARN-3814-YARN-2928.02.patch, YARN-3814-YARN-2928.03.patch, YARN-3814.reference.patch

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3990) AsyncDispatcher may overloaded with RMAppNodeUpdateEvent when Node is connected/disconnected
[ https://issues.apache.org/jira/browse/YARN-3990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14650272#comment-14650272 ]

Hudson commented on YARN-3990:
--

FAILURE: Integrated in Hadoop-Yarn-trunk #1004 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/1004/])
YARN-3990. AsyncDispatcher may overloaded with RMAppNodeUpdateEvent when Node is connected/disconnected. Contributed by Bibin A Chundatt (jlowe: rev 32e490b6c035487e99df30ce80366446fe09bd6c)

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3990) AsyncDispatcher may overloaded with RMAppNodeUpdateEvent when Node is connected/disconnected
[ https://issues.apache.org/jira/browse/YARN-3990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14650325#comment-14650325 ]

Hudson commented on YARN-3990:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk #2220 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2220/])
YARN-3990. AsyncDispatcher may overloaded with RMAppNodeUpdateEvent when Node is connected/disconnected. Contributed by Bibin A Chundatt (jlowe: rev 32e490b6c035487e99df30ce80366446fe09bd6c)

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2923) Support configuration based NodeLabelsProvider Service in Distributed Node Label Configuration Setup
[ https://issues.apache.org/jira/browse/YARN-2923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14650284#comment-14650284 ]

Naganarasimha G R commented on YARN-2923:
--

Thanks [~leftnoteasy] for the feedback, but I had a few things to discuss.

bq. 1) All script provider related configurations/logic should be removed.

Will take care of it. Earlier I thought at least the white-list entries could stay, but that's not a problem, I will correct it.

{quote} PreviousNodeLabels will be reset every time if we do a fetch. (To avoid handle same node labels as much as possible) Don't do check if new fetched node label is as same as previousNodeLabels. (Also, avoid handle same node label) {quote}

Maybe I didn't get your approach completely here, but based on earlier discussions in YARN-2495 we had finalized that we will send the node labels only when there is a change, because:
# It avoids unnecessary network traffic, which is amplified in a large cluster with a lot of NMs.
# Otherwise the RM either needs to validate whether the labels are modified or blindly update the NodeLabelsStore. The former is not good because the RM will be overloaded by this operation, and the latter causes unnecessary store file updates. With heartbeats coming in every second on a large cluster, the magnitude of such updates is too high and all of them are unwanted.

Based on the above points, I feel it is better to check and update the RM only when there is a change in the NM's node labels.

bq. Don't reset node label if fetched node label is incorrect. (This should be a part of error handling, we should treat it's a error to be avoided instead of force reset it)

IIUC this was the conclusion we had in YARN-2495 in the following [comment|https://issues.apache.org/jira/browse/YARN-2495?focusedCommentId=14358109&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14358109].
This I feel is correct, because consider a case where a label is based on the Java version (or a lib version): the Java version is upgraded on a given NM, but the new Java version has not yet been added to the centralized labels. The NM will then fail to update its node labels, and if we do not reset them, the RM will keep the NM's node label as of the previous Java version, so a client app run on this node might fail because of it.

bq. A little cosmetic suggestion.

Hmm, I agree on this, but as an approach I felt we could have an inner class like {{NMDistributedNodeLabelsHandler}} that maintains the state of previous labels and takes care of initialization and of all methods and logging related to node labels. I have already added some methods, but we need to maintain some label state in the heartbeat flow, so introducing a class like this and pushing all label-related methods into it would be much better and more readable. Thoughts?

bq. I suggest to keep provider within nodemanager

OK.

Support configuration based NodeLabelsProvider Service in Distributed Node Label Configuration Setup
Key: YARN-2923
URL: https://issues.apache.org/jira/browse/YARN-2923
Project: Hadoop YARN
Issue Type: Sub-task
Components: nodemanager
Reporter: Naganarasimha G R
Assignee: Naganarasimha G R
Fix For: 2.8.0
Attachments: YARN-2923.20141204-1.patch, YARN-2923.20141210-1.patch, YARN-2923.20150328-1.patch, YARN-2923.20150404-1.patch, YARN-2923.20150517-1.patch

As part of Distributed Node Labels configuration we need to support node labels being configured in yarn-site.xml. And on modification of the node labels configuration in yarn-site.xml, the NM should be able to get the modified node labels from this NodeLabelsProvider service without an NM restart.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
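The change-detection approach argued for above (report labels only when they differ from the last heartbeat) can be sketched as follows; the class `NodeLabelsChangeDetector` and method `labelsToReport` are hypothetical illustrations, not the YARN-2923 patch:

```java
import java.util.HashSet;
import java.util.Set;

/**
 * Sketch of NM-side change detection for distributed node labels:
 * the heartbeat carries labels only when they differ from what was
 * last reported, avoiding needless RM validation and store writes.
 * Names are illustrative only.
 */
public class NodeLabelsChangeDetector {

  private Set<String> lastReported = new HashSet<>();

  /**
   * Returns the labels to piggyback on the next heartbeat,
   * or null when nothing changed (no label update is sent).
   */
  public synchronized Set<String> labelsToReport(Set<String> fetched) {
    if (fetched.equals(lastReported)) {
      return null; // unchanged since last heartbeat: skip the update
    }
    lastReported = new HashSet<>(fetched);
    return new HashSet<>(lastReported);
  }
}
```

On a large cluster with one-second heartbeats, the `null` fast path is what keeps identical label sets from turning into repeated RM-side validation and NodeLabelsStore writes.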
[jira] [Commented] (YARN-3990) AsyncDispatcher may overloaded with RMAppNodeUpdateEvent when Node is connected/disconnected
[ https://issues.apache.org/jira/browse/YARN-3990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14650285#comment-14650285 ]

Hudson commented on YARN-3990:
--

SUCCESS: Integrated in Hadoop-Yarn-trunk-Java8 #274 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/274/])
YARN-3990. AsyncDispatcher may overloaded with RMAppNodeUpdateEvent when Node is connected/disconnected. Contributed by Bibin A Chundatt (jlowe: rev 32e490b6c035487e99df30ce80366446fe09bd6c)

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3814) REST API implementation for getting raw entities in TimelineReader
[ https://issues.apache.org/jira/browse/YARN-3814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14650295#comment-14650295 ]

Hadoop QA commented on YARN-3814:
--

| (x) *{color:red}-1 overall{color}* |

|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch | 18m 10s | Findbugs (version ) appears to be broken on YARN-2928. |
| {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. |
| {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. |
| {color:green}+1{color} | javac | 8m 3s | There were no new javac warning messages. |
| {color:green}+1{color} | javadoc | 10m 4s | There were no new javadoc warning messages. |
| {color:green}+1{color} | release audit | 0m 23s | The applied patch does not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle | 0m 16s | There were no new checkstyle issues. |
| {color:green}+1{color} | whitespace | 0m 15s | The patch has no lines that end in whitespace. |
| {color:green}+1{color} | install | 1m 26s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse | 0m 42s | The patch built with eclipse:eclipse. |
| {color:green}+1{color} | findbugs | 0m 50s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | yarn tests | 1m 41s | Tests passed in hadoop-yarn-server-timelineservice. |
| | | 41m 56s | |

|| Subsystem || Report/Notes ||
| Patch URL | http://issues.apache.org/jira/secure/attachment/12748305/YARN-3814-YARN-2928.03.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | YARN-2928 / df0ec47 |
| hadoop-yarn-server-timelineservice test log | https://builds.apache.org/job/PreCommit-YARN-Build/8750/artifact/patchprocess/testrun_hadoop-yarn-server-timelineservice.txt |
| Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/8750/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf903.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | https://builds.apache.org/job/PreCommit-YARN-Build/8750/console |

This message was automatically generated.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (YARN-3025) Provide API for retrieving blacklisted nodes
[ https://issues.apache.org/jira/browse/YARN-3025?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ted Yu resolved YARN-3025.
--
Resolution: Later

Provide API for retrieving blacklisted nodes
Key: YARN-3025
URL: https://issues.apache.org/jira/browse/YARN-3025
Project: Hadoop YARN
Issue Type: Improvement
Reporter: Ted Yu
Assignee: Ted Yu
Attachments: yarn-3025-v1.txt, yarn-3025-v2.txt, yarn-3025-v3.txt

We have the following method which updates the blacklist:
{code}
public synchronized void updateBlacklist(List<String> blacklistAdditions,
    List<String> blacklistRemovals) {
{code}
Upon AM failover, there should be an API which returns the blacklisted nodes so that the new AM can make consistent decisions. The new API can be:
{code}
public synchronized List<String> getBlacklistedNodes()
{code}

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
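A minimal sketch of how the proposed `getBlacklistedNodes()` query could pair with `updateBlacklist()` (a standalone illustration with a hypothetical `BlacklistTracker` class, not the attached patch):

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/**
 * Standalone sketch: track blacklist updates so that a failed-over AM
 * can re-read the current blacklist and make consistent decisions.
 * Class and field names are illustrative only.
 */
public class BlacklistTracker {

  private final Set<String> blacklisted = new LinkedHashSet<>();

  /** Mirrors the existing update API: apply additions, then removals. */
  public synchronized void updateBlacklist(List<String> blacklistAdditions,
      List<String> blacklistRemovals) {
    if (blacklistAdditions != null) {
      blacklisted.addAll(blacklistAdditions);
    }
    if (blacklistRemovals != null) {
      blacklisted.removeAll(blacklistRemovals);
    }
  }

  /** The proposed query: a snapshot of currently blacklisted nodes. */
  public synchronized List<String> getBlacklistedNodes() {
    return new ArrayList<>(blacklisted);
  }
}
```

Returning a snapshot copy (rather than the live set) keeps callers from mutating internal state, which matters if a recovering AM iterates the list while new updates arrive.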
[jira] [Commented] (YARN-3814) REST API implementation for getting raw entities in TimelineReader
[ https://issues.apache.org/jira/browse/YARN-3814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14650270#comment-14650270 ]

Varun Saxena commented on YARN-3814:
--

[~zjshen], [~sjlee0], updated the patch as per Zhijie's comment. The cluster ID will now be part of the path. Is client-side handling for the cluster ID to be done as part of the same JIRA?

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4003) ReservationQueue inherit getAMResourceLimit() from LeafQueue, but behavior is not consistent
[ https://issues.apache.org/jira/browse/YARN-4003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14650572#comment-14650572 ]

Carlo Curino commented on YARN-4003:
--

[~leftnoteasy], this sounds like a good proposal to tighten this limit a bit more. Talking with [~atumanov], however, we spotted a possible corner case I want your opinion on.

Given a cluster of 1000 machines and a PlanQueue of 50%, we might have the following set of reservations: R1 and R2, both of size 250 at a certain time t_0, and R3, which has size 0 at t_0 (and will grow for some t_i > t_0). Formally the capacity of the PlanQueue (500 containers) is exhausted by R1 and R2, and R3 has capacity = 0, so your math would yield an amLimit of 0 for R3 (i.e., no app can be started). However, R1 and R2 might be using only a fraction of their reserved capacity, and we might thus waste some resources. In this scenario, I would probably prefer R3 to get started opportunistically (and if R1/R2 demand does not spike until t_i, where R3's capacity grows to > 0, we are golden). We could clearly construct other scenarios in which letting the AM start will only mean we need to preempt it as R1/R2 spike. This is a balancing act of work preservation vs. guaranteed execution. I am OK to resolve it in either direction; what's your vote? (Anyone else with opinions on this?)

ReservationQueue inherit getAMResourceLimit() from LeafQueue, but behavior is not consistent
Key: YARN-4003
URL: https://issues.apache.org/jira/browse/YARN-4003
Project: Hadoop YARN
Issue Type: Bug
Components: resourcemanager
Reporter: Carlo Curino
Attachments: YARN-4003.patch

The inherited behavior from LeafQueue (limit AM % based on capacity) is not a good fit for ReservationQueue (which has highly dynamic capacity).

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
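The corner case in this comment can be made concrete with a small worked example. The 10% max-AM share and the one-container-per-node simplification below are assumed values purely for illustration, not configuration from the JIRA:

```java
/**
 * Worked example of the AM-limit corner case discussed above.
 * If the AM limit is proportional to a reservation's *current* capacity,
 * a reservation whose capacity is currently 0 gets an AM limit of 0,
 * so no application can start in it even when the plan has idle resources.
 */
public class AmLimitExample {

  /** Containers a reservation's AMs may use, given its share of the plan. */
  static int amLimit(int planCapacity, double reservationShare,
      double maxAmShare) {
    return (int) Math.floor(planCapacity * reservationShare * maxAmShare);
  }

  public static void main(String[] args) {
    int planCapacity = 500;   // 50% of a 1000-machine cluster, 1 container/node
    double maxAmShare = 0.1;  // assumed config value, for illustration only

    System.out.println(amLimit(planCapacity, 0.5, maxAmShare)); // R1 at t_0: 25
    System.out.println(amLimit(planCapacity, 0.5, maxAmShare)); // R2 at t_0: 25
    System.out.println(amLimit(planCapacity, 0.0, maxAmShare)); // R3 at t_0: 0
  }
}
```

R3's limit of 0 is exactly the situation described: its AM cannot start opportunistically at t_0 even though R1 and R2 may be using only a fraction of their reserved 250 containers each.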
[jira] [Commented] (YARN-3045) [Event producers] Implement NM writing container lifecycle events to ATS
[ https://issues.apache.org/jira/browse/YARN-3045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14650231#comment-14650231 ]

Naganarasimha G R commented on YARN-3045:
--

[~sjlee0], thanks for the detailed explanation; now I am clear about the plans for TimelineClient and the indirect support of flush for it.

bq. If these events are already associated with containers any way, they are not an issue, right?

Well, [~djp] had pointed out two new items as part of this JIRA (apart from the existing container lifecycle events): one was NM-side application events and the other was container resource localization events. In the latter case, again (similar to the NM-side application events), multiple resource paths can be localized for a given container, so the different states of localization cannot directly be used as event IDs, as we also need to publish which resource the event belongs to. Hence I had suggested earlier: ??For Localization i feel it can be under ContainerEntity and the EventID can have Event Type (REQUEST,LOCALIZED,LOCALIZATION_FAILED) and PATH of the localized resource.??

[Event producers] Implement NM writing container lifecycle events to ATS
Key: YARN-3045
URL: https://issues.apache.org/jira/browse/YARN-3045
Project: Hadoop YARN
Issue Type: Sub-task
Components: timelineserver
Reporter: Sangjin Lee
Assignee: Naganarasimha G R
Attachments: YARN-3045-YARN-2928.002.patch, YARN-3045-YARN-2928.003.patch, YARN-3045-YARN-2928.004.patch, YARN-3045-YARN-2928.005.patch, YARN-3045-YARN-2928.006.patch, YARN-3045.20150420-1.patch

Per design in YARN-2928, implement NM writing container lifecycle events and container system metrics to ATS.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3814) REST API implementation for getting raw entities in TimelineReader
[ https://issues.apache.org/jira/browse/YARN-3814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14650243#comment-14650243 ]

Varun Saxena commented on YARN-3814:
--

[~zjshen], OK... let's have it on the client side then. Or we can have two separate REST endpoints, with and without cluster ID.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)