[jira] [Commented] (YARN-3197) Confusing log generated by CapacityScheduler
[ https://issues.apache.org/jira/browse/YARN-3197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14365286#comment-14365286 ]

Hudson commented on YARN-3197:
------------------------------

FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #135 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/135/])
YARN-3197. Confusing log generated by CapacityScheduler. Contributed by Varun Saxena. (devaraj: rev 7179f94f9d000fc52bd9ce5aa9741aba97ec3ee8)
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java
* hadoop-yarn-project/CHANGES.txt

Confusing log generated by CapacityScheduler
--------------------------------------------

Key: YARN-3197
URL: https://issues.apache.org/jira/browse/YARN-3197
Project: Hadoop YARN
Issue Type: Bug
Components: capacityscheduler
Affects Versions: 2.6.0
Reporter: Hitesh Shah
Assignee: Varun Saxena
Priority: Minor
Fix For: 2.8.0
Attachments: YARN-3197.001.patch, YARN-3197.002.patch, YARN-3197.003.patch, YARN-3197.004.patch

2015-02-12 20:35:39,968 INFO capacity.CapacityScheduler (CapacityScheduler.java:completedContainer(1190)) - Null container completed...
2015-02-12 20:35:39,968 INFO capacity.CapacityScheduler (CapacityScheduler.java:completedContainer(1190)) - Null container completed...
2015-02-12 20:35:39,968 INFO capacity.CapacityScheduler (CapacityScheduler.java:completedContainer(1190)) - Null container completed...
2015-02-12 20:35:40,960 INFO capacity.CapacityScheduler (CapacityScheduler.java:completedContainer(1190)) - Null container completed...
2015-02-12 20:35:40,960 INFO capacity.CapacityScheduler (CapacityScheduler.java:completedContainer(1190)) - Null container completed...
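The committed patch is not quoted in this digest. As a hedged sketch only: the usual remedy for this kind of log flood is to demote the unconditional INFO line to a guarded DEBUG that carries the container id for context. Everything below, including the class and method shape, is an assumption, not the YARN-3197 patch.

{code}
import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;

// Hedged sketch, not the committed fix: log the null-container case at DEBUG,
// with the container id, instead of an unconditional INFO line.
class CompletedContainerLogSketch {
  private static final Log LOG =
      LogFactory.getLog(CompletedContainerLogSketch.class);

  void completedContainer(Object rmContainer, String containerId) {
    if (rmContainer == null) {
      if (LOG.isDebugEnabled()) {
        LOG.debug("Null container completed for " + containerId
            + "; the container is unknown or was already released");
      }
      return; // nothing to clean up for an unknown container
    }
    // ... normal completion handling elided ...
  }
}
{code}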
[jira] [Commented] (YARN-2854) The document about timeline service and generic service needs to be updated
[ https://issues.apache.org/jira/browse/YARN-2854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14365291#comment-14365291 ]

Hudson commented on YARN-2854:
------------------------------

FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #135 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/135/])
YARN-2854. Addendum patch to fix the minor issue in the timeline service documentation. (zjshen: rev ed4e72a20b75ffbd22deb0607dd8b94f6e437a84)
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/TimelineServer.md

The document about timeline service and generic service needs to be updated
----------------------------------------------------------------------------

Key: YARN-2854
URL: https://issues.apache.org/jira/browse/YARN-2854
Project: Hadoop YARN
Issue Type: Improvement
Components: timelineserver
Reporter: Zhijie Shen
Assignee: Naganarasimha G R
Priority: Critical
Fix For: 2.7.0
Attachments: TimelineServer.html, YARN-2854.20141120-1.patch, YARN-2854.20150128.1.patch, YARN-2854.20150304.1.patch, YARN-2854.20150311-1.patch, YARN-2854.20150313-1.patch, YARN-2854.20150314-1.patch, YARN-2854.20150314-1_branch2.patch, YARN-2854.20150315-1_trunk_addendum.patch, timeline_structure.jpg
[jira] [Commented] (YARN-3349) Treat all exceptions as failure in TestFSRMStateStore#testFSRMStateStoreClientRetry
[ https://issues.apache.org/jira/browse/YARN-3349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14365290#comment-14365290 ]

Hudson commented on YARN-3349:
------------------------------

FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #135 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/135/])
YARN-3349. Treat all exceptions as failure in TestFSRMStateStore#testFSRMStateStoreClientRetry. Contributed by Zhihai Xu. (ozawa: rev 7522a643faeea2d8a8e2c7409ae60e0973e7cf38)
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/TestFSRMStateStore.java
* hadoop-yarn-project/CHANGES.txt

Treat all exceptions as failure in TestFSRMStateStore#testFSRMStateStoreClientRetry
------------------------------------------------------------------------------------

Key: YARN-3349
URL: https://issues.apache.org/jira/browse/YARN-3349
Project: Hadoop YARN
Issue Type: Improvement
Components: test
Affects Versions: 2.6.0
Reporter: zhihai xu
Assignee: zhihai xu
Priority: Minor
Fix For: 2.7.0
Attachments: YARN-3349.000.patch

Treat all exceptions as failure in testFSRMStateStoreClientRetry. Currently the exception "could only be replicated to 0 nodes instead of minReplication (=1)" is not treated as a failure in testFSRMStateStoreClientRetry:
{code}
// TODO 0 datanode exception will not be retried by dfs client, fix
// that separately.
if (!e.getMessage().contains("could only be replicated"
    + " to 0 nodes instead of minReplication (=1)")) {
  assertionFailedInThread.set(true);
}
{code}
With YARN-2820 (Retry in FileSystemRMStateStore), we needn't treat this exception specially. We can remove the check and treat all exceptions as failures in testFSRMStateStoreClientRetry.
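For illustration, here is a minimal, self-contained sketch of the simplification the issue describes (not the actual test code): once FileSystemRMStateStore retries transient DFS errors per YARN-2820, the client thread can mark any exception as a failure. `doStoreOp` is a hypothetical stand-in for the real state-store call.

{code}
import java.util.concurrent.atomic.AtomicBoolean;

public class RetryTestSketch {
  // Hypothetical stand-in for the RM state-store operation the test exercises.
  static void doStoreOp() throws Exception { }

  public static void main(String[] args) throws Exception {
    final AtomicBoolean assertionFailedInThread = new AtomicBoolean(false);
    Thread clientThread = new Thread(new Runnable() {
      @Override
      public void run() {
        try {
          doStoreOp();
        } catch (Exception e) {
          // No special case for "could only be replicated to 0 nodes" any
          // more: every exception marks the test thread as failed.
          assertionFailedInThread.set(true);
        }
      }
    });
    clientThread.start();
    clientThread.join();
    System.out.println("failed = " + assertionFailedInThread.get());
  }
}
{code}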
[jira] [Commented] (YARN-3339) TestDockerContainerExecutor should pull a single image and not the entire centos repository
[ https://issues.apache.org/jira/browse/YARN-3339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14365295#comment-14365295 ]

Hudson commented on YARN-3339:
------------------------------

FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #135 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/135/])
YARN-3339. TestDockerContainerExecutor should pull a single image and not the entire centos repository. (Ravindra Kumar Naik via raviprak) (raviprak: rev 56085203c43b8f2561bf3745910e03f8ac176a67)
* hadoop-yarn-project/CHANGES.txt
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestDockerContainerExecutor.java

TestDockerContainerExecutor should pull a single image and not the entire centos repository
--------------------------------------------------------------------------------------------

Key: YARN-3339
URL: https://issues.apache.org/jira/browse/YARN-3339
Project: Hadoop YARN
Issue Type: Test
Components: test
Affects Versions: 2.6.0
Environment: Linux
Reporter: Ravindra Kumar Naik
Priority: Minor
Fix For: 2.8.0
Attachments: YARN-3339-branch-2.6.0.001.patch, YARN-3339-trunk.001.patch

The TestDockerContainerExecutor test pulls the entire centos repository, which is time consuming. Pulling a specific image (e.g., centos7) is sufficient to run the test successfully and will save time.
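To make the intent concrete, a small sketch (not the test itself) of the difference: `docker pull centos` fetches every tag in the repository, while a pinned tag fetches one image. The tag `centos:7` and the exec style are assumptions based on the issue description.

{code}
public class PullSingleImage {
  public static void main(String[] args) throws Exception {
    String testImage = "centos:7"; // one tagged image; "centos" alone pulls every tag
    Process p = new ProcessBuilder("docker", "pull", testImage)
        .inheritIO()   // stream docker's progress output to this console
        .start();
    System.exit(p.waitFor());
  }
}
{code}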
[jira] [Commented] (YARN-3339) TestDockerContainerExecutor should pull a single image and not the entire centos repository
[ https://issues.apache.org/jira/browse/YARN-3339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14365347#comment-14365347 ]

Hudson commented on YARN-3339:
------------------------------

FAILURE: Integrated in Hadoop-Mapreduce-trunk #2085 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2085/])
YARN-3339. TestDockerContainerExecutor should pull a single image and not the entire centos repository. (Ravindra Kumar Naik via raviprak) (raviprak: rev 56085203c43b8f2561bf3745910e03f8ac176a67)
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestDockerContainerExecutor.java
* hadoop-yarn-project/CHANGES.txt

TestDockerContainerExecutor should pull a single image and not the entire centos repository
--------------------------------------------------------------------------------------------

Key: YARN-3339
URL: https://issues.apache.org/jira/browse/YARN-3339
Project: Hadoop YARN
Issue Type: Test
Components: test
Affects Versions: 2.6.0
Environment: Linux
Reporter: Ravindra Kumar Naik
Priority: Minor
Fix For: 2.8.0
Attachments: YARN-3339-branch-2.6.0.001.patch, YARN-3339-trunk.001.patch

The TestDockerContainerExecutor test pulls the entire centos repository, which is time consuming. Pulling a specific image (e.g., centos7) is sufficient to run the test successfully and will save time.
[jira] [Commented] (YARN-2854) The document about timeline service and generic service needs to be updated
[ https://issues.apache.org/jira/browse/YARN-2854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14365343#comment-14365343 ]

Hudson commented on YARN-2854:
------------------------------

FAILURE: Integrated in Hadoop-Mapreduce-trunk #2085 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2085/])
YARN-2854. Addendum patch to fix the minor issue in the timeline service documentation. (zjshen: rev ed4e72a20b75ffbd22deb0607dd8b94f6e437a84)
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/TimelineServer.md

The document about timeline service and generic service needs to be updated
----------------------------------------------------------------------------

Key: YARN-2854
URL: https://issues.apache.org/jira/browse/YARN-2854
Project: Hadoop YARN
Issue Type: Improvement
Components: timelineserver
Reporter: Zhijie Shen
Assignee: Naganarasimha G R
Priority: Critical
Fix For: 2.7.0
Attachments: TimelineServer.html, YARN-2854.20141120-1.patch, YARN-2854.20150128.1.patch, YARN-2854.20150304.1.patch, YARN-2854.20150311-1.patch, YARN-2854.20150313-1.patch, YARN-2854.20150314-1.patch, YARN-2854.20150314-1_branch2.patch, YARN-2854.20150315-1_trunk_addendum.patch, timeline_structure.jpg
[jira] [Commented] (YARN-3197) Confusing log generated by CapacityScheduler
[ https://issues.apache.org/jira/browse/YARN-3197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14365338#comment-14365338 ]

Hudson commented on YARN-3197:
------------------------------

FAILURE: Integrated in Hadoop-Mapreduce-trunk #2085 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2085/])
YARN-3197. Confusing log generated by CapacityScheduler. Contributed by Varun Saxena. (devaraj: rev 7179f94f9d000fc52bd9ce5aa9741aba97ec3ee8)
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java
* hadoop-yarn-project/CHANGES.txt

Confusing log generated by CapacityScheduler
--------------------------------------------

Key: YARN-3197
URL: https://issues.apache.org/jira/browse/YARN-3197
Project: Hadoop YARN
Issue Type: Bug
Components: capacityscheduler
Affects Versions: 2.6.0
Reporter: Hitesh Shah
Assignee: Varun Saxena
Priority: Minor
Fix For: 2.8.0
Attachments: YARN-3197.001.patch, YARN-3197.002.patch, YARN-3197.003.patch, YARN-3197.004.patch

2015-02-12 20:35:39,968 INFO capacity.CapacityScheduler (CapacityScheduler.java:completedContainer(1190)) - Null container completed...
2015-02-12 20:35:39,968 INFO capacity.CapacityScheduler (CapacityScheduler.java:completedContainer(1190)) - Null container completed...
2015-02-12 20:35:39,968 INFO capacity.CapacityScheduler (CapacityScheduler.java:completedContainer(1190)) - Null container completed...
2015-02-12 20:35:40,960 INFO capacity.CapacityScheduler (CapacityScheduler.java:completedContainer(1190)) - Null container completed...
2015-02-12 20:35:40,960 INFO capacity.CapacityScheduler (CapacityScheduler.java:completedContainer(1190)) - Null container completed...
[jira] [Commented] (YARN-3362) Add node label usage in RM CapacityScheduler web UI
[ https://issues.apache.org/jira/browse/YARN-3362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14366608#comment-14366608 ]

Naganarasimha G R commented on YARN-3362:
-----------------------------------------

Hi Wangda,
I would like to work on this issue, so I have assigned it to myself. If you have already started working on it, please feel free to reassign.

Add node label usage in RM CapacityScheduler web UI
---------------------------------------------------

Key: YARN-3362
URL: https://issues.apache.org/jira/browse/YARN-3362
Project: Hadoop YARN
Issue Type: Sub-task
Components: capacityscheduler, resourcemanager, webapp
Reporter: Wangda Tan
Assignee: Naganarasimha G R

We don't show node label usage in the RM CapacityScheduler web UI now; without this, it is hard for users to understand what is happening on nodes that have labels assigned to them.
[jira] [Assigned] (YARN-3362) Add node label usage in RM CapacityScheduler web UI
[ https://issues.apache.org/jira/browse/YARN-3362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Naganarasimha G R reassigned YARN-3362:
---------------------------------------

Assignee: Naganarasimha G R

Add node label usage in RM CapacityScheduler web UI
---------------------------------------------------

Key: YARN-3362
URL: https://issues.apache.org/jira/browse/YARN-3362
Project: Hadoop YARN
Issue Type: Sub-task
Components: capacityscheduler, resourcemanager, webapp
Reporter: Wangda Tan
Assignee: Naganarasimha G R

We don't show node label usage in the RM CapacityScheduler web UI now; without this, it is hard for users to understand what is happening on nodes that have labels assigned to them.
[jira] [Commented] (YARN-3040) [Data Model] Implement client-side API for handling flows
[ https://issues.apache.org/jira/browse/YARN-3040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14366614#comment-14366614 ]

Li Lu commented on YARN-3040:
-----------------------------

Quick comment: in my understanding, the flow-based API is used by multiple components, including but not limited to event producers (like distributed shell, the RM, and NMs), collectors (a.k.a. aggregators), and storage implementations. It is not specifically attached to the RM.

[Data Model] Implement client-side API for handling flows
----------------------------------------------------------

Key: YARN-3040
URL: https://issues.apache.org/jira/browse/YARN-3040
Project: Hadoop YARN
Issue Type: Sub-task
Components: timelineserver
Reporter: Sangjin Lee
Assignee: Robert Kanter

Per design in YARN-2928, implement client-side API for handling *flows*. Frameworks should be able to define and pass in all attributes of flows and flow runs to YARN, and they should be passed into ATS writers. YARN tags were discussed as a way to handle this piece of information.
[jira] [Commented] (YARN-3111) Fix ratio problem on FairScheduler page
[ https://issues.apache.org/jira/browse/YARN-3111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14366621#comment-14366621 ]

Peng Zhang commented on YARN-3111:
----------------------------------

Thanks for your advice.
For the 4 proposals listed above:
1 and 2 are already done in the patch.
3 is good, but one question: the parent queue has no tooltip now, yet it has its own bar.
Thinking over 3 and 4: what about listing every resource's usage percent in the text to the right of each bar (see the sketch after this entry)? Maybe color the dominant resource red, or just identify it by comparing the percent numbers?
Also, what do you think of the issue I mentioned above? I think it can still happen after 1 and 2, because for one queue the steady, fair, max, and usage resources may each have a different dominant resource type. If I am making a mistake here, please let me know.
bq. queue's bar width is decided by (queue steady resource / cluster resource), and queue's usage width is decided by (queue's usage resource / cluster resource).
For the above two percent computations the dominant resource may differ, so the two percent values are still in different dimensions, and that causes confusion.

Fix ratio problem on FairScheduler page
---------------------------------------

Key: YARN-3111
URL: https://issues.apache.org/jira/browse/YARN-3111
Project: Hadoop YARN
Issue Type: Bug
Components: fairscheduler
Affects Versions: 2.6.0
Reporter: Peng Zhang
Assignee: Peng Zhang
Priority: Minor
Attachments: YARN-3111.1.patch, YARN-3111.png

Found 3 problems on the FairScheduler page:
1. Only memory is computed for the ratio, even when the queue schedulingPolicy is DRF.
2. When min resources is configured larger than real resources, the steady fair share ratio is so long that it runs off the page.
3. When cluster resources are 0 (no NodeManager started), the ratio is displayed as "NaN% used".
The attached image shows a snapshot of the above problems.
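A toy sketch of the per-resource text proposed above (not a patch; all numbers and names are made up): compute each resource's usage percent against the cluster total and flag the dominant one.

{code}
public class UsageTextSketch {
  public static void main(String[] args) {
    long usedMemMB = 6144, clusterMemMB = 8192;
    long usedVCores = 3, clusterVCores = 8;
    double memPct = 100.0 * usedMemMB / clusterMemMB;   // 75.0
    double cpuPct = 100.0 * usedVCores / clusterVCores; // 37.5
    // The dominant resource is the one with the larger share, per DRF.
    String dominant = memPct >= cpuPct ? "memory" : "vcores";
    System.out.printf("memory: %.1f%%, vcores: %.1f%% (dominant: %s)%n",
        memPct, cpuPct, dominant);
  }
}
{code}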
[jira] [Commented] (YARN-3040) [Data Model] Implement client-side API for handling flows
[ https://issues.apache.org/jira/browse/YARN-3040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14366654#comment-14366654 ]

Zhijie Shen commented on YARN-3040:
-----------------------------------

[~Naganarasimha], thanks for being interested in this issue. I already have a WIP patch. If you don't mind, may I continue the work, and would you please help to review it?

[Data Model] Implement client-side API for handling flows
----------------------------------------------------------

Key: YARN-3040
URL: https://issues.apache.org/jira/browse/YARN-3040
Project: Hadoop YARN
Issue Type: Sub-task
Components: timelineserver
Reporter: Sangjin Lee
Assignee: Robert Kanter

Per design in YARN-2928, implement client-side API for handling *flows*. Frameworks should be able to define and pass in all attributes of flows and flow runs to YARN, and they should be passed into ATS writers. YARN tags were discussed as a way to handle this piece of information.
[jira] [Resolved] (YARN-3039) [Aggregator wireup] Implement ATS app-aggregator service discovery
[ https://issues.apache.org/jira/browse/YARN-3039?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhijie Shen resolved YARN-3039.
-------------------------------

Resolution: Fixed
Fix Version/s: YARN-2928
Hadoop Flags: Reviewed

Committed the patch to branch YARN-2928. Thanks for the patch, Junping! Thanks for the review, Sangjin!

[Aggregator wireup] Implement ATS app-aggregator service discovery
-------------------------------------------------------------------

Key: YARN-3039
URL: https://issues.apache.org/jira/browse/YARN-3039
Project: Hadoop YARN
Issue Type: Sub-task
Components: timelineserver
Reporter: Sangjin Lee
Assignee: Junping Du
Fix For: YARN-2928
Attachments: Service Binding for applicationaggregator of ATS (draft).pdf, Service Discovery For Application Aggregator of ATS (v2).pdf, YARN-3039-no-test.patch, YARN-3039-v2-incomplete.patch, YARN-3039-v3-core-changes-only.patch, YARN-3039-v4.patch, YARN-3039-v5.patch, YARN-3039-v6.patch, YARN-3039-v7.patch, YARN-3039-v8.patch, YARN-3039.9.patch

Per design in YARN-2928, implement ATS writer service discovery. This is essential for off-node clients to send writes to the right ATS writer. This should also handle the case of AM failures.
[jira] [Commented] (YARN-3273) Improve web UI to facilitate scheduling analysis and debugging
[ https://issues.apache.org/jira/browse/YARN-3273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14366593#comment-14366593 ]

Rohith commented on YARN-3273:
------------------------------

I am pretty confused by the Jenkins report: it says hadoop-yarn-server-common had failed tests, but the console log for this project shows no failure!! {{Tests run: 19, Failures: 0, Errors: 0, Skipped: 0}}

Improve web UI to facilitate scheduling analysis and debugging
--------------------------------------------------------------

Key: YARN-3273
URL: https://issues.apache.org/jira/browse/YARN-3273
Project: Hadoop YARN
Issue Type: Improvement
Reporter: Jian He
Assignee: Rohith
Attachments: 0001-YARN-3273-v1.patch, 0001-YARN-3273-v2.patch, 0002-YARN-3273.patch, 0003-YARN-3273.patch, 0003-YARN-3273.patch, 0004-YARN-3273.patch, YARN-3273-am-resource-used-AND-User-limit-v2.PNG, YARN-3273-am-resource-used-AND-User-limit.PNG, YARN-3273-application-headroom-v2.PNG, YARN-3273-application-headroom.PNG

Job may be stuck for reasons such as:
- hitting queue capacity
- hitting user-limit
- hitting AM-resource-percentage
The first, queue capacity, is already shown on the UI. We may surface things like:
- what is the user's current usage and user-limit;
- what is the AM resource usage and limit;
- what is the application's current HeadRoom;
[jira] [Commented] (YARN-3273) Improve web UI to facilitate scheduling analysis and debugging
[ https://issues.apache.org/jira/browse/YARN-3273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14366595#comment-14366595 ]

Rohith commented on YARN-3273:
------------------------------

The TestAMRestart failure is unrelated to this patch.

Improve web UI to facilitate scheduling analysis and debugging
--------------------------------------------------------------

Key: YARN-3273
URL: https://issues.apache.org/jira/browse/YARN-3273
Project: Hadoop YARN
Issue Type: Improvement
Reporter: Jian He
Assignee: Rohith
Attachments: 0001-YARN-3273-v1.patch, 0001-YARN-3273-v2.patch, 0002-YARN-3273.patch, 0003-YARN-3273.patch, 0003-YARN-3273.patch, 0004-YARN-3273.patch, YARN-3273-am-resource-used-AND-User-limit-v2.PNG, YARN-3273-am-resource-used-AND-User-limit.PNG, YARN-3273-application-headroom-v2.PNG, YARN-3273-application-headroom.PNG

Job may be stuck for reasons such as:
- hitting queue capacity
- hitting user-limit
- hitting AM-resource-percentage
The first, queue capacity, is already shown on the UI. We may surface things like:
- what is the user's current usage and user-limit;
- what is the AM resource usage and limit;
- what is the application's current HeadRoom;
[jira] [Commented] (YARN-3362) Add node label usage in RM CapacityScheduler web UI
[ https://issues.apache.org/jira/browse/YARN-3362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14366616#comment-14366616 ]

Wangda Tan commented on YARN-3362:
----------------------------------

It's yours :). Looking forward to your patch. Thanks,

Add node label usage in RM CapacityScheduler web UI
---------------------------------------------------

Key: YARN-3362
URL: https://issues.apache.org/jira/browse/YARN-3362
Project: Hadoop YARN
Issue Type: Sub-task
Components: capacityscheduler, resourcemanager, webapp
Reporter: Wangda Tan
Assignee: Naganarasimha G R

We don't show node label usage in the RM CapacityScheduler web UI now; without this, it is hard for users to understand what is happening on nodes that have labels assigned to them.
[jira] [Commented] (YARN-3040) [Data Model] Implement client-side API for handling flows
[ https://issues.apache.org/jira/browse/YARN-3040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14366587#comment-14366587 ]

Naganarasimha G R commented on YARN-3040:
-----------------------------------------

Hi [~rkanter] and [~zjshen], the scope of this JIRA seems small and I need to make use of it in YARN-3044, so if both of you are OK I would like to take this JIRA up.

[Data Model] Implement client-side API for handling flows
----------------------------------------------------------

Key: YARN-3040
URL: https://issues.apache.org/jira/browse/YARN-3040
Project: Hadoop YARN
Issue Type: Sub-task
Components: timelineserver
Reporter: Sangjin Lee
Assignee: Robert Kanter

Per design in YARN-2928, implement client-side API for handling *flows*. Frameworks should be able to define and pass in all attributes of flows and flow runs to YARN, and they should be passed into ATS writers. YARN tags were discussed as a way to handle this piece of information.
[jira] [Commented] (YARN-3273) Improve web UI to facilitate scheduling analysis and debugging
[ https://issues.apache.org/jira/browse/YARN-3273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14366636#comment-14366636 ]

Hudson commented on YARN-3273:
------------------------------

SUCCESS: Integrated in Hadoop-trunk-Commit #7355 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/7355/])
YARN-3273. Improve scheduler UI to facilitate scheduling analysis and debugging. Contributed by Rohith Sharmaks (jianhe: rev 658097d6da1b1aac8e01db459f0c3b456e99652f)
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestContinuousScheduling.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/webapp/AppAttemptBlock.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/CapacitySchedulerLeafQueueInfo.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestCapacityScheduler.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/TestRMWebServicesCapacitySched.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/TestNodesPage.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestFifoScheduler.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/common/fica/FiCaSchedulerApp.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/SchedulerInfo.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairSchedulerTestBase.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fifo/FifoScheduler.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestFairScheduler.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/RMAppAttemptMetrics.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fifo/TestFifoScheduler.java
* hadoop-yarn-project/CHANGES.txt
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/MetricsOverviewTable.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/SchedulerApplicationAttempt.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/UserInfo.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/CapacitySchedulerPage.java

Improve web UI to facilitate scheduling analysis and debugging
--------------------------------------------------------------

Key: YARN-3273
URL: https://issues.apache.org/jira/browse/YARN-3273
Project: Hadoop YARN
Issue Type: Improvement
Reporter: Jian He
Assignee: Rohith
Fix For: 2.8.0
Attachments: 0001-YARN-3273-v1.patch, 0001-YARN-3273-v2.patch, 0002-YARN-3273.patch, 0003-YARN-3273.patch, 0003-YARN-3273.patch, 0004-YARN-3273.patch, YARN-3273-am-resource-used-AND-User-limit-v2.PNG, YARN-3273-am-resource-used-AND-User-limit.PNG, YARN-3273-application-headroom-v2.PNG, YARN-3273-application-headroom.PNG

Job may be stuck for reasons such as:
- hitting queue capacity
- hitting user-limit
- hitting AM-resource-percentage
The first, queue capacity, is already shown on the UI. We may surface things like:
- what is the user's current usage and user-limit;
- what is the AM resource usage and limit;
- what is the application's current HeadRoom;
[jira] [Updated] (YARN-3273) Improve web UI to facilitate scheduling analysis and debugging
[ https://issues.apache.org/jira/browse/YARN-3273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rohith updated YARN-3273:
-------------------------

Attachment: 0003-YARN-3273.patch

Improve web UI to facilitate scheduling analysis and debugging
--------------------------------------------------------------

Key: YARN-3273
URL: https://issues.apache.org/jira/browse/YARN-3273
Project: Hadoop YARN
Issue Type: Improvement
Reporter: Jian He
Assignee: Rohith
Attachments: 0001-YARN-3273-v1.patch, 0001-YARN-3273-v2.patch, 0002-YARN-3273.patch, 0003-YARN-3273.patch, 0003-YARN-3273.patch, YARN-3273-am-resource-used-AND-User-limit-v2.PNG, YARN-3273-am-resource-used-AND-User-limit.PNG, YARN-3273-application-headroom-v2.PNG, YARN-3273-application-headroom.PNG

Job may be stuck for reasons such as:
- hitting queue capacity
- hitting user-limit
- hitting AM-resource-percentage
The first, queue capacity, is already shown on the UI. We may surface things like:
- what is the user's current usage and user-limit;
- what is the AM resource usage and limit;
- what is the application's current HeadRoom;
[jira] [Commented] (YARN-3341) Fix findbugs warning:BC_UNCONFIRMED_CAST at FSSchedulerNode.reserveResource
[ https://issues.apache.org/jira/browse/YARN-3341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14364786#comment-14364786 ]

Brahma Reddy Battula commented on YARN-3341:
--------------------------------------------

This is a duplicate of YARN-3204; can you please have a look at YARN-3204?

Fix findbugs warning: BC_UNCONFIRMED_CAST at FSSchedulerNode.reserveResource
-----------------------------------------------------------------------------

Key: YARN-3341
URL: https://issues.apache.org/jira/browse/YARN-3341
Project: Hadoop YARN
Issue Type: Improvement
Reporter: zhihai xu
Assignee: zhihai xu
Priority: Minor
Labels: findbugs
Attachments: YARN-3341.000.patch, YARN-3341.001.patch

Fix the findbugs warning BC_UNCONFIRMED_CAST at FSSchedulerNode.reserveResource. The warning message is
{code}
Unchecked/unconfirmed cast from org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt to org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttempt in org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSSchedulerNode.reserveResource(SchedulerApplicationAttempt, Priority, RMContainer)
{code}
The code which causes the warning is
{code}
this.reservedAppSchedulable = (FSAppAttempt) application;
{code}
[jira] [Created] (YARN-3358) Audit log not present while refreshing Service ACLs
Varun Saxena created YARN-3358:
----------------------------------

Summary: Audit log not present while refreshing Service ACLs
Key: YARN-3358
URL: https://issues.apache.org/jira/browse/YARN-3358
Project: Hadoop YARN
Issue Type: Bug
Components: resourcemanager
Affects Versions: 2.7.0
Reporter: Varun Saxena
Assignee: Varun Saxena
Priority: Minor

There should be a success audit log in AdminService#refreshServiceAcls.
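For context, a hedged sketch of what the missing line could look like, modeled on the success audit entries the other AdminService refresh* operations emit; the exact placement and argument strings are assumptions, not the patch.

{code}
import java.io.IOException;

import org.apache.hadoop.security.UserGroupInformation;
import org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger;

class AuditSketch {
  // Sketch only: record a success audit entry after the ACLs are reloaded.
  void auditRefreshServiceAcls() throws IOException {
    UserGroupInformation user = UserGroupInformation.getCurrentUser();
    RMAuditLogger.logSuccess(user.getShortUserName(),
        "refreshServiceAcls", "AdminService");
  }
}
{code}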
[jira] [Commented] (YARN-3273) Improve web UI to facilitate scheduling analysis and debugging
[ https://issues.apache.org/jira/browse/YARN-3273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14364762#comment-14364762 ]

Hadoop QA commented on YARN-3273:
---------------------------------

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12705019/0003-YARN-3273.patch
against trunk revision ef9946c.

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 javadoc{color}. There were no new javadoc warning messages.
{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.
{color:red}-1 findbugs{color}. The patch appears to introduce 5 new Findbugs (version 2.0.3) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:
org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesFairScheduler
org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesApps
org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesNodes
org.apache.hadoop.yarn.server.resourcemanager.security.TestAMRMTokens
org.apache.hadoop.yarn.server.resourcemanager.TestMoveApplication
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairSchedulerQueueACLs
org.apache.hadoop.yarn.server.resourcemanager.security.TestRMDelegationTokens

Test results: https://builds.apache.org/job/PreCommit-YARN-Build/6995//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-YARN-Build/6995//artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6995//console

This message is automatically generated.

Improve web UI to facilitate scheduling analysis and debugging
--------------------------------------------------------------

Key: YARN-3273
URL: https://issues.apache.org/jira/browse/YARN-3273
Project: Hadoop YARN
Issue Type: Improvement
Reporter: Jian He
Assignee: Rohith
Attachments: 0001-YARN-3273-v1.patch, 0001-YARN-3273-v2.patch, 0002-YARN-3273.patch, 0003-YARN-3273.patch, 0003-YARN-3273.patch, YARN-3273-am-resource-used-AND-User-limit-v2.PNG, YARN-3273-am-resource-used-AND-User-limit.PNG, YARN-3273-application-headroom-v2.PNG, YARN-3273-application-headroom.PNG

Job may be stuck for reasons such as:
- hitting queue capacity
- hitting user-limit
- hitting AM-resource-percentage
The first, queue capacity, is already shown on the UI. We may surface things like:
- what is the user's current usage and user-limit;
- what is the AM resource usage and limit;
- what is the application's current HeadRoom;
[jira] [Updated] (YARN-3241) Leading space, trailing space and empty sub queue name may cause MetricsException for fair scheduler
[ https://issues.apache.org/jira/browse/YARN-3241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

zhihai xu updated YARN-3241:
----------------------------

Attachment: (was: YARN-3241.000.patch)

Leading space, trailing space and empty sub queue name may cause MetricsException for fair scheduler
-----------------------------------------------------------------------------------------------------

Key: YARN-3241
URL: https://issues.apache.org/jira/browse/YARN-3241
Project: Hadoop YARN
Issue Type: Bug
Components: scheduler
Reporter: zhihai xu
Assignee: zhihai xu
Attachments: YARN-3241.000.patch

Leading spaces, trailing spaces, or an empty sub queue name may cause a MetricsException ("Metrics source XXX already exists!") when adding an application to the FairScheduler. The reason is that QueueMetrics parses the queue name differently from the QueueManager. QueueMetrics uses Q_SPLITTER to parse the queue name; it removes leading and trailing spaces in sub queue names and also removes empty sub queue names.
{code}
static final Splitter Q_SPLITTER =
    Splitter.on('.').omitEmptyStrings().trimResults();
{code}
But QueueManager won't remove leading spaces, trailing spaces, or empty sub queue names. This causes FSQueue and FSQueueMetrics to go out of sync: QueueManager considers the two queue names different, so it tries to create a new queue, but FSQueueMetrics treats them as the same queue, which raises the "Metrics source XXX already exists!" MetricsException.
[jira] [Commented] (YARN-3273) Improve web UI to facilitate scheduling analysis and debugging
[ https://issues.apache.org/jira/browse/YARN-3273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14364660#comment-14364660 ]

Hadoop QA commented on YARN-3273:
---------------------------------

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12705000/0003-YARN-3273.patch
against trunk revision 046521c.

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 7 new or modified test files.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 javadoc{color}. There were no new javadoc warning messages.
{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.
{color:red}-1 findbugs{color}. The patch appears to introduce 5 new Findbugs (version 2.0.3) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common and hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:
org.apache.hadoop.yarn.server.resourcemanager.webapp.TestNodesPage

Test results: https://builds.apache.org/job/PreCommit-YARN-Build/6994//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-YARN-Build/6994//artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6994//console

This message is automatically generated.

Improve web UI to facilitate scheduling analysis and debugging
--------------------------------------------------------------

Key: YARN-3273
URL: https://issues.apache.org/jira/browse/YARN-3273
Project: Hadoop YARN
Issue Type: Improvement
Reporter: Jian He
Assignee: Rohith
Attachments: 0001-YARN-3273-v1.patch, 0001-YARN-3273-v2.patch, 0002-YARN-3273.patch, 0003-YARN-3273.patch, YARN-3273-am-resource-used-AND-User-limit-v2.PNG, YARN-3273-am-resource-used-AND-User-limit.PNG, YARN-3273-application-headroom-v2.PNG, YARN-3273-application-headroom.PNG

Job may be stuck for reasons such as:
- hitting queue capacity
- hitting user-limit
- hitting AM-resource-percentage
The first, queue capacity, is already shown on the UI. We may surface things like:
- what is the user's current usage and user-limit;
- what is the AM resource usage and limit;
- what is the application's current HeadRoom;
[jira] [Updated] (YARN-3241) Leading space, trailing space and empty sub queue name may cause MetricsException for fair scheduler
[ https://issues.apache.org/jira/browse/YARN-3241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

zhihai xu updated YARN-3241:
----------------------------

Attachment: YARN-3241.000.patch

Leading space, trailing space and empty sub queue name may cause MetricsException for fair scheduler
-----------------------------------------------------------------------------------------------------

Key: YARN-3241
URL: https://issues.apache.org/jira/browse/YARN-3241
Project: Hadoop YARN
Issue Type: Bug
Components: scheduler
Reporter: zhihai xu
Assignee: zhihai xu
Attachments: YARN-3241.000.patch

Leading spaces, trailing spaces, or an empty sub queue name may cause a MetricsException ("Metrics source XXX already exists!") when adding an application to the FairScheduler. The reason is that QueueMetrics parses the queue name differently from the QueueManager. QueueMetrics uses Q_SPLITTER to parse the queue name; it removes leading and trailing spaces in sub queue names and also removes empty sub queue names.
{code}
static final Splitter Q_SPLITTER =
    Splitter.on('.').omitEmptyStrings().trimResults();
{code}
But QueueManager won't remove leading spaces, trailing spaces, or empty sub queue names. This causes FSQueue and FSQueueMetrics to go out of sync: QueueManager considers the two queue names different, so it tries to create a new queue, but FSQueueMetrics treats them as the same queue, which raises the "Metrics source XXX already exists!" MetricsException.
[jira] [Updated] (YARN-3241) Leading space, trailing space and empty sub queue name may cause MetricsException for fair scheduler
[ https://issues.apache.org/jira/browse/YARN-3241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

zhihai xu updated YARN-3241:
----------------------------

Attachment: (was: YARN-3241.000.patch)

Leading space, trailing space and empty sub queue name may cause MetricsException for fair scheduler
-----------------------------------------------------------------------------------------------------

Key: YARN-3241
URL: https://issues.apache.org/jira/browse/YARN-3241
Project: Hadoop YARN
Issue Type: Bug
Components: scheduler
Reporter: zhihai xu
Assignee: zhihai xu

Leading spaces, trailing spaces, or an empty sub queue name may cause a MetricsException ("Metrics source XXX already exists!") when adding an application to the FairScheduler. The reason is that QueueMetrics parses the queue name differently from the QueueManager. QueueMetrics uses Q_SPLITTER to parse the queue name; it removes leading and trailing spaces in sub queue names and also removes empty sub queue names.
{code}
static final Splitter Q_SPLITTER =
    Splitter.on('.').omitEmptyStrings().trimResults();
{code}
But QueueManager won't remove leading spaces, trailing spaces, or empty sub queue names. This causes FSQueue and FSQueueMetrics to go out of sync: QueueManager considers the two queue names different, so it tries to create a new queue, but FSQueueMetrics treats them as the same queue, which raises the "Metrics source XXX already exists!" MetricsException.
[jira] [Commented] (YARN-3339) TestDockerContainerExecutor should pull a single image and not the entire centos repository
[ https://issues.apache.org/jira/browse/YARN-3339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14364662#comment-14364662 ]

Ravindra Kumar Naik commented on YARN-3339:
-------------------------------------------

Thanks for the information.

TestDockerContainerExecutor should pull a single image and not the entire centos repository
--------------------------------------------------------------------------------------------

Key: YARN-3339
URL: https://issues.apache.org/jira/browse/YARN-3339
Project: Hadoop YARN
Issue Type: Test
Components: test
Affects Versions: 2.6.0
Environment: Linux
Reporter: Ravindra Kumar Naik
Priority: Minor
Fix For: 2.8.0
Attachments: YARN-3339-branch-2.6.0.001.patch, YARN-3339-trunk.001.patch

The TestDockerContainerExecutor test pulls the entire centos repository, which is time consuming. Pulling a specific image (e.g., centos7) is sufficient to run the test successfully and will save time.
[jira] [Updated] (YARN-3241) Leading space, trailing space and empty sub queue name may cause MetricsException for fair scheduler
[ https://issues.apache.org/jira/browse/YARN-3241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

zhihai xu updated YARN-3241:
----------------------------

Attachment: YARN-3241.000.patch

Leading space, trailing space and empty sub queue name may cause MetricsException for fair scheduler
-----------------------------------------------------------------------------------------------------

Key: YARN-3241
URL: https://issues.apache.org/jira/browse/YARN-3241
Project: Hadoop YARN
Issue Type: Bug
Components: scheduler
Reporter: zhihai xu
Assignee: zhihai xu
Attachments: YARN-3241.000.patch

Leading spaces, trailing spaces, or an empty sub queue name may cause a MetricsException ("Metrics source XXX already exists!") when adding an application to the FairScheduler. The reason is that QueueMetrics parses the queue name differently from the QueueManager. QueueMetrics uses Q_SPLITTER to parse the queue name; it removes leading and trailing spaces in sub queue names and also removes empty sub queue names.
{code}
static final Splitter Q_SPLITTER =
    Splitter.on('.').omitEmptyStrings().trimResults();
{code}
But QueueManager won't remove leading spaces, trailing spaces, or empty sub queue names. This causes FSQueue and FSQueueMetrics to go out of sync: QueueManager considers the two queue names different, so it tries to create a new queue, but FSQueueMetrics treats them as the same queue, which raises the "Metrics source XXX already exists!" MetricsException.
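A small self-contained demonstration of the mismatch described above (not from the patch): Guava's configured Splitter normalizes the name the way QueueMetrics does, while a raw string comparison reflects QueueManager's view.

{code}
import com.google.common.base.Splitter;

public class QueueNameSplitDemo {
  static final Splitter Q_SPLITTER =
      Splitter.on('.').omitEmptyStrings().trimResults();

  public static void main(String[] args) {
    String a = "root.queue1";
    String b = "root. queue1"; // leading space in the sub queue name
    // QueueManager's view: a raw comparison says these are different queues.
    System.out.println(a.equals(b)); // false
    // QueueMetrics' view: after trimming, they are the same queue,
    // so registering both raises "Metrics source ... already exists!".
    System.out.println(Q_SPLITTER.splitToList(a)
        .equals(Q_SPLITTER.splitToList(b))); // true
  }
}
{code}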
[jira] [Commented] (YARN-3181) FairScheduler: Fix up outdated findbugs issues
[ https://issues.apache.org/jira/browse/YARN-3181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14365972#comment-14365972 ]

Hudson commented on YARN-3181:
------------------------------

FAILURE: Integrated in Hadoop-trunk-Commit #7351 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/7351/])
Revert "YARN-3181. FairScheduler: Fix up outdated findbugs issues. (kasha)" (kasha: rev 32b43304563c2430c00bc3e142a962d2bc5f4d58)
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSOpDurations.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/AllocationFileLoaderService.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/AllocationConfiguration.java
* hadoop-yarn-project/hadoop-yarn/dev-support/findbugs-exclude.xml
* hadoop-yarn-project/CHANGES.txt

FairScheduler: Fix up outdated findbugs issues
----------------------------------------------

Key: YARN-3181
URL: https://issues.apache.org/jira/browse/YARN-3181
Project: Hadoop YARN
Issue Type: Improvement
Affects Versions: 2.6.0
Reporter: Karthik Kambatla
Assignee: Karthik Kambatla
Attachments: yarn-3181-1.patch

In FairScheduler, we have excluded some findbugs-reported errors. Some of them aren't applicable anymore, and there are a few that can be easily fixed without needing an exclusion. It would be nice to fix them.
[jira] [Commented] (YARN-3111) Fix ratio problem on FairScheduler page
[ https://issues.apache.org/jira/browse/YARN-3111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14366002#comment-14366002 ]

Ashwin Shankar commented on YARN-3111:
--------------------------------------

Sounds good to me.

Fix ratio problem on FairScheduler page
---------------------------------------

Key: YARN-3111
URL: https://issues.apache.org/jira/browse/YARN-3111
Project: Hadoop YARN
Issue Type: Bug
Components: fairscheduler
Affects Versions: 2.6.0
Reporter: Peng Zhang
Assignee: Peng Zhang
Priority: Minor
Attachments: YARN-3111.1.patch, YARN-3111.png

Found 3 problems on the FairScheduler page:
1. Only memory is computed for the ratio, even when the queue schedulingPolicy is DRF.
2. When min resources is configured larger than real resources, the steady fair share ratio is so long that it runs off the page.
3. When cluster resources are 0 (no NodeManager started), the ratio is displayed as "NaN% used".
The attached image shows a snapshot of the above problems.
[jira] [Commented] (YARN-3273) Improve web UI to facilitate scheduling analysis and debugging
[ https://issues.apache.org/jira/browse/YARN-3273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14366106#comment-14366106 ]

Hadoop QA commented on YARN-3273:
---------------------------------

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12705019/0003-YARN-3273.patch
against trunk revision 968425e.

{color:red}-1 patch{color}. The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-YARN-Build/7004//console

This message is automatically generated.

Improve web UI to facilitate scheduling analysis and debugging
--------------------------------------------------------------

Key: YARN-3273
URL: https://issues.apache.org/jira/browse/YARN-3273
Project: Hadoop YARN
Issue Type: Improvement
Reporter: Jian He
Assignee: Rohith
Attachments: 0001-YARN-3273-v1.patch, 0001-YARN-3273-v2.patch, 0002-YARN-3273.patch, 0003-YARN-3273.patch, 0003-YARN-3273.patch, YARN-3273-am-resource-used-AND-User-limit-v2.PNG, YARN-3273-am-resource-used-AND-User-limit.PNG, YARN-3273-application-headroom-v2.PNG, YARN-3273-application-headroom.PNG

Job may be stuck for reasons such as:
- hitting queue capacity
- hitting user-limit
- hitting AM-resource-percentage
The first, queue capacity, is already shown on the UI. We may surface things like:
- what is the user's current usage and user-limit;
- what is the AM resource usage and limit;
- what is the application's current HeadRoom;
[jira] [Updated] (YARN-3361) CapacityScheduler side changes to support non-exclusive node labels
[ https://issues.apache.org/jira/browse/YARN-3361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wangda Tan updated YARN-3361:
-----------------------------

Component/s: (was: api)
             (was: client)
             (was: resourcemanager)
             capacityscheduler

CapacityScheduler side changes to support non-exclusive node labels
--------------------------------------------------------------------

Key: YARN-3361
URL: https://issues.apache.org/jira/browse/YARN-3361
Project: Hadoop YARN
Issue Type: Sub-task
Components: capacityscheduler
Reporter: Wangda Tan
Assignee: Wangda Tan

According to the design doc attached to YARN-3214, we need to implement the following logic in CapacityScheduler:
1) When allocating a resource request with no node label specified, it should preferentially be allocated to nodes without labels.
2) When there is available resource on a node with a label, it can be used by applications in the following order:
- Applications under queues which can access the label and ask for the same labeled resource.
- Applications under queues which can access the label and ask for non-labeled resource.
- Applications under queues which cannot access the label and ask for non-labeled resource.
3) Expose necessary information that can be used by the preemption policy to make preemption decisions.
[jira] [Updated] (YARN-3361) CapacityScheduler side changes to support non-exclusive node labels
[ https://issues.apache.org/jira/browse/YARN-3361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wangda Tan updated YARN-3361:
-----------------------------

Description:
According to the design doc attached to YARN-3214, we need to implement the following logic in CapacityScheduler:
1) When allocating a resource request with no node label specified, it should preferentially be allocated to nodes without labels.
2) When there is available resource on a node with a label, it can be used by applications in the following order:
- Applications under queues which can access the label and ask for the same labeled resource.
- Applications under queues which can access the label and ask for non-labeled resource.
- Applications under queues which cannot access the label and ask for non-labeled resource.
3) Expose necessary information that can be used by the preemption policy to make preemption decisions.

was: Reference to the design doc attached in YARN-3214; this is the CapacityScheduler side of changes to support non-exclusive node labels.

CapacityScheduler side changes to support non-exclusive node labels
--------------------------------------------------------------------

Key: YARN-3361
URL: https://issues.apache.org/jira/browse/YARN-3361
Project: Hadoop YARN
Issue Type: Sub-task
Components: api, client, resourcemanager
Reporter: Wangda Tan
Assignee: Wangda Tan

According to the design doc attached to YARN-3214, we need to implement the following logic in CapacityScheduler:
1) When allocating a resource request with no node label specified, it should preferentially be allocated to nodes without labels.
2) When there is available resource on a node with a label, it can be used by applications in the following order:
- Applications under queues which can access the label and ask for the same labeled resource.
- Applications under queues which can access the label and ask for non-labeled resource.
- Applications under queues which cannot access the label and ask for non-labeled resource.
3) Expose necessary information that can be used by the preemption policy to make preemption decisions.
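Purely as an illustration of the sharing order in the description (a toy, not scheduler code), here is one way to rank who may use idle labeled capacity; the case of an inaccessible queue asking for the label itself is rejected elsewhere and is not modeled.

{code}
public class LabelShareOrderSketch {
  enum SharePriority {
    ACCESSIBLE_QUEUE_LABELED_REQUEST,    // asks for the same labeled resource
    ACCESSIBLE_QUEUE_UNLABELED_REQUEST,  // can access the label, asks unlabeled
    INACCESSIBLE_QUEUE_UNLABELED_REQUEST // cannot access the label, asks unlabeled
  }

  static SharePriority rank(boolean queueCanAccessLabel,
      boolean requestAsksForLabel) {
    if (queueCanAccessLabel && requestAsksForLabel) {
      return SharePriority.ACCESSIBLE_QUEUE_LABELED_REQUEST;
    }
    return queueCanAccessLabel
        ? SharePriority.ACCESSIBLE_QUEUE_UNLABELED_REQUEST
        : SharePriority.INACCESSIBLE_QUEUE_UNLABELED_REQUEST;
  }

  public static void main(String[] args) {
    System.out.println(rank(true, true));   // highest priority
    System.out.println(rank(false, false)); // lowest priority
  }
}
{code}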
[jira] [Updated] (YARN-3356) Capacity Scheduler FiCaSchedulerApp should use ResourceUsage to track used-resources-by-label.
[ https://issues.apache.org/jira/browse/YARN-3356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wangda Tan updated YARN-3356:
-----------------------------

Description:
According to the design doc attached to YARN-3214, we need to implement the following logic in CapacityScheduler:
1) When allocating a resource request with no node label specified, it should preferentially be allocated to nodes without labels.
2) When there is available resource on a node with a label, it can be used by applications in the following order:
- Applications under queues which can access the label and ask for the same labeled resource.
- Applications under queues which can access the label and ask for non-labeled resource.
- Applications under queues which cannot access the label and ask for non-labeled resource.
3) Expose necessary information that can be used by the preemption policy to make preemption decisions.

was:
Similar to YARN-3099, Capacity Scheduler's LeafQueue.User/FiCaSchedulerApp should use ResourceRequest to track resource-usage/pending by label for better resource tracking and preemption. Also, when an application's pending resource changes (container allocated, app completed, moved, etc.), we need to update the ResourceUsage of the queue hierarchies.

Capacity Scheduler FiCaSchedulerApp should use ResourceUsage to track used-resources-by-label.
-----------------------------------------------------------------------------------------------

Key: YARN-3356
URL: https://issues.apache.org/jira/browse/YARN-3356
Project: Hadoop YARN
Issue Type: Sub-task
Components: capacityscheduler, resourcemanager
Reporter: Wangda Tan
Assignee: Wangda Tan
Attachments: YARN-3356.1.patch

According to the design doc attached to YARN-3214, we need to implement the following logic in CapacityScheduler:
1) When allocating a resource request with no node label specified, it should preferentially be allocated to nodes without labels.
2) When there is available resource on a node with a label, it can be used by applications in the following order:
- Applications under queues which can access the label and ask for the same labeled resource.
- Applications under queues which can access the label and ask for non-labeled resource.
- Applications under queues which cannot access the label and ask for non-labeled resource.
3) Expose necessary information that can be used by the preemption policy to make preemption decisions.
[jira] [Updated] (YARN-3356) Capacity Scheduler FiCaSchedulerApp should use ResourceUsage to track used-resources-by-label.
[ https://issues.apache.org/jira/browse/YARN-3356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wangda Tan updated YARN-3356:
-----------------------------

Description:
Similar to YARN-3099, Capacity Scheduler's LeafQueue.User/FiCaSchedulerApp should use ResourceRequest to track resource-usage/pending by label for better resource tracking and preemption. Also, when an application's pending resource changes (container allocated, app completed, moved, etc.), we need to update the ResourceUsage of the queue hierarchies.

was:
According to the design doc attached to YARN-3214, we need to implement the following logic in CapacityScheduler:
1) When allocating a resource request with no node label specified, it should preferentially be allocated to nodes without labels.
2) When there is available resource on a node with a label, it can be used by applications in the following order:
- Applications under queues which can access the label and ask for the same labeled resource.
- Applications under queues which can access the label and ask for non-labeled resource.
- Applications under queues which cannot access the label and ask for non-labeled resource.
3) Expose necessary information that can be used by the preemption policy to make preemption decisions.

Capacity Scheduler FiCaSchedulerApp should use ResourceUsage to track used-resources-by-label.
-----------------------------------------------------------------------------------------------

Key: YARN-3356
URL: https://issues.apache.org/jira/browse/YARN-3356
Project: Hadoop YARN
Issue Type: Sub-task
Components: capacityscheduler, resourcemanager
Reporter: Wangda Tan
Assignee: Wangda Tan
Attachments: YARN-3356.1.patch

Similar to YARN-3099, Capacity Scheduler's LeafQueue.User/FiCaSchedulerApp should use ResourceRequest to track resource-usage/pending by label for better resource tracking and preemption. Also, when an application's pending resource changes (container allocated, app completed, moved, etc.), we need to update the ResourceUsage of the queue hierarchies.
[jira] [Updated] (YARN-2556) Tool to measure the performance of the timeline server
[ https://issues.apache.org/jira/browse/YARN-2556?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Eagles updated YARN-2556:
----------------------------------

Attachment: YARN-2556.2.patch

Tool to measure the performance of the timeline server
-------------------------------------------------------

Key: YARN-2556
URL: https://issues.apache.org/jira/browse/YARN-2556
Project: Hadoop YARN
Issue Type: Sub-task
Components: timelineserver
Reporter: Jonathan Eagles
Assignee: Chang Li
Attachments: YARN-2556-WIP.patch, YARN-2556-WIP.patch, YARN-2556.1.patch, YARN-2556.2.patch, YARN-2556.patch, yarn2556.patch, yarn2556.patch, yarn2556_wip.patch

We need to be able to understand the capacity model for the timeline server to give users the tools they need to deploy a timeline server with the correct capacity. I propose we create a MapReduce job that can measure timeline server write and read performance. Transactions per second and I/O for both read and write would be a good start. This could be done as an example or test job that could be tied into gridmix.
[jira] [Commented] (YARN-3326) ReST support for getLabelsToNodes
[ https://issues.apache.org/jira/browse/YARN-3326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14366120#comment-14366120 ] Naganarasimha G R commented on YARN-3326: - Hi [~vvasudev], [~wangda], I was not able to come up with anything better than /label-mappings?label=label1,label2,...; please let me know if this is OK and I will modify the patch, otherwise please provide more options... P.S. /nodes is already used for getNodes ReST support for getLabelsToNodes -- Key: YARN-3326 URL: https://issues.apache.org/jira/browse/YARN-3326 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Affects Versions: 2.6.0 Reporter: Naganarasimha G R Assignee: Naganarasimha G R Priority: Minor Attachments: YARN-3326.20150310-1.patch REST support to retrieve the LabelsToNodes mapping -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3273) Improve web UI to facilitate scheduling analysis and debugging
[ https://issues.apache.org/jira/browse/YARN-3273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14366121#comment-14366121 ] Jian He commented on YARN-3273: --- The patch actually applies; not sure why Jenkins complains. Re-submitting the same patch. Improve web UI to facilitate scheduling analysis and debugging -- Key: YARN-3273 URL: https://issues.apache.org/jira/browse/YARN-3273 Project: Hadoop YARN Issue Type: Improvement Reporter: Jian He Assignee: Rohith Attachments: 0001-YARN-3273-v1.patch, 0001-YARN-3273-v2.patch, 0002-YARN-3273.patch, 0003-YARN-3273.patch, 0003-YARN-3273.patch, 0004-YARN-3273.patch, YARN-3273-am-resource-used-AND-User-limit-v2.PNG, YARN-3273-am-resource-used-AND-User-limit.PNG, YARN-3273-application-headroom-v2.PNG, YARN-3273-application-headroom.PNG Job may be stuck for reasons such as: - hitting queue capacity - hitting user-limit - hitting AM-resource-percentage The first queueCapacity is already shown on the UI. We may surface things like: - what is user's current usage and user-limit; - what is the AM resource usage and limit; - what is the application's current HeadRoom; -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (YARN-3360) Add JMX metrics to TimelineDataManager
[ https://issues.apache.org/jira/browse/YARN-3360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe reassigned YARN-3360: Assignee: Jason Lowe Add JMX metrics to TimelineDataManager -- Key: YARN-3360 URL: https://issues.apache.org/jira/browse/YARN-3360 Project: Hadoop YARN Issue Type: Improvement Components: timelineserver Affects Versions: 2.6.0 Reporter: Jason Lowe Assignee: Jason Lowe The TimelineDataManager currently has no metrics, outside of the standard JVM metrics. It would be very useful to at least log basic counts of method calls, time spent in those calls, and number of entities/events involved. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3189) Yarn application usage command should not give -appstate and -apptype
[ https://issues.apache.org/jira/browse/YARN-3189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated YARN-3189: --- Fix Version/s: (was: 3.0.0) Yarn application usage command should not give -appstate and -apptype - Key: YARN-3189 URL: https://issues.apache.org/jira/browse/YARN-3189 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.6.0 Reporter: Anushri Assignee: Anushri Priority: Minor Attachments: YARN-3189.patch, YARN-3189.patch The yarn application usage command should not give -appstate and -apptype, since these two are applicable to the --list command. *Can somebody please assign this issue to me?* -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3356) Capacity Scheduler FiCaSchedulerApp should use ResourceUsage to track used-resources-by-label.
[ https://issues.apache.org/jira/browse/YARN-3356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14366060#comment-14366060 ] Hadoop QA commented on YARN-3356: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12705148/YARN-3356.1.patch against trunk revision d884670. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 5 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:red}-1 findbugs{color}. The patch appears to introduce 5 new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/7001//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-YARN-Build/7001//artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html Console output: https://builds.apache.org/job/PreCommit-YARN-Build/7001//console This message is automatically generated. Capacity Scheduler FiCaSchedulerApp should use ResourceUsage to track used-resources-by-label. -- Key: YARN-3356 URL: https://issues.apache.org/jira/browse/YARN-3356 Project: Hadoop YARN Issue Type: Sub-task Components: capacityscheduler, resourcemanager Reporter: Wangda Tan Assignee: Wangda Tan Attachments: YARN-3356.1.patch Similar to YARN-3099, Capacity Scheduler's LeafQueue.User/FiCaSchedulerApp should use ResourceRequest to track resource-usage/pending by label for better resource tracking and preemption. Also, when an application's pending resource changes (container allocated, app completed, moved, etc.), we need to update the ResourceUsage of the queue hierarchy. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3034) [Aggregator wireup] Implement RM starting its ATS writer
[ https://issues.apache.org/jira/browse/YARN-3034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Naganarasimha G R updated YARN-3034: Attachment: YARN-3034.20150318-1.patch Hi [~djp], I have updated the patch with the yarn-default.xml updates, please review. [Aggregator wireup] Implement RM starting its ATS writer Key: YARN-3034 URL: https://issues.apache.org/jira/browse/YARN-3034 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Reporter: Sangjin Lee Assignee: Naganarasimha G R Attachments: YARN-3034-20150312-1.patch, YARN-3034.20150205-1.patch, YARN-3034.20150316-1.patch, YARN-3034.20150318-1.patch Per design in YARN-2928, implement resource managers starting their own ATS writers. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3305) AM-Used Resource for leafqueue is wrongly populated if AM ResourceRequest is less than minimumAllocation
[ https://issues.apache.org/jira/browse/YARN-3305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14366093#comment-14366093 ] Hudson commented on YARN-3305: -- FAILURE: Integrated in Hadoop-trunk-Commit #7352 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/7352/]) YARN-3305. Normalize AM resource request on app submission. Contributed by Rohith Sharmaks (jianhe: rev 968425e9f7b850ff9c2ab8ca37a64c3fdbe77dbf) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMAppManager.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestCapacityScheduler.java * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestClientRMService.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestAppManager.java AM-Used Resource for leafqueue is wrongly populated if AM ResourceRequest is less than minimumAllocation Key: YARN-3305 URL: https://issues.apache.org/jira/browse/YARN-3305 Project: Hadoop YARN Issue Type: Bug Components: scheduler Affects Versions: 2.6.0 Reporter: Rohith Assignee: Rohith Fix For: 2.8.0 Attachments: 0001-YARN-3305.patch, 0001-YARN-3305.patch, 0002-YARN-3305.patch, 0003-YARN-3305.patch For any given ResourceRequest, {{CS#allocate}} normalizes the request to minimumAllocation if the requested memory is less than minimumAllocation. But the AM-used resource is updated with the actual ResourceRequest made by the user. This results in AM container allocation exceeding the Max ApplicationMaster Resource. This is because AM-Used is updated with the actual ResourceRequest made by the user while activating the applications, but during container allocation the ResourceRequest is normalized to minimumAllocation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
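As a rough illustration of the normalization behavior the description refers to, here is a minimal sketch assuming memory-only resources and a round-up-to-increment rule similar to YARN's DefaultResourceCalculator; it is not the actual RMAppManager or scheduler code.
{code}
public class NormalizeSketch {
    // Round up to the nearest multiple of step (the minimum allocation).
    static int roundUp(int value, int step) {
        return ((value + step - 1) / step) * step;
    }

    // A request below minimumAllocation is raised to it, and everything is
    // capped at maximumAllocation. The bug above came from tracking AM usage
    // with the raw request instead of this normalized value.
    static int normalizeMemory(int requestedMb, int minAllocMb, int maxAllocMb) {
        int normalized = Math.max(roundUp(requestedMb, minAllocMb), minAllocMb);
        return Math.min(normalized, maxAllocMb);
    }

    public static void main(String[] args) {
        // e.g. a 100 MB AM request with minimumAllocation=1024 becomes 1024
        System.out.println(normalizeMemory(100, 1024, 8192));
    }
}
{code}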
[jira] [Updated] (YARN-3360) Add JMX metrics to TimelineDataManager
[ https://issues.apache.org/jira/browse/YARN-3360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated YARN-3360: - Attachment: YARN-3360.001.patch This adds basic, store-independent metrics to the TimelineDataManager to provide call counts, entity/event counts, and time-per-call averages. This also fixes a number of unit tests that weren't initializing the data manager properly. Add JMX metrics to TimelineDataManager -- Key: YARN-3360 URL: https://issues.apache.org/jira/browse/YARN-3360 Project: Hadoop YARN Issue Type: Improvement Components: timelineserver Affects Versions: 2.6.0 Reporter: Jason Lowe Assignee: Jason Lowe Attachments: YARN-3360.001.patch The TimelineDataManager currently has no metrics, outside of the standard JVM metrics. It would be very useful to at least log basic counts of method calls, time spent in those calls, and number of entities/events involved. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (YARN-3361) CapacityScheduler side changes to support non-exclusive node labels
Wangda Tan created YARN-3361: Summary: CapacityScheduler side changes to support non-exclusive node labels Key: YARN-3361 URL: https://issues.apache.org/jira/browse/YARN-3361 Project: Hadoop YARN Issue Type: Sub-task Reporter: Wangda Tan Assignee: Wangda Tan Referring to the design doc attached to YARN-3214, this covers the CapacityScheduler side changes to support non-exclusive node labels. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3345) Add non-exclusive node label RMAdmin CLI/API
[ https://issues.apache.org/jira/browse/YARN-3345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14366119#comment-14366119 ] Hadoop QA commented on YARN-3345: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12705175/YARN-3345.4.patch against trunk revision 968425e. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 5 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:red}-1 eclipse:eclipse{color}. The patch failed to build with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in . Test results: https://builds.apache.org/job/PreCommit-YARN-Build/7002//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/7002//console This message is automatically generated. Add non-exclusive node label RMAdmin CLI/API Key: YARN-3345 URL: https://issues.apache.org/jira/browse/YARN-3345 Project: Hadoop YARN Issue Type: Sub-task Components: api, client, resourcemanager Reporter: Wangda Tan Assignee: Wangda Tan Attachments: YARN-3345.1.patch, YARN-3345.2.patch, YARN-3345.3.patch, YARN-3345.4.patch As described in YARN-3214 (see the design doc attached to that JIRA), we need to add the non-exclusive node label RMAdmin API and CLI implementation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3273) Improve web UI to facilitate scheduling analysis and debugging
[ https://issues.apache.org/jira/browse/YARN-3273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jian He updated YARN-3273: -- Attachment: 0004-YARN-3273.patch Improve web UI to facilitate scheduling analysis and debugging -- Key: YARN-3273 URL: https://issues.apache.org/jira/browse/YARN-3273 Project: Hadoop YARN Issue Type: Improvement Reporter: Jian He Assignee: Rohith Attachments: 0001-YARN-3273-v1.patch, 0001-YARN-3273-v2.patch, 0002-YARN-3273.patch, 0003-YARN-3273.patch, 0003-YARN-3273.patch, 0004-YARN-3273.patch, YARN-3273-am-resource-used-AND-User-limit-v2.PNG, YARN-3273-am-resource-used-AND-User-limit.PNG, YARN-3273-application-headroom-v2.PNG, YARN-3273-application-headroom.PNG Job may be stuck for reasons such as: - hitting queue capacity - hitting user-limit, - hitting AM-resource-percentage The first queueCapacity is already shown on the UI. We may surface things like: - what is user's current usage and user-limit; - what is the AM resource usage and limit; - what is the application's current HeadRoom; -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (YARN-3362) Add node label usage in RM CapacityScheduler web UI
Wangda Tan created YARN-3362: Summary: Add node label usage in RM CapacityScheduler web UI Key: YARN-3362 URL: https://issues.apache.org/jira/browse/YARN-3362 Project: Hadoop YARN Issue Type: Sub-task Components: capacityscheduler, resourcemanager, webapp Reporter: Wangda Tan We don't have node label usage in the RM CapacityScheduler web UI now; without this, it is hard for users to understand what is happening on nodes that have labels assigned to them. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3362) Add node label usage in RM CapacityScheduler web UI
[ https://issues.apache.org/jira/browse/YARN-3362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14366191#comment-14366191 ] Wangda Tan commented on YARN-3362: -- My proposal is: For now, the RM CapacityScheduler UI looks like
{code}
+ root [==] 50% used
  + a [===] 75% used
    - a1 [=] 30% used
      ------------------------------
      | Queue Metrics Table        |
      |----------------------------|
      | metrics1 | value1          |
      | metrics2 | value2          |
      | metrics3 | value3          |
      | metrics4 | value4          |
      ------------------------------
  + b [...]
  + c [...]
{code}
We can add one more level above the queue hierarchy for the labels that can be accessed and/or are being used by the queues, which can look like
{code}
+ label_x [=] 30% used
  + root [=] 30% used
    + a [===] 75% used
      + a1 [=] 30% used
        -------------------------------------
        | Queue Metrics Table (For label_x) |
        |-----------------------------------|
        | metrics1 | value1                 |
        | metrics2 | value2                 |
        | metrics3 | value3                 |
        | metrics4 | value4                 |
        -------------------------------------
+ label_y
  + root [...]
  + ...
+ label_z
  + root [...]
  + ...
+ no_label
  + root [...]
  + ...
{code}
To make it backward compatible, when there is no label in the system, it will not show the label bar, and root is still the root queue. Please feel free to share your ideas on this! Add node label usage in RM CapacityScheduler web UI --- Key: YARN-3362 URL: https://issues.apache.org/jira/browse/YARN-3362 Project: Hadoop YARN Issue Type: Sub-task Components: capacityscheduler, resourcemanager, webapp Reporter: Wangda Tan We don't have node label usage in the RM CapacityScheduler web UI now; without this, it is hard for users to understand what is happening on nodes that have labels assigned to them. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3241) Leading space, trailing space and empty sub queue name may cause MetricsException for fair scheduler
[ https://issues.apache.org/jira/browse/YARN-3241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14364856#comment-14364856 ] Hadoop QA commented on YARN-3241: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12705023/YARN-3241.000.patch against trunk revision 48c2db3. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:red}-1 findbugs{color}. The patch appears to introduce 5 new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairScheduler Test results: https://builds.apache.org/job/PreCommit-YARN-Build/6997//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-YARN-Build/6997//artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6997//console This message is automatically generated. Leading space, trailing space and empty sub queue name may cause MetricsException for fair scheduler Key: YARN-3241 URL: https://issues.apache.org/jira/browse/YARN-3241 Project: Hadoop YARN Issue Type: Bug Components: scheduler Reporter: zhihai xu Assignee: zhihai xu Attachments: YARN-3241.000.patch Leading space, trailing space and empty sub queue names may cause a MetricsException (Metrics source XXX already exists!) when adding an application to FairScheduler. The reason is that QueueMetrics parses the queue name differently from the QueueManager. QueueMetrics uses Q_SPLITTER to parse the queue name; it removes leading and trailing spaces in sub queue names, and it also removes empty sub queue names. {code} static final Splitter Q_SPLITTER = Splitter.on('.').omitEmptyStrings().trimResults(); {code} But QueueManager won't remove leading spaces, trailing spaces or empty sub queue names. This causes FSQueue and FSQueueMetrics to go out of sync: QueueManager will think two queue names are different, so it will try to create a new queue, but FSQueueMetrics will treat the two names as the same queue, which raises the Metrics source XXX already exists! MetricsException. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
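To see the parsing mismatch concretely, a small standalone sketch (assuming Guava on the classpath) comparing Q_SPLITTER-style parsing with a naive split over the same queue name:
{code}
import com.google.common.base.Splitter;
import java.util.Arrays;

public class QueueNameParsing {
    static final Splitter Q_SPLITTER =
        Splitter.on('.').omitEmptyStrings().trimResults();

    public static void main(String[] args) {
        String name = "root. queueA ..sub";
        // QueueMetrics-style parsing: [root, queueA, sub]
        System.out.println(Q_SPLITTER.splitToList(name));
        // Naive parsing keeps the spaces and the empty component:
        // [root,  queueA , , sub]
        System.out.println(Arrays.asList(name.split("\\.")));
    }
}
{code}
The two results name different queues, which is exactly the FSQueue/FSQueueMetrics divergence described above.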
[jira] [Updated] (YARN-3197) Confusing log generated by CapacityScheduler
[ https://issues.apache.org/jira/browse/YARN-3197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Devaraj K updated YARN-3197: Hadoop Flags: Reviewed Confusing log generated by CapacityScheduler Key: YARN-3197 URL: https://issues.apache.org/jira/browse/YARN-3197 Project: Hadoop YARN Issue Type: Bug Components: capacityscheduler Affects Versions: 2.6.0 Reporter: Hitesh Shah Assignee: Varun Saxena Priority: Minor Attachments: YARN-3197.001.patch, YARN-3197.002.patch, YARN-3197.003.patch, YARN-3197.004.patch 2015-02-12 20:35:39,968 INFO capacity.CapacityScheduler (CapacityScheduler.java:completedContainer(1190)) - Null container completed... 2015-02-12 20:35:39,968 INFO capacity.CapacityScheduler (CapacityScheduler.java:completedContainer(1190)) - Null container completed... 2015-02-12 20:35:39,968 INFO capacity.CapacityScheduler (CapacityScheduler.java:completedContainer(1190)) - Null container completed... 2015-02-12 20:35:40,960 INFO capacity.CapacityScheduler (CapacityScheduler.java:completedContainer(1190)) - Null container completed... 2015-02-12 20:35:40,960 INFO capacity.CapacityScheduler (CapacityScheduler.java:completedContainer(1190)) - Null container completed... -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3197) Confusing log generated by CapacityScheduler
[ https://issues.apache.org/jira/browse/YARN-3197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Devaraj K updated YARN-3197: Target Version/s: 2.8.0 (was: 2.7.0) +1, latest patch looks good to me, will commit it shortly. Confusing log generated by CapacityScheduler Key: YARN-3197 URL: https://issues.apache.org/jira/browse/YARN-3197 Project: Hadoop YARN Issue Type: Bug Components: capacityscheduler Affects Versions: 2.6.0 Reporter: Hitesh Shah Assignee: Varun Saxena Priority: Minor Attachments: YARN-3197.001.patch, YARN-3197.002.patch, YARN-3197.003.patch, YARN-3197.004.patch 2015-02-12 20:35:39,968 INFO capacity.CapacityScheduler (CapacityScheduler.java:completedContainer(1190)) - Null container completed... 2015-02-12 20:35:39,968 INFO capacity.CapacityScheduler (CapacityScheduler.java:completedContainer(1190)) - Null container completed... 2015-02-12 20:35:39,968 INFO capacity.CapacityScheduler (CapacityScheduler.java:completedContainer(1190)) - Null container completed... 2015-02-12 20:35:40,960 INFO capacity.CapacityScheduler (CapacityScheduler.java:completedContainer(1190)) - Null container completed... 2015-02-12 20:35:40,960 INFO capacity.CapacityScheduler (CapacityScheduler.java:completedContainer(1190)) - Null container completed... -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-1453) [JDK8] Fix Javadoc errors caused by incorrect or illegal tags in doc comments
[ https://issues.apache.org/jira/browse/YARN-1453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14364917#comment-14364917 ] Hudson commented on YARN-1453: -- FAILURE: Integrated in Hadoop-Yarn-trunk #869 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/869/]) YARN-1453. [JDK8] Fix Javadoc errors caused by incorrect or illegal tags in doc comments. Contributed by Akira AJISAKA, Andrew Purtell, and Allen Wittenauer. (ozawa: rev 3da9a97cfbcc3a1c50aaf85b1a129d4d269cd5fd) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/SchedulerNode.java * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/api/protocolrecords/RegisterNodeManagerRequest.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/ApplicationClientProtocol.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/records/ReservationRequest.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/protocolrecords/GetContainerStatusesResponse.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-registry/src/main/java/org/apache/hadoop/registry/client/binding/RegistryUtils.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/api/records/NodeHealthStatus.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-web-proxy/src/main/java/org/apache/hadoop/yarn/server/webproxy/ProxyUriUtils.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/records/ContainerStatus.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/webapp/WebApps.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/protocolrecords/FinishApplicationMasterResponse.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/protocolrecords/GetClusterMetricsResponse.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/protocolrecords/StartContainerRequest.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/records/ApplicationSubmissionContext.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/records/ContainerReport.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-registry/src/main/java/org/apache/hadoop/registry/client/impl/RegistryOperationsClient.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/protocolrecords/GetQueueInfoResponse.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/protocolrecords/RegisterApplicationMasterResponse.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/records/PreemptionMessage.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/util/StringHelper.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/ApplicationBaseProtocol.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/server/security/ApplicationACLsManager.java * 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/api/NMTokenCache.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/nodelabels/CommonNodeLabelsManager.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/protocolrecords/GetQueueInfoRequest.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/records/QueueInfo.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/RMAppAttempt.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/api/AHSClient.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/protocolrecords/GetApplicationsRequest.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/records/ReservationRequestInterpreter.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/util/NodeManagerHardwareUtils.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/protocolrecords/AllocateRequest.java *
[jira] [Commented] (YARN-2777) Mark the end of individual log in aggregated log
[ https://issues.apache.org/jira/browse/YARN-2777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14364860#comment-14364860 ] Varun Saxena commented on YARN-2777: [~tedyu], previous test failures were unrelated. Kindly review. Mark the end of individual log in aggregated log Key: YARN-2777 URL: https://issues.apache.org/jira/browse/YARN-2777 Project: Hadoop YARN Issue Type: Improvement Reporter: Ted Yu Assignee: Varun Saxena Labels: log-aggregation Attachments: YARN-2777.001.patch, YARN-2777.02.patch Below is a snippet of an aggregated log showing the hbase master log:
{code}
LogType: hbase-hbase-master-ip-172-31-34-167.log
LogUploadTime: 29-Oct-2014 22:31:55
LogLength: 24103045
Log Contents:
Wed Oct 29 15:43:57 UTC 2014 Starting master on ip-172-31-34-167
...
at org.apache.hadoop.hbase.master.cleaner.CleanerChore.chore(CleanerChore.java:124)
at org.apache.hadoop.hbase.Chore.run(Chore.java:80)
at java.lang.Thread.run(Thread.java:745)

LogType: hbase-hbase-master-ip-172-31-34-167.out
{code}
Since logs from various daemons are aggregated in one log file, it would be desirable to mark the end of one log before starting with the next, e.g. with a line such as: {code} End of LogType: hbase-hbase-master-ip-172-31-34-167.log {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
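A minimal sketch of the proposed end marker, assuming a writer that appends one container log after another; the method and parameter names are illustrative, not the actual AggregatedLogFormat API.
{code}
import java.io.DataOutputStream;
import java.io.IOException;

public class LogMarkerSketch {
    // Write one file's log followed by the proposed end marker, so a reader
    // can tell where one LogType stops and the next begins.
    static void writeLog(DataOutputStream out, String logType, byte[] contents)
            throws IOException {
        out.writeBytes("LogType: " + logType + "\n");
        out.writeBytes("LogLength: " + contents.length + "\n");
        out.writeBytes("Log Contents:\n");
        out.write(contents);
        out.writeBytes("\nEnd of LogType: " + logType + "\n"); // proposed marker
    }
}
{code}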
[jira] [Commented] (YARN-3111) Fix ratio problem on FairScheduler page
[ https://issues.apache.org/jira/browse/YARN-3111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14364815#comment-14364815 ] Peng Zhang commented on YARN-3111: -- I think overlay is not a good choice. Currently the scheduler bar is already an overlay of steady share, instantaneous share and max resources; overlaying two dimensions of resources on top of that may generate 2 * 3 elements. If so, it would be too cluttered even before new resource types are added. When testing this patch in our cluster, I found a new issue with some abnormal configurations: a queue's bar width is decided by (queue steady resource / cluster resource), and the queue's usage width is decided by (queue's usage resource / cluster resource). For these two percentage computations, the dominant resource may be different, so the two percentages are in different dimensions, which causes confusion. To work around this, we made the queue's steady share proportional to the root queue's share in each resource dimension, so the first percentage (queue steady resource / cluster resource) is the same across resources and does not cause confusion. I think the deeper problem is that FS can configure CPU and memory separately (e.g. min resources, max resources), which makes resources non-proportional between queues while the UI still needs a percentage view. Fix ratio problem on FairScheduler page --- Key: YARN-3111 URL: https://issues.apache.org/jira/browse/YARN-3111 Project: Hadoop YARN Issue Type: Bug Components: fairscheduler Affects Versions: 2.6.0 Reporter: Peng Zhang Assignee: Peng Zhang Priority: Minor Attachments: YARN-3111.1.patch, YARN-3111.png Found 3 problems on the FairScheduler page: 1. Only memory is computed for the ratio, even when the queue's schedulingPolicy is DRF. 2. When min resources is configured larger than real resources, the steady fair share bar is so long that it extends off the page. 3. When cluster resources are 0 (no NodeManager started), the ratio is displayed as NaN% used. The attached image shows a snapshot of the above problems. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3197) Confusing log generated by CapacityScheduler
[ https://issues.apache.org/jira/browse/YARN-3197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14364898#comment-14364898 ] Varun Saxena commented on YARN-3197: Thanks [~devaraj.k] for the commit and review. [~leftnoteasy] and [~vinodkv], thanks for your comments. Confusing log generated by CapacityScheduler Key: YARN-3197 URL: https://issues.apache.org/jira/browse/YARN-3197 Project: Hadoop YARN Issue Type: Bug Components: capacityscheduler Affects Versions: 2.6.0 Reporter: Hitesh Shah Assignee: Varun Saxena Priority: Minor Fix For: 2.8.0 Attachments: YARN-3197.001.patch, YARN-3197.002.patch, YARN-3197.003.patch, YARN-3197.004.patch 2015-02-12 20:35:39,968 INFO capacity.CapacityScheduler (CapacityScheduler.java:completedContainer(1190)) - Null container completed... 2015-02-12 20:35:39,968 INFO capacity.CapacityScheduler (CapacityScheduler.java:completedContainer(1190)) - Null container completed... 2015-02-12 20:35:39,968 INFO capacity.CapacityScheduler (CapacityScheduler.java:completedContainer(1190)) - Null container completed... 2015-02-12 20:35:40,960 INFO capacity.CapacityScheduler (CapacityScheduler.java:completedContainer(1190)) - Null container completed... 2015-02-12 20:35:40,960 INFO capacity.CapacityScheduler (CapacityScheduler.java:completedContainer(1190)) - Null container completed... -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3197) Confusing log generated by CapacityScheduler
[ https://issues.apache.org/jira/browse/YARN-3197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14364906#comment-14364906 ] Hudson commented on YARN-3197: -- FAILURE: Integrated in Hadoop-trunk-Commit #7347 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/7347/]) YARN-3197. Confusing log generated by CapacityScheduler. Contributed by (devaraj: rev 7179f94f9d000fc52bd9ce5aa9741aba97ec3ee8) * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java Confusing log generated by CapacityScheduler Key: YARN-3197 URL: https://issues.apache.org/jira/browse/YARN-3197 Project: Hadoop YARN Issue Type: Bug Components: capacityscheduler Affects Versions: 2.6.0 Reporter: Hitesh Shah Assignee: Varun Saxena Priority: Minor Fix For: 2.8.0 Attachments: YARN-3197.001.patch, YARN-3197.002.patch, YARN-3197.003.patch, YARN-3197.004.patch 2015-02-12 20:35:39,968 INFO capacity.CapacityScheduler (CapacityScheduler.java:completedContainer(1190)) - Null container completed... 2015-02-12 20:35:39,968 INFO capacity.CapacityScheduler (CapacityScheduler.java:completedContainer(1190)) - Null container completed... 2015-02-12 20:35:39,968 INFO capacity.CapacityScheduler (CapacityScheduler.java:completedContainer(1190)) - Null container completed... 2015-02-12 20:35:40,960 INFO capacity.CapacityScheduler (CapacityScheduler.java:completedContainer(1190)) - Null container completed... 2015-02-12 20:35:40,960 INFO capacity.CapacityScheduler (CapacityScheduler.java:completedContainer(1190)) - Null container completed... -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3344) procfs stat file is not in the expected format warning
[ https://issues.apache.org/jira/browse/YARN-3344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ravindra Kumar Naik updated YARN-3344: -- Attachment: (was: YARN-3344-branch-2.6.0.001.patch) procfs stat file is not in the expected format warning -- Key: YARN-3344 URL: https://issues.apache.org/jira/browse/YARN-3344 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.6.0 Reporter: Jon Bringhurst Attachments: YARN-3344-branch-trunk.001.patch, YARN-3344-branch-trunk.002.patch, YARN-3344-branch-trunk.003.patch Although this doesn't appear to be causing any functional issues, it is spamming our log files quite a bit. :) It appears that the regex in ProcfsBasedProcessTree doesn't work for all /proc/pid/stat files. Here's the error I'm seeing: {noformat} source_host: asdf, method: constructProcessInfo, level: WARN, message: Unexpected: procfs stat file is not in the expected format for process with pid 6953 file: ProcfsBasedProcessTree.java, line_number: 514, class: org.apache.hadoop.yarn.util.ProcfsBasedProcessTree, {noformat} And here's the basic info on process with pid 6953: {noformat} [asdf ~]$ cat /proc/6953/stat 6953 (python2.6 /expo) S 1871 1871 1871 0 -1 4202496 9364 1080 0 0 25 3 0 0 20 0 1 0 144918696 205295616 5856 18446744073709551615 1 1 0 0 0 0 0 16781312 2 18446744073709551615 0 0 17 13 0 0 0 0 0 [asdf ~]$ ps aux|grep 6953 root 6953 0.0 0.0 200484 23424 ?S21:44 0:00 python2.6 /export/apps/salt/minion-scripts/module-sync.py jbringhu 13481 0.0 0.0 105312 872 pts/0S+ 22:13 0:00 grep -i 6953 [asdf ~]$ {noformat} This is using 2.6.32-431.11.2.el6.x86_64 in RHEL 6.5. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3344) procfs stat file is not in the expected format warning
[ https://issues.apache.org/jira/browse/YARN-3344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ravindra Kumar Naik updated YARN-3344: -- Attachment: (was: YARN-3344-branch-2.6.0.003.patch) procfs stat file is not in the expected format warning -- Key: YARN-3344 URL: https://issues.apache.org/jira/browse/YARN-3344 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.6.0 Reporter: Jon Bringhurst Attachments: YARN-3344-branch-trunk.001.patch, YARN-3344-branch-trunk.002.patch, YARN-3344-branch-trunk.003.patch Although this doesn't appear to be causing any functional issues, it is spamming our log files quite a bit. :) It appears that the regex in ProcfsBasedProcessTree doesn't work for all /proc/pid/stat files. Here's the error I'm seeing: {noformat} source_host: asdf, method: constructProcessInfo, level: WARN, message: Unexpected: procfs stat file is not in the expected format for process with pid 6953 file: ProcfsBasedProcessTree.java, line_number: 514, class: org.apache.hadoop.yarn.util.ProcfsBasedProcessTree, {noformat} And here's the basic info on process with pid 6953: {noformat} [asdf ~]$ cat /proc/6953/stat 6953 (python2.6 /expo) S 1871 1871 1871 0 -1 4202496 9364 1080 0 0 25 3 0 0 20 0 1 0 144918696 205295616 5856 18446744073709551615 1 1 0 0 0 0 0 16781312 2 18446744073709551615 0 0 17 13 0 0 0 0 0 [asdf ~]$ ps aux|grep 6953 root 6953 0.0 0.0 200484 23424 ?S21:44 0:00 python2.6 /export/apps/salt/minion-scripts/module-sync.py jbringhu 13481 0.0 0.0 105312 872 pts/0S+ 22:13 0:00 grep -i 6953 [asdf ~]$ {noformat} This is using 2.6.32-431.11.2.el6.x86_64 in RHEL 6.5. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3344) procfs stat file is not in the expected format warning
[ https://issues.apache.org/jira/browse/YARN-3344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ravindra Kumar Naik updated YARN-3344: -- Attachment: (was: YARN-3344-branch-2.6.0.002.patch) procfs stat file is not in the expected format warning -- Key: YARN-3344 URL: https://issues.apache.org/jira/browse/YARN-3344 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.6.0 Reporter: Jon Bringhurst Attachments: YARN-3344-branch-trunk.001.patch, YARN-3344-branch-trunk.002.patch, YARN-3344-branch-trunk.003.patch Although this doesn't appear to be causing any functional issues, it is spamming our log files quite a bit. :) It appears that the regex in ProcfsBasedProcessTree doesn't work for all /proc/pid/stat files. Here's the error I'm seeing: {noformat} source_host: asdf, method: constructProcessInfo, level: WARN, message: Unexpected: procfs stat file is not in the expected format for process with pid 6953 file: ProcfsBasedProcessTree.java, line_number: 514, class: org.apache.hadoop.yarn.util.ProcfsBasedProcessTree, {noformat} And here's the basic info on process with pid 6953: {noformat} [asdf ~]$ cat /proc/6953/stat 6953 (python2.6 /expo) S 1871 1871 1871 0 -1 4202496 9364 1080 0 0 25 3 0 0 20 0 1 0 144918696 205295616 5856 18446744073709551615 1 1 0 0 0 0 0 16781312 2 18446744073709551615 0 0 17 13 0 0 0 0 0 [asdf ~]$ ps aux|grep 6953 root 6953 0.0 0.0 200484 23424 ?S21:44 0:00 python2.6 /export/apps/salt/minion-scripts/module-sync.py jbringhu 13481 0.0 0.0 105312 872 pts/0S+ 22:13 0:00 grep -i 6953 [asdf ~]$ {noformat} This is using 2.6.32-431.11.2.el6.x86_64 in RHEL 6.5. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
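One robust way to handle such stat lines, sketched below as an illustrative alternative (not necessarily the patch's actual change): since the command field is the only parenthesized field, split on the last ')' instead of pattern-matching the name, because the name itself ("python2.6 /expo") can contain spaces.
{code}
public class ProcStatSketch {
    // Parse "<pid> (<comm>) <state> <ppid> ..." where <comm> may contain
    // spaces or slashes, as in "(python2.6 /expo)".
    static String[] parseStat(String line) {
        int open = line.indexOf('(');
        int close = line.lastIndexOf(')');
        String pid = line.substring(0, open).trim();
        String comm = line.substring(open + 1, close);
        String[] rest = line.substring(close + 1).trim().split("\\s+");
        // rest[0] is the state ("S"), rest[1] the ppid, and so on.
        return new String[] { pid, comm, rest[0], rest[1] };
    }

    public static void main(String[] args) {
        String stat = "6953 (python2.6 /expo) S 1871 1871 1871 0 -1 4202496";
        for (String field : parseStat(stat)) {
            System.out.println(field);
        }
    }
}
{code}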
[jira] [Commented] (YARN-3273) Improve web UI to facilitate scheduling analysis and debugging
[ https://issues.apache.org/jira/browse/YARN-3273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14364865#comment-14364865 ] Rohith commented on YARN-3273: -- All test failures are because of BindException, which is unrelated to this patch. May need to re-kick Jenkins. Improve web UI to facilitate scheduling analysis and debugging -- Key: YARN-3273 URL: https://issues.apache.org/jira/browse/YARN-3273 Project: Hadoop YARN Issue Type: Improvement Reporter: Jian He Assignee: Rohith Attachments: 0001-YARN-3273-v1.patch, 0001-YARN-3273-v2.patch, 0002-YARN-3273.patch, 0003-YARN-3273.patch, 0003-YARN-3273.patch, YARN-3273-am-resource-used-AND-User-limit-v2.PNG, YARN-3273-am-resource-used-AND-User-limit.PNG, YARN-3273-application-headroom-v2.PNG, YARN-3273-application-headroom.PNG Job may be stuck for reasons such as: - hitting queue capacity - hitting user-limit - hitting AM-resource-percentage The first queueCapacity is already shown on the UI. We may surface things like: - what is user's current usage and user-limit; - what is the AM resource usage and limit; - what is the application's current HeadRoom; -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3136) getTransferredContainers can be a bottleneck during AM registration
[ https://issues.apache.org/jira/browse/YARN-3136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14365604#comment-14365604 ] Sunil G commented on YARN-3136: --- Thank you [~jlowe] for pointing out. I will fix and upload a new patch. getTransferredContainers can be a bottleneck during AM registration --- Key: YARN-3136 URL: https://issues.apache.org/jira/browse/YARN-3136 Project: Hadoop YARN Issue Type: Sub-task Components: scheduler Affects Versions: 2.6.0 Reporter: Jason Lowe Assignee: Sunil G Attachments: 0001-YARN-3136.patch, 0002-YARN-3136.patch, 0003-YARN-3136.patch, 0004-YARN-3136.patch, 0005-YARN-3136.patch, 0006-YARN-3136.patch While examining RM stack traces on a busy cluster I noticed a pattern of AMs stuck waiting for the scheduler lock trying to call getTransferredContainers. The scheduler lock is highly contended, especially on a large cluster with many nodes heartbeating, and it would be nice if we could find a way to eliminate the need to grab this lock during this call. We've already done similar work during AM allocate calls to make sure they don't needlessly grab the scheduler lock, and it would be good to do so here as well, if possible. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3339) TestDockerContainerExecutor should pull a single image and not the entire centos repository
[ https://issues.apache.org/jira/browse/YARN-3339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14365626#comment-14365626 ] Ravindra Kumar Naik commented on YARN-3339: --- Thanks [~raviprakash] for reviewing and committing. TestDockerContainerExecutor should pull a single image and not the entire centos repository --- Key: YARN-3339 URL: https://issues.apache.org/jira/browse/YARN-3339 Project: Hadoop YARN Issue Type: Test Components: test Affects Versions: 2.6.0 Environment: Linux Reporter: Ravindra Kumar Naik Priority: Minor Fix For: 2.8.0 Attachments: YARN-3339-branch-2.6.0.001.patch, YARN-3339-trunk.001.patch The TestDockerContainerExecutor test pulls the entire centos repository, which is time consuming. Pulling a specific image (e.g. centos7) will be sufficient to run the test successfully and will save time. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (YARN-3360) Add JMX metrics to TimelineDataManager
Jason Lowe created YARN-3360: Summary: Add JMX metrics to TimelineDataManager Key: YARN-3360 URL: https://issues.apache.org/jira/browse/YARN-3360 Project: Hadoop YARN Issue Type: Improvement Components: timelineserver Affects Versions: 2.6.0 Reporter: Jason Lowe The TimelineDataManager currently has no metrics, outside of the standard JVM metrics. It would be very useful to at least log basic counts of method calls, time spent in those calls, and number of entities/events involved. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3110) Few issues in ApplicationHistory web ui
[ https://issues.apache.org/jira/browse/YARN-3110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14365736#comment-14365736 ] Naganarasimha G R commented on YARN-3110: - Hi [~zjshen] [~xgong], could one of you please review this JIRA? If it is fine, I will add some test cases for the 1st and 2nd issues listed above... Few issues in ApplicationHistory web ui --- Key: YARN-3110 URL: https://issues.apache.org/jira/browse/YARN-3110 Project: Hadoop YARN Issue Type: Sub-task Components: applications, timelineserver Affects Versions: 2.6.0 Reporter: Bibin A Chundatt Assignee: Naganarasimha G R Priority: Minor Attachments: YARN-3110.20150209-1.patch, YARN-3110.20150315-1.patch Application state and History link are wrong when the application is in unassigned state. 1. Configure capacity scheduler with queue size as 1 and max Absolute Max Capacity: 10.0% (the current application state is Accepted and Unassigned from the resource manager side). 2. Submit an application to the queue and check the state and link in Application History. State = null and the History link is shown as N/A on the applicationhistory page. Kill the same application. In the timeline server logs, the below is shown when selecting the application link. {quote} 2015-01-29 15:39:50,956 ERROR org.apache.hadoop.yarn.webapp.View: Failed to read the AM container of the application attempt appattempt_1422467063659_0007_01. java.lang.NullPointerException at org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryManagerOnTimelineStore.getContainer(ApplicationHistoryManagerOnTimelineStore.java:162) at org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryManagerOnTimelineStore.getAMContainer(ApplicationHistoryManagerOnTimelineStore.java:184) at org.apache.hadoop.yarn.server.webapp.AppBlock$3.run(AppBlock.java:160) at org.apache.hadoop.yarn.server.webapp.AppBlock$3.run(AppBlock.java:157) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628) at org.apache.hadoop.yarn.server.webapp.AppBlock.render(AppBlock.java:156) at org.apache.hadoop.yarn.webapp.view.HtmlBlock.render(HtmlBlock.java:67) at org.apache.hadoop.yarn.webapp.view.HtmlBlock.renderPartial(HtmlBlock.java:77) at org.apache.hadoop.yarn.webapp.View.render(View.java:235) at org.apache.hadoop.yarn.webapp.view.HtmlPage$Page.subView(HtmlPage.java:49) at org.apache.hadoop.yarn.webapp.hamlet.HamletImpl$EImp._v(HamletImpl.java:117) at org.apache.hadoop.yarn.webapp.hamlet.Hamlet$TD._(Hamlet.java:845) at org.apache.hadoop.yarn.webapp.view.TwoColumnLayout.render(TwoColumnLayout.java:56) at org.apache.hadoop.yarn.webapp.view.HtmlPage.render(HtmlPage.java:82) at org.apache.hadoop.yarn.webapp.Controller.render(Controller.java:212) at org.apache.hadoop.yarn.server.applicationhistoryservice.webapp.AHSController.app(AHSController.java:38) at sun.reflect.GeneratedMethodAccessor63.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.yarn.webapp.Dispatcher.service(Dispatcher.java:153) at javax.servlet.http.HttpServlet.service(HttpServlet.java:820) at com.google.inject.servlet.ServletDefinition.doService(ServletDefinition.java:263) at com.google.inject.servlet.ServletDefinition.service(ServletDefinition.java:178) at com.google.inject.servlet.ManagedServletPipeline.service(ManagedServletPipeline.java:91) 
at com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:62) at com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:900) at com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:834) at com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:795) at com.google.inject.servlet.FilterDefinition.doFilter(FilterDefinition.java:163) at com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:58) at com.google.inject.servlet.ManagedFilterPipeline.dispatch(ManagedFilterPipeline.java:118) at com.google.inject.servlet.GuiceFilter.doFilter(GuiceFilter.java:113) at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212) at
[jira] [Updated] (YARN-3356) Capacity Scheduler FiCaSchedulerApp should use ResourceUsage to track used-resources-by-label.
[ https://issues.apache.org/jira/browse/YARN-3356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-3356: - Attachment: YARN-3356.1.patch Attached ver.1 patch. Capacity Scheduler FiCaSchedulerApp should use ResourceUsage to track used-resources-by-label. -- Key: YARN-3356 URL: https://issues.apache.org/jira/browse/YARN-3356 Project: Hadoop YARN Issue Type: Sub-task Components: capacityscheduler, resourcemanager Reporter: Wangda Tan Assignee: Wangda Tan Attachments: YARN-3356.1.patch Similar to YARN-3099, Capacity Scheduler's LeafQueue.User/FiCaSchedulerApp should use ResourceRequest to track resource-usage/pending by label for better resource tracking and preemption. Also, when an application's pending resource changes (container allocated, app completed, moved, etc.), we need to update the ResourceUsage of the queue hierarchy. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Closed] (YARN-3339) TestDockerContainerExecutor should pull a single image and not the entire centos repository
[ https://issues.apache.org/jira/browse/YARN-3339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ravi Prakash closed YARN-3339. -- TestDockerContainerExecutor should pull a single image and not the entire centos repository --- Key: YARN-3339 URL: https://issues.apache.org/jira/browse/YARN-3339 Project: Hadoop YARN Issue Type: Test Components: test Affects Versions: 2.6.0 Environment: Linux Reporter: Ravindra Kumar Naik Priority: Minor Fix For: 2.8.0 Attachments: YARN-3339-branch-2.6.0.001.patch, YARN-3339-trunk.001.patch TestDockerContainerExecutor test pulls the entire centos repository which is time consuming. Pulling a specific image (e.g. centos7) will be sufficient to run the test successfully and will save time -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3204) Fix new findbug warnings in hadoop-yarn-server-resourcemanager(resourcemanager.scheduler.fair)
[ https://issues.apache.org/jira/browse/YARN-3204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14365821#comment-14365821 ] zhihai xu commented on YARN-3204: - Some comments: 1. About the comment for the inconsistent sync warning on fsOpDurations: {code} Inconsistent sync warning - callDurationMetrics is only initialized once and never changed {code} That does not look accurate. Each method of fsOpDurations is only called in one thread, and all these methods access different fields and are independent. 2. Can we define reloadListener as volatile? Since reloadListener is accessed by two threads, it would be safer to use volatile. 3. Can we move the check to the beginning of reserveResource? It is better to catch the error earlier rather than later, to avoid unnecessary operations. {code} if (!(application instanceof FSAppAttempt)) { {code} Can we use YarnRuntimeException instead of IllegalArgumentException? This looks like an unexpected runtime exception. 4. Adding a lock for getAllocationConfiguration is dangerous. A lot of code (Queue, FairReservationSystem, ...) calls getAllocationConfiguration, which can introduce potential deadlock situations and performance issues. For example, QueueManager#getQueue locks queues and then calls QueueManager#createQueue, which calls scheduler.getAllocationConfiguration; this would hold two layers of locks if we add a lock in getAllocationConfiguration. Can we define allocConf as volatile? allocConf will only be updated by AllocationReloadListener.onReload, which is called from the AllocationFileLoaderService#reloadThread after initialization. Fix new findbug warnings in hadoop-yarn-server-resourcemanager(resourcemanager.scheduler.fair) -- Key: YARN-3204 URL: https://issues.apache.org/jira/browse/YARN-3204 Project: Hadoop YARN Issue Type: Bug Reporter: Brahma Reddy Battula Assignee: Brahma Reddy Battula Priority: Blocker Attachments: YARN-3204-001.patch, YARN-3204-002.patch, YARN-3204-003.patch Please check the following findbugs report: https://builds.apache.org/job/PreCommit-YARN-Build/6644//artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html -- This message was sent by Atlassian JIRA (v6.3.4#6332)
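To illustrate point 4, a minimal sketch of the volatile-reference pattern being suggested (hypothetical class and field names, not the actual FairScheduler code): the reload thread swaps in a fully built configuration object, and readers see either the old or the new snapshot without taking the scheduler lock.
{code}
public class ConfigHolder {
    // Immutable snapshot of the allocation configuration (illustrative).
    static final class AllocationConf {
        final int queueMaxApps;
        AllocationConf(int queueMaxApps) { this.queueMaxApps = queueMaxApps; }
    }

    // volatile guarantees readers observe a fully constructed AllocationConf
    // once the reload thread publishes it; reads need no lock.
    private volatile AllocationConf allocConf = new AllocationConf(50);

    AllocationConf getAllocationConfiguration() {
        return allocConf; // lock-free read
    }

    // Called only from the reload thread (cf. AllocationFileLoaderService).
    void onReload(AllocationConf newConf) {
        allocConf = newConf; // atomic reference swap
    }
}
{code}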
[jira] [Commented] (YARN-3243) CapacityScheduler should pass headroom from parent to children to make sure ParentQueue obey its capacity limits.
[ https://issues.apache.org/jira/browse/YARN-3243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14365601#comment-14365601 ] Hudson commented on YARN-3243: -- FAILURE: Integrated in Hadoop-trunk-Commit #7349 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/7349/]) YARN-3243. CapacityScheduler should pass headroom from parent to children to make sure ParentQueue obey its capacity limits. Contributed by Wangda Tan. (jianhe: rev 487374b7fe0c92fc7eb1406c568952722b5d5b15) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestCapacityScheduler.java * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestLeafQueue.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/AbstractCSQueue.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestApplicationLimits.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestReservations.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestParentQueue.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CSQueue.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/common/fica/FiCaSchedulerApp.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestChildQueueOrder.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.java CapacityScheduler should pass headroom from parent to children to make sure ParentQueue obey its capacity limits. - Key: YARN-3243 URL: https://issues.apache.org/jira/browse/YARN-3243 Project: Hadoop YARN Issue Type: Bug Components: capacityscheduler, resourcemanager Reporter: Wangda Tan Assignee: Wangda Tan Fix For: 2.8.0 Attachments: YARN-3243.1.patch, YARN-3243.2.patch, YARN-3243.3.patch, YARN-3243.4.patch, YARN-3243.5.patch Now CapacityScheduler has some issues in making sure ParentQueue always obeys its capacity limits, for example: 1) When allocating a container under a parent queue, it will only check parentQueue.usage < parentQueue.max.
If a leaf queue allocates a container with size > (parentQueue.max - parentQueue.usage), the parent queue can exceed its max resource limit, as in the following example:
{code}
        A (usage=54, max=55)
       /                 \
      A1                  A2
(usage=53, max=53)  (usage=1, max=55)
{code}
Queue-A2 is able to allocate a container since its usage < max, but if we do that, A's usage can exceed A.max.
2) When doing the continuous reservation check, the parent queue will only tell its children "you need to unreserve *some* resource, so that I will be less than my maximum resource", but it will not tell them how much resource needs to be unreserved. This may lead to the parent queue exceeding its configured maximum capacity as well.
With YARN-3099/YARN-3124, we now have the {{ResourceUsage}} class in each queue. *Here is my proposal*:
- ParentQueue will set its children's ResourceUsage.headroom, which means the *maximum resource its children can allocate*.
- ParentQueue will set its children's headroom to (saying the parent's name is qA): min(qA.headroom, qA.max - qA.used). This will make sure qA's ancestors' capacity is enforced as well (qA.headroom is set by qA's parent).
- {{needToUnReserve}} is not necessary; instead, children can get how much resource need
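A minimal sketch of the headroom rule in the proposal above, assuming memory-only resources for simplicity (the real patch works with the Resource/ResourceUsage types):
{code}
public class HeadroomSketch {
    // childHeadroom = min(parent's headroom, parent.max - parent.used),
    // so every ancestor's limit is enforced as the value propagates down.
    static long childHeadroom(long parentHeadroom, long parentMax, long parentUsed) {
        return Math.min(parentHeadroom, parentMax - parentUsed);
    }

    public static void main(String[] args) {
        // From the example above: A has usage=54, max=55, so any child of A
        // gets headroom min(parentHeadroom, 55 - 54) = 1, even if the child's
        // own limit (e.g. A2: usage=1, max=55) would allow much more.
        System.out.println(childHeadroom(100, 55, 54)); // prints 1
    }
}
{code}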
[jira] [Commented] (YARN-3341) Fix findbugs warning: BC_UNCONFIRMED_CAST at FSSchedulerNode.reserveResource
[ https://issues.apache.org/jira/browse/YARN-3341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14365698#comment-14365698 ] zhihai xu commented on YARN-3341: - Sorry, I missed YARN-3204; I resolved this issue as a duplicate. I will review the patch at YARN-3204. [~brahmareddy], thanks for pointing this out. Fix findbugs warning: BC_UNCONFIRMED_CAST at FSSchedulerNode.reserveResource --- Key: YARN-3341 URL: https://issues.apache.org/jira/browse/YARN-3341 Project: Hadoop YARN Issue Type: Improvement Reporter: zhihai xu Assignee: zhihai xu Priority: Minor Labels: findbugs Attachments: YARN-3341.000.patch, YARN-3341.001.patch Fix findbugs warning: BC_UNCONFIRMED_CAST at FSSchedulerNode.reserveResource The warning message is {code} Unchecked/unconfirmed cast from org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt to org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttempt in org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSSchedulerNode.reserveResource(SchedulerApplicationAttempt, Priority, RMContainer) {code} The code which causes the warning is {code} this.reservedAppSchedulable = (FSAppAttempt) application; {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
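For reference, the standard remedy for a BC_UNCONFIRMED_CAST warning is to confirm the runtime type before casting. A hedged sketch only (the real fix was taken up in YARN-3204, and may differ):
{code}
if (application instanceof FSAppAttempt) {
  this.reservedAppSchedulable = (FSAppAttempt) application;
} else {
  // fail fast instead of letting an unexpected attempt type slip through
  throw new IllegalArgumentException("Expected FSAppAttempt but got "
      + application.getClass().getName());
}
{code}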
[jira] [Updated] (YARN-3356) Capacity Scheduler FiCaSchedulerApp should use ResourceUsage to track used-resources-by-label.
[ https://issues.apache.org/jira/browse/YARN-3356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-3356: - Description: Similar to YARN-3099, Capacity Scheduler's LeafQueue.User/FiCaSchedulerApp should use ResourceUsage to track resource-usage/pending by label for better resource tracking and preemption. And also, when an application's pending resource changes (container allocated, app completed, moved, etc.), we need to update the ResourceUsage of the queue hierarchies. was: Similar to YARN-3099, Capacity Scheduler's LeafQueue.User/FiCaSchedulerApp should use ResourceUsage to track resource-usage/pending by label for better resource tracking and preemption. Capacity Scheduler FiCaSchedulerApp should use ResourceUsage to track used-resources-by-label. -- Key: YARN-3356 URL: https://issues.apache.org/jira/browse/YARN-3356 Project: Hadoop YARN Issue Type: Sub-task Components: capacityscheduler, resourcemanager Reporter: Wangda Tan Assignee: Wangda Tan Similar to YARN-3099, Capacity Scheduler's LeafQueue.User/FiCaSchedulerApp should use ResourceUsage to track resource-usage/pending by label for better resource tracking and preemption. And also, when an application's pending resource changes (container allocated, app completed, moved, etc.), we need to update the ResourceUsage of the queue hierarchies. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
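To make the tracking concrete: a minimal sketch of per-label used-resource bookkeeping in the spirit of YARN-3099's ResourceUsage class, with illustrative names only (not the actual YARN-3356 patch):
{code}
import java.util.HashMap;
import java.util.Map;
import org.apache.hadoop.yarn.api.records.Resource;
import org.apache.hadoop.yarn.util.resource.Resources;

public class LabelUsage {
  private final Map<String, Resource> usedByLabel = new HashMap<>();

  // called when a container is allocated or released under a given label
  public synchronized void incUsed(String label, Resource delta) {
    Resource cur = usedByLabel.get(label);
    if (cur == null) {
      cur = Resources.createResource(0, 0);
      usedByLabel.put(label, cur);
    }
    Resources.addTo(cur, delta); // mutates the stored total in place
  }

  public synchronized Resource getUsed(String label) {
    Resource cur = usedByLabel.get(label);
    return cur == null ? Resources.none() : cur;
  }
}
{code}
The same structure would be updated bottom-up through the queue hierarchy whenever an application's used or pending resources change.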
[jira] [Commented] (YARN-3111) Fix ratio problem on FairScheduler page
[ https://issues.apache.org/jira/browse/YARN-3111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14365722#comment-14365722 ] Ashwin Shankar commented on YARN-3111: -- bq. What do you guys think of overlaying CPU and memory usage, the way steady and instantaneous fairshares are laid out today? I agree with Peng; it is going to become pretty cluttered displaying shares/max of each of the resources on the same bar. Also, if we go by this approach, it would be ambiguous which resource the usage bar is representing, since the usage bar shows usage of the dominant resource, i.e. max(memoryRatio, vCoresRatio). The usage bar turns orange when it's above fair share; if we represent all the resources in one bar, how do we know whether we are above fair share due to memory, CPU, or disk? Here is my proposal. For each queue bar: 1. Represent steady/instant/max of only the dominant resource in the bar. 2. Usage, as in the patch, will again be usage of the dominant resource. 3. In the tooltip, we mention which resource is the dominant one we are representing in that queue ([memory, cpu]). Note that the dominant resource displayed can be memory in one queue and something else in another. 4. We already display the steady/instant memory percentage in the tooltip; we could add the CPU % of steady/instant/max there as well, so that we know the details of each resource. Thoughts? Fix ratio problem on FairScheduler page --- Key: YARN-3111 URL: https://issues.apache.org/jira/browse/YARN-3111 Project: Hadoop YARN Issue Type: Bug Components: fairscheduler Affects Versions: 2.6.0 Reporter: Peng Zhang Assignee: Peng Zhang Priority: Minor Attachments: YARN-3111.1.patch, YARN-3111.png Found 3 problems on the FairScheduler page: 1. Only memory is computed for the ratio, even when the queue schedulingPolicy is DRF. 2. When min resources are configured larger than real resources, the steady fair share ratio bar is so long that it runs off the page. 3. When cluster resources are 0 (no NodeManager started), the ratio is displayed as NaN% used. The attached image shows a snapshot of the above problems. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
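The proposal hinges on the dominant-resource ratio the usage bar already shows. A small sketch of that computation, with plain numbers standing in for the page's queue/cluster resources; the zero guards also sidestep the NaN% problem from the description:
{code}
public class DominantResource {
  static float dominantResourceRatio(long usedMem, long clusterMem,
      long usedVcores, long clusterVcores) {
    float memRatio = clusterMem <= 0 ? 0f : (float) usedMem / clusterMem;
    float cpuRatio = clusterVcores <= 0 ? 0f : (float) usedVcores / clusterVcores;
    return Math.max(memRatio, cpuRatio); // the dominant resource drives the bar
  }
}
{code}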
[jira] [Commented] (YARN-3293) Track and display capacity scheduler health metrics in web UI
[ https://issues.apache.org/jira/browse/YARN-3293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14365765#comment-14365765 ] Vinod Kumar Vavilapalli commented on YARN-3293: --- +1 for doing this. We do not need to be specific to any scheduler. Had an offline discussion with [~vvasudev] and [~venkateshrin]. Here are a few things worth highlighting - Last Scheduling Cycle: Timestamp, Allocated Resource (MB, VCores), Reserved Resource, Released Resource - Number of Allocations so far - Last Allocation: Timestamp, Node, Container, Queue - Number of Resource releases so far - Last Resource Release: Timestamp, Container, Queue - Number of Preemptions so far - Last Preemption: Timestamp, Container, Queue - Number of fulfilled reservations so far. Fulfilled reservations do not include our internal (unreserve + reserve) operations - Last Reservation: Timestamp, Node, Container, Queue - Others: Maximum current node size (MB, Cores) in the cluster - Configuration: Cluster level Minimum Allocation MB, Maximum Allocation MB, similar values for cores Track and display capacity scheduler health metrics in web UI - Key: YARN-3293 URL: https://issues.apache.org/jira/browse/YARN-3293 Project: Hadoop YARN Issue Type: Improvement Components: capacityscheduler Reporter: Varun Vasudev Assignee: Varun Vasudev It would be good to display metrics that let users know about the health of the capacity scheduler in the web UI. Today it is hard to get an idea if the capacity scheduler is functioning correctly. Metrics such as the time for the last allocation, etc. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
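As a rough illustration of what such a scheduler-agnostic health record could carry, loosely following the fields listed above (all names hypothetical, not an actual YARN class):
{code}
public class SchedulerHealthSnapshot {
  long lastSchedulingCycleTimestamp;
  long allocationCount;           // allocations so far
  long releaseCount;              // resource releases so far
  long preemptionCount;           // preemptions so far
  long fulfilledReservationCount; // excludes internal unreserve+reserve pairs
  String lastAllocationDetail;    // timestamp, node, container, queue
  String lastReleaseDetail;       // timestamp, container, queue
  String lastPreemptionDetail;    // timestamp, container, queue
  String lastReservationDetail;   // timestamp, node, container, queue
}
{code}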
[jira] [Updated] (YARN-3039) [Aggregator wireup] Implement ATS app-aggregator service discovery
[ https://issues.apache.org/jira/browse/YARN-3039?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated YARN-3039: - Attachment: YARN-3039-v8.patch Incorporated [~zjshen]'s comments in the v8 patch. For TestRPC, let's keep it there, given it works fine in yarn-server-common. Also, verified that the end-to-end test TestDistributedShell passes. [Aggregator wireup] Implement ATS app-aggregator service discovery --- Key: YARN-3039 URL: https://issues.apache.org/jira/browse/YARN-3039 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Reporter: Sangjin Lee Assignee: Junping Du Attachments: Service Binding for applicationaggregator of ATS (draft).pdf, Service Discovery For Application Aggregator of ATS (v2).pdf, YARN-3039-no-test.patch, YARN-3039-v2-incomplete.patch, YARN-3039-v3-core-changes-only.patch, YARN-3039-v4.patch, YARN-3039-v5.patch, YARN-3039-v6.patch, YARN-3039-v7.patch, YARN-3039-v8.patch Per design in YARN-2928, implement ATS writer service discovery. This is essential for off-node clients to send writes to the right ATS writer. This should also handle the case of AM failures. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3205) FileSystemRMStateStore should disable FileSystem Cache to avoid getting a FileSystem with an old configuration.
[ https://issues.apache.org/jira/browse/YARN-3205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsuyoshi Ozawa updated YARN-3205: - Affects Version/s: 2.7.0 FileSystemRMStateStore should disable FileSystem Cache to avoid getting a FileSystem with an old configuration. --- Key: YARN-3205 URL: https://issues.apache.org/jira/browse/YARN-3205 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 2.7.0 Reporter: zhihai xu Assignee: zhihai xu Fix For: 2.8.0 Attachments: YARN-3205.000.patch, YARN-3205.001.patch FileSystemRMStateStore should disable the FileSystem Cache to avoid getting a FileSystem with an old configuration. The old configuration may not have all the customized DFS_CLIENT configurations needed for FileSystemRMStateStore. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
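One common way to bypass the FileSystem cache is the per-scheme fs.&lt;scheme&gt;.impl.disable.cache switch that FileSystem.get() honors. A hedged sketch, assuming a working-path field named fsWorkingPath (the committed patch may differ in detail):
{code}
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class UncachedFs {
  // fsWorkingPath and baseConf are illustrative parameter names
  static FileSystem open(Path fsWorkingPath, Configuration baseConf)
      throws IOException {
    Configuration conf = new Configuration(baseConf);
    // e.g. fs.hdfs.impl.disable.cache=true forces a fresh, uncached instance
    conf.setBoolean("fs." + fsWorkingPath.toUri().getScheme()
        + ".impl.disable.cache", true);
    return fsWorkingPath.getFileSystem(conf);
  }
}
{code}
FileSystem.newInstance(conf) is an alternative that bypasses the cache without touching configuration keys.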
[jira] [Updated] (YARN-3205) FileSystemRMStateStore should disable FileSystem Cache to avoid getting a FileSystem with an old configuration.
[ https://issues.apache.org/jira/browse/YARN-3205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsuyoshi Ozawa updated YARN-3205: - Fix Version/s: 2.8.0 FileSystemRMStateStore should disable FileSystem Cache to avoid getting a FileSystem with an old configuration. --- Key: YARN-3205 URL: https://issues.apache.org/jira/browse/YARN-3205 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 2.7.0 Reporter: zhihai xu Assignee: zhihai xu Fix For: 2.8.0 Attachments: YARN-3205.000.patch, YARN-3205.001.patch FileSystemRMStateStore should disable the FileSystem Cache to avoid getting a FileSystem with an old configuration. The old configuration may not have all the customized DFS_CLIENT configurations needed for FileSystemRMStateStore. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3356) Capacity Scheduler FiCaSchedulerApp should use ResourceUsage to track used-resources-by-label.
[ https://issues.apache.org/jira/browse/YARN-3356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-3356: - Attachment: YARN-3356.2.patch Capacity Scheduler FiCaSchedulerApp should use ResourceUsage to track used-resources-by-label. -- Key: YARN-3356 URL: https://issues.apache.org/jira/browse/YARN-3356 Project: Hadoop YARN Issue Type: Sub-task Components: capacityscheduler, resourcemanager Reporter: Wangda Tan Assignee: Wangda Tan Attachments: YARN-3356.1.patch, YARN-3356.2.patch Similar to YARN-3099, Capacity Scheduler's LeafQueue.User/FiCaSchedulerApp should use ResourceUsage to track resource-usage/pending by label for better resource tracking and preemption. And also, when an application's pending resource changes (container allocated, app completed, moved, etc.), we need to update the ResourceUsage of the queue hierarchies. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3205) FileSystemRMStateStore should disable FileSystem Cache to avoid getting a FileSystem with an old configuration.
[ https://issues.apache.org/jira/browse/YARN-3205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14366438#comment-14366438 ] zhihai xu commented on YARN-3205: - [~ozawa], thanks for the review. That is a very good suggestion. I uploaded a new patch, YARN-3205.001.patch, which addresses your comment. The only difference is that I didn't call stop, because stop calls closeInternal to close the fs, which would make the test pass without the change. FileSystemRMStateStore should disable FileSystem Cache to avoid getting a FileSystem with an old configuration. --- Key: YARN-3205 URL: https://issues.apache.org/jira/browse/YARN-3205 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Reporter: zhihai xu Assignee: zhihai xu Attachments: YARN-3205.000.patch, YARN-3205.001.patch FileSystemRMStateStore should disable the FileSystem Cache to avoid getting a FileSystem with an old configuration. The old configuration may not have all the customized DFS_CLIENT configurations needed for FileSystemRMStateStore. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3205) FileSystemRMStateStore should disable FileSystem Cache to avoid getting a FileSystem with an old configuration.
[ https://issues.apache.org/jira/browse/YARN-3205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14366516#comment-14366516 ] Hadoop QA commented on YARN-3205: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12705236/YARN-3205.001.patch against trunk revision 968425e. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test file. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/7010//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/7010//console This message is automatically generated. FileSystemRMStateStore should disable FileSystem Cache to avoid getting a FileSystem with an old configuration. --- Key: YARN-3205 URL: https://issues.apache.org/jira/browse/YARN-3205 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Reporter: zhihai xu Assignee: zhihai xu Attachments: YARN-3205.000.patch, YARN-3205.001.patch FileSystemRMStateStore should disable the FileSystem Cache to avoid getting a FileSystem with an old configuration. The old configuration may not have all the customized DFS_CLIENT configurations needed for FileSystemRMStateStore. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3205) FileSystemRMStateStore should disable FileSystem Cache to avoid getting a FileSystem with an old configuration.
[ https://issues.apache.org/jira/browse/YARN-3205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14366542#comment-14366542 ] Tsuyoshi Ozawa commented on YARN-3205: -- +1, committing this shortly. FileSystemRMStateStore should disable FileSystem Cache to avoid getting a FileSystem with an old configuration. --- Key: YARN-3205 URL: https://issues.apache.org/jira/browse/YARN-3205 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Reporter: zhihai xu Assignee: zhihai xu Attachments: YARN-3205.000.patch, YARN-3205.001.patch FileSystemRMStateStore should disable the FileSystem Cache to avoid getting a FileSystem with an old configuration. The old configuration may not have all the customized DFS_CLIENT configurations needed for FileSystemRMStateStore. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3039) [Aggregator wireup] Implement ATS app-aggregator service discovery
[ https://issues.apache.org/jira/browse/YARN-3039?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhijie Shen updated YARN-3039: -- Attachment: YARN-3039.9.patch Thanks for addressing the comments. I made some minor touches on the patch to fix some method signatures and to make the old put method still use the retry filter. I'll commit the patch a bit later in case [~sjlee0] wants to take a look at it too. [Aggregator wireup] Implement ATS app-aggregator service discovery --- Key: YARN-3039 URL: https://issues.apache.org/jira/browse/YARN-3039 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Reporter: Sangjin Lee Assignee: Junping Du Attachments: Service Binding for applicationaggregator of ATS (draft).pdf, Service Discovery For Application Aggregator of ATS (v2).pdf, YARN-3039-no-test.patch, YARN-3039-v2-incomplete.patch, YARN-3039-v3-core-changes-only.patch, YARN-3039-v4.patch, YARN-3039-v5.patch, YARN-3039-v6.patch, YARN-3039-v7.patch, YARN-3039-v8.patch, YARN-3039.9.patch Per design in YARN-2928, implement ATS writer service discovery. This is essential for off-node clients to send writes to the right ATS writer. This should also handle the case of AM failures. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3205) FileSystemRMStateStore should disable FileSystem Cache to avoid getting a FileSystem with an old configuration.
[ https://issues.apache.org/jira/browse/YARN-3205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14366553#comment-14366553 ] Hudson commented on YARN-3205: -- SUCCESS: Integrated in Hadoop-trunk-Commit #7354 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/7354/]) YARN-3205. FileSystemRMStateStore should disable FileSystem Cache to avoid getting a FileSystem with an old configuration. Contributed by Zhihai Xu. (ozawa: rev 3bc72cc16d8c7b8addd8f565523001dfcc32b891) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/TestFSRMStateStore.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/FileSystemRMStateStore.java * hadoop-yarn-project/CHANGES.txt FileSystemRMStateStore should disable FileSystem Cache to avoid getting a FileSystem with an old configuration. --- Key: YARN-3205 URL: https://issues.apache.org/jira/browse/YARN-3205 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 2.7.0 Reporter: zhihai xu Assignee: zhihai xu Fix For: 2.8.0 Attachments: YARN-3205.000.patch, YARN-3205.001.patch FileSystemRMStateStore should disable the FileSystem Cache to avoid getting a FileSystem with an old configuration. The old configuration may not have all the customized DFS_CLIENT configurations needed for FileSystemRMStateStore. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3205) FileSystemRMStateStore should disable FileSystem Cache to avoid getting a FileSystem with an old configuration.
[ https://issues.apache.org/jira/browse/YARN-3205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14366554#comment-14366554 ] zhihai xu commented on YARN-3205: - Thanks [~ozawa] for the valuable feedback and for committing the patch! FileSystemRMStateStore should disable FileSystem Cache to avoid getting a FileSystem with an old configuration. --- Key: YARN-3205 URL: https://issues.apache.org/jira/browse/YARN-3205 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 2.7.0 Reporter: zhihai xu Assignee: zhihai xu Fix For: 2.8.0 Attachments: YARN-3205.000.patch, YARN-3205.001.patch FileSystemRMStateStore should disable the FileSystem Cache to avoid getting a FileSystem with an old configuration. The old configuration may not have all the customized DFS_CLIENT configurations needed for FileSystemRMStateStore. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3181) FairScheduler: Fix up outdated findbugs issues
[ https://issues.apache.org/jira/browse/YARN-3181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14366560#comment-14366560 ] Brahma Reddy Battula commented on YARN-3181: I will take up this issue, since I worked on YARN-3204... FairScheduler: Fix up outdated findbugs issues -- Key: YARN-3181 URL: https://issues.apache.org/jira/browse/YARN-3181 Project: Hadoop YARN Issue Type: Improvement Affects Versions: 2.6.0 Reporter: Karthik Kambatla Assignee: Karthik Kambatla Attachments: yarn-3181-1.patch In FairScheduler, we have excluded some findbugs-reported errors. Some of them aren't applicable anymore, and there are a few that can be easily fixed without needing an exclusion. It would be nice to fix them. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (YARN-3181) FairScheduler: Fix up outdated findbugs issues
[ https://issues.apache.org/jira/browse/YARN-3181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brahma Reddy Battula reassigned YARN-3181: -- Assignee: Brahma Reddy Battula (was: Karthik Kambatla) FairScheduler: Fix up outdated findbugs issues -- Key: YARN-3181 URL: https://issues.apache.org/jira/browse/YARN-3181 Project: Hadoop YARN Issue Type: Improvement Affects Versions: 2.6.0 Reporter: Karthik Kambatla Assignee: Brahma Reddy Battula Attachments: yarn-3181-1.patch In FairScheduler, we have excluded some findbugs-reported errors. Some of them aren't applicable anymore, and there are a few that can be easily fixed without needing an exclusion. It would be nice to fix them. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3356) Capacity Scheduler FiCaSchedulerApp should use ResourceUsage to track used-resources-by-label.
[ https://issues.apache.org/jira/browse/YARN-3356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14366507#comment-14366507 ] Hadoop QA commented on YARN-3356: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12705234/YARN-3356.2.patch against trunk revision 968425e. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 6 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:red}-1 findbugs{color}. The patch appears to introduce 1 new Findbugs (version 2.0.3) warning. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: org.apache.hadoop.yarn.server.resourcemanager.security.TestRMDelegationTokens Test results: https://builds.apache.org/job/PreCommit-YARN-Build/7009//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-YARN-Build/7009//artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html Console output: https://builds.apache.org/job/PreCommit-YARN-Build/7009//console This message is automatically generated. Capacity Scheduler FiCaSchedulerApp should use ResourceUsage to track used-resources-by-label. -- Key: YARN-3356 URL: https://issues.apache.org/jira/browse/YARN-3356 Project: Hadoop YARN Issue Type: Sub-task Components: capacityscheduler, resourcemanager Reporter: Wangda Tan Assignee: Wangda Tan Attachments: YARN-3356.1.patch, YARN-3356.2.patch Similar to YARN-3099, Capacity Scheduler's LeafQueue.User/FiCaSchedulerApp should use ResourceUsage to track resource-usage/pending by label for better resource tracking and preemption. And also, when an application's pending resource changes (container allocated, app completed, moved, etc.), we need to update the ResourceUsage of the queue hierarchies. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3047) [Data Serving] Set up ATS reader with basic request serving structure and lifecycle
[ https://issues.apache.org/jira/browse/YARN-3047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14366514#comment-14366514 ] Zhijie Shen commented on YARN-3047: --- Some comments about the patch: 1. No need to change {{timeline/TimelineEvents.java}}. 2. In YarnConfiguration, how about we still reuse the existing timeline service config? I propose config reuse because there is no use case where we start the old timeline server and the new timeline reader server together. And the change in WebAppUtils should not be necessary either. 3. NameValuePair is for internal usage only. Let's keep it in the timeline service module? 4. Rename TimelineReaderStore to TimelineReader. I think we don't need to have NullTimelineReader. Instead, we should have a POC implementation based on the local FS, like FileSystemTimelineWriterImpl. But we can defer this work to a separate jira if the implementation is not straightforward. 5. TimelineReaderServer -> TimelineWebServer? For startTimelineReaderWebApp, can we do something similar to TimelineAggregatorsCollection#startWebApp? 6. Add the command in yarn and yarn.cmd to start the server. [Data Serving] Set up ATS reader with basic request serving structure and lifecycle --- Key: YARN-3047 URL: https://issues.apache.org/jira/browse/YARN-3047 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Reporter: Sangjin Lee Assignee: Varun Saxena Attachments: YARN-3047.001.patch, YARN-3047.02.patch Per design in YARN-2928, set up the ATS reader as a service and implement the basic structure as a service. It includes lifecycle management, request serving, and so on. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3039) [Aggregator wireup] Implement ATS app-aggregator service discovery
[ https://issues.apache.org/jira/browse/YARN-3039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14366558#comment-14366558 ] Sangjin Lee commented on YARN-3039: --- I took a look at patch v.8. LGTM. Thanks much for your work, [~djp]! [Aggregator wireup] Implement ATS app-aggregator service discovery --- Key: YARN-3039 URL: https://issues.apache.org/jira/browse/YARN-3039 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Reporter: Sangjin Lee Assignee: Junping Du Attachments: Service Binding for applicationaggregator of ATS (draft).pdf, Service Discovery For Application Aggregator of ATS (v2).pdf, YARN-3039-no-test.patch, YARN-3039-v2-incomplete.patch, YARN-3039-v3-core-changes-only.patch, YARN-3039-v4.patch, YARN-3039-v5.patch, YARN-3039-v6.patch, YARN-3039-v7.patch, YARN-3039-v8.patch, YARN-3039.9.patch Per design in YARN-2928, implement ATS writer service discovery. This is essential for off-node clients to send writes to the right ATS writer. This should also handle the case of AM failures. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2495) Allow admin to specify labels from each NM (Distributed configuration)
[ https://issues.apache.org/jira/browse/YARN-2495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14366562#comment-14366562 ] Naganarasimha G R commented on YARN-2495: - 1. Well, I think I am missing something or you have misread my patch. * In both request protos I have used NodeIdToLabelsProto ({{optional NodeIdToLabelsProto nodeLabels = 8;}} and {{optional NodeIdToLabelsProto nodeLabels = 4;}}) and have not used {{repeated string nodeLabels}} directly. * The test cases TestYarnServerApiClasses.testNodeHeartbeatRequestPBImpl, testNodeHeartbeatRequestPBImplWithNullLabels, testRegisterNodeManagerRequestWithNullLabels and testRegisterNodeManagerRequestWithValidLabels validate that the approach in the patch supports null and filled label sets; I manually tested for the empty set and, as expected, that too worked. Will get this test case added in the next patch. * I thought of creating a new proto like {{StringSetProto}} but felt there was no reason to create another proto class just for this purpose, and you too had mentioned using {{NodeIdToLabelsProto}}, hence I made use of the existing proto class. 2. {{Typo, lable - label}}: oops, because of the typo in the proto, the generated methods also had issues, hence the proto and the places accessing these methods (6 instances) have this error. Will get it corrected in the next patch. 3. ??optional bool areNodeLablesAcceptedByRM = 7 \[default = false\], I think default should be true.?? Personally I felt it should not matter, as I am handling it explicitly in the code. But consider the case where the NM gets upgraded first: it should not happen that the NM sends labels, an older RM ignores the additional labels, and yet the response by default says the labels were accepted. And I also felt that, by name/functionality, it should be set to true only after the RM accepts the labels. 4, 5, 6 -- will get them corrected as part of the next patch. Also one favor: it would be helpful if you review the test cases and give feedback on them too, as it will reduce my effort in creating multiple patches. I understand that it's a huge patch, but I feel the major aspects/functionality are stable as of the last patch. Allow admin to specify labels from each NM (Distributed configuration) --- Key: YARN-2495 URL: https://issues.apache.org/jira/browse/YARN-2495 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Reporter: Wangda Tan Assignee: Naganarasimha G R Attachments: YARN-2495.20141023-1.patch, YARN-2495.20141024-1.patch, YARN-2495.20141030-1.patch, YARN-2495.20141031-1.patch, YARN-2495.20141119-1.patch, YARN-2495.20141126-1.patch, YARN-2495.20141204-1.patch, YARN-2495.20141208-1.patch, YARN-2495.20150305-1.patch, YARN-2495.20150309-1.patch, YARN-2495.20150318-1.patch, YARN-2495_20141022.1.patch Target of this JIRA is to allow the admin to specify labels on each NM; this covers - User can set labels on each NM (by setting yarn-site.xml (YARN-2923) or using the script suggested by [~aw] (YARN-2729)) - NM will send labels to RM via the ResourceTracker API - RM will set labels in NodeLabelManager when the NM registers/updates labels -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2495) Allow admin to specify labels from each NM (Distributed configuration)
[ https://issues.apache.org/jira/browse/YARN-2495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14366312#comment-14366312 ] Wangda Tan commented on YARN-2495: -- Thanks for updating. 1. I saw you're trying to make {{repeated string nodeLabels}} carry 3 different values: null, empty, and non-empty. But I'm not sure this works in PB; could you write a test to verify it? (Not-set / set-nothing / set-value on RegisterNodeManagerRequestPBImpl.nodeLabels, create a new RegisterNodeManagerRequestPBImpl from old.getProto(), and call getNodeLabels() to see if it works.) If this doesn't work, you can create a PB message like StringSetProto and use it in messages like RegisterNodeManagerRequest, which can support null/empty/non-empty. 2. Typo, lable - label; I found several in your patch. 3. optional bool areNodeLablesAcceptedByRM = 7 \[default = false\]; I think the default should be true. 4. NodeStatusUpdaterImpl: no need to call nodeLabelsProvider.getNodeLabels() twice during register/heartbeat. 5. HeartBeat - Heartbeat. 6. NodeStatusUpdaterImpl: when labels are rejected by the RM, you should log it with the diag message. Will include the test review in the next round. Allow admin to specify labels from each NM (Distributed configuration) --- Key: YARN-2495 URL: https://issues.apache.org/jira/browse/YARN-2495 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Reporter: Wangda Tan Assignee: Naganarasimha G R Attachments: YARN-2495.20141023-1.patch, YARN-2495.20141024-1.patch, YARN-2495.20141030-1.patch, YARN-2495.20141031-1.patch, YARN-2495.20141119-1.patch, YARN-2495.20141126-1.patch, YARN-2495.20141204-1.patch, YARN-2495.20141208-1.patch, YARN-2495.20150305-1.patch, YARN-2495.20150309-1.patch, YARN-2495.20150318-1.patch, YARN-2495_20141022.1.patch Target of this JIRA is to allow the admin to specify labels on each NM; this covers - User can set labels on each NM (by setting yarn-site.xml (YARN-2923) or using the script suggested by [~aw] (YARN-2729)) - NM will send labels to RM via the ResourceTracker API - RM will set labels in NodeLabelManager when the NM registers/updates labels -- This message was sent by Atlassian JIRA (v6.3.4#6332)
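A sketch of the round-trip check requested in point 1 above, assuming the patch's setNodeLabels()/getNodeLabels() accessors on RegisterNodeManagerRequestPBImpl (they are part of the patch under review, not trunk); the point is that "never set" and "set to empty" must survive getProto() and a rebuild as distinct states:
{code}
import java.util.HashSet;
import java.util.Set;
import org.apache.hadoop.yarn.server.api.protocolrecords.impl.pb.RegisterNodeManagerRequestPBImpl;
import org.junit.Assert;
import org.junit.Test;

public class TestNodeLabelsRoundTrip {
  @Test
  public void testNullVsEmptyNodeLabels() {
    // case 1: labels never set -- must still be null after a round trip
    RegisterNodeManagerRequestPBImpl original = new RegisterNodeManagerRequestPBImpl();
    original.setNodeLabels(null);
    RegisterNodeManagerRequestPBImpl copy =
        new RegisterNodeManagerRequestPBImpl(original.getProto());
    Assert.assertNull(copy.getNodeLabels());

    // case 2: labels set to an empty set -- must not collapse to null
    original = new RegisterNodeManagerRequestPBImpl();
    original.setNodeLabels(new HashSet<String>());
    copy = new RegisterNodeManagerRequestPBImpl(original.getProto());
    Set<String> labels = copy.getNodeLabels();
    Assert.assertNotNull(labels);
    Assert.assertTrue(labels.isEmpty());
  }
}
{code}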
[jira] [Commented] (YARN-2556) Tool to measure the performance of the timeline server
[ https://issues.apache.org/jira/browse/YARN-2556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14366357#comment-14366357 ] Hadoop QA commented on YARN-2556: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12705194/YARN-2556.2.patch against trunk revision 968425e. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/7005//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/7005//console This message is automatically generated. Tool to measure the performance of the timeline server -- Key: YARN-2556 URL: https://issues.apache.org/jira/browse/YARN-2556 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Reporter: Jonathan Eagles Assignee: Chang Li Attachments: YARN-2556-WIP.patch, YARN-2556-WIP.patch, YARN-2556.1.patch, YARN-2556.2.patch, YARN-2556.patch, yarn2556.patch, yarn2556.patch, yarn2556_wip.patch We need to be able to understand the capacity model for the timeline server to give users the tools they need to deploy a timeline server with the correct capacity. I propose we create a mapreduce job that can measure timeline server write and read performance. Transactions per second, I/O for both read and write would be a good start. This could be done as an example or test job that could be tied into gridmix. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3360) Add JMX metrics to TimelineDataManager
[ https://issues.apache.org/jira/browse/YARN-3360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14366288#comment-14366288 ] Hadoop QA commented on YARN-3360: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12705199/YARN-3360.001.patch against trunk revision 968425e. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 4 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/7008//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/7008//console This message is automatically generated. Add JMX metrics to TimelineDataManager -- Key: YARN-3360 URL: https://issues.apache.org/jira/browse/YARN-3360 Project: Hadoop YARN Issue Type: Improvement Components: timelineserver Affects Versions: 2.6.0 Reporter: Jason Lowe Assignee: Jason Lowe Attachments: YARN-3360.001.patch The TimelineDataManager currently has no metrics, outside of the standard JVM metrics. It would be very useful to at least log basic counts of method calls, time spent in those calls, and number of entities/events involved. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
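A hedged sketch of what such counters could look like with Hadoop's metrics2 library, which exposes registered sources over JMX; the class and metric names here are illustrative, not the committed patch:
{code}
import org.apache.hadoop.metrics2.annotation.Metric;
import org.apache.hadoop.metrics2.annotation.Metrics;
import org.apache.hadoop.metrics2.lib.DefaultMetricsSystem;
import org.apache.hadoop.metrics2.lib.MutableCounterLong;
import org.apache.hadoop.metrics2.lib.MutableRate;

@Metrics(about = "Timeline data manager metrics", context = "yarn")
public class TimelineDataManagerMetrics {
  @Metric("getEntities calls") MutableCounterLong getEntitiesOps;
  @Metric("getEntities call time") MutableRate getEntitiesTime;
  @Metric("entities returned by getEntities") MutableCounterLong entitiesReturned;

  static TimelineDataManagerMetrics create() {
    // registering the source makes the counters visible via JMX
    return DefaultMetricsSystem.instance().register(
        "TimelineDataManagerMetrics", "Metrics for TimelineDataManager",
        new TimelineDataManagerMetrics());
  }
}
{code}
Each handler method would then increment its ops counter and feed elapsed time into the MutableRate.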
[jira] [Commented] (YARN-1963) Support priorities across applications within the same queue
[ https://issues.apache.org/jira/browse/YARN-1963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14366194#comment-14366194 ] Wangda Tan commented on YARN-1963: -- bq. I feel we can make such label config in a common place which can be accessible for any schedulers. Agree, this should be a part of YARN configuration. I put it as a part of queue config just for readability for the proposal :). Support priorities across applications within the same queue - Key: YARN-1963 URL: https://issues.apache.org/jira/browse/YARN-1963 Project: Hadoop YARN Issue Type: New Feature Components: api, resourcemanager Reporter: Arun C Murthy Assignee: Sunil G Attachments: 0001-YARN-1963-prototype.patch, YARN Application Priorities Design.pdf, YARN Application Priorities Design_01.pdf It will be very useful to support priorities among applications within the same queue, particularly in production scenarios. It allows for finer-grained controls without having to force admins to create a multitude of queues, plus allows existing applications to continue using existing queues which are usually part of institutional memory. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3040) [Data Model] Implement client-side API for handling flows
[ https://issues.apache.org/jira/browse/YARN-3040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14366338#comment-14366338 ] Zhijie Shen commented on YARN-3040: --- [~rkanter], would you mind my taking over this jira to move it forward? [Data Model] Implement client-side API for handling flows - Key: YARN-3040 URL: https://issues.apache.org/jira/browse/YARN-3040 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Reporter: Sangjin Lee Assignee: Robert Kanter Per design in YARN-2928, implement client-side API for handling *flows*. Frameworks should be able to define and pass in all attributes of flows and flow runs to YARN, and they should be passed into ATS writers. YARN tags were discussed as a way to handle this piece of information. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2003) Support processing Job priority from Submission Context in AppAttemptAddedSchedulerEvent [RM side]
[ https://issues.apache.org/jira/browse/YARN-2003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14366273#comment-14366273 ] Wangda Tan commented on YARN-2003: -- Hi [~sunilg], Thanks for working on this. I took a quick look at YARN-2003 / YARN-2004. Some overall comments. First I want to describe my thinking of how the RM-side workflow should look: - When an application is submitted to RMAppManager, it will simply pass the priority set by the user to RMApp (see (1)) - RMApp will finally create APP_ATTEMPT_ADDED; the queue itself will normalize the priority (reject it / convert it from label to number, etc., and set the new priority on RMApp), and set the priority on SchedulerApplicationAttempt - The scheduler uses the priority in SchedulerApplicationAttempt and Queue to make scheduling decisions - If the user asks to change an application's priority, or the admin changes the priority configuration, an event may need to be sent to the scheduler to update inner applications/queues. - When the user requests the priority of an application via CLI/Web-UI, ApplicationPriorityManager will convert the number to a label (if possible) and show it to the user. (1). We don't have to add too much logic here, and if we can simply handle it inside the scheduler, then when configuration changes (like the label-integer priority mapping), it can be handled by the scheduler itself. Back to your patches, several major differences are: - RMAppManager takes responsibility for checking the ACL for priority, which I think is not proper (ACLs are always managed by scheduler queues) and not correct (when the queue configuration changes, you cannot recheck application priority if a new app attempt is created). - For the same reason, queues should check ACLs for priority. - You may not need to add priority to SchedulerApplication; it seems unnecessary to me. Support processing Job priority from Submission Context in AppAttemptAddedSchedulerEvent [RM side] -- Key: YARN-2003 URL: https://issues.apache.org/jira/browse/YARN-2003 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Reporter: Sunil G Assignee: Sunil G Attachments: 0001-YARN-2003.patch, 0002-YARN-2003.patch, 0003-YARN-2003.patch, 0004-YARN-2003.patch AppAttemptAddedSchedulerEvent should be able to receive the Job Priority from the Submission Context and store it. Later this can be used by the Scheduler. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
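The "queue normalizes priority" step in the workflow above could be as small as a default-plus-clamp, sketched here on YARN's Priority record with illustrative names and bounds (not an actual YARN API):
{code}
import org.apache.hadoop.yarn.api.records.Priority;

public class PriorityNormalizer {
  static Priority normalize(Priority requested, Priority queueDefault,
      Priority clusterMax) {
    if (requested == null) {
      return queueDefault; // user did not ask for a priority
    }
    // clamp (or alternatively reject) anything above the configured maximum
    return requested.getPriority() > clusterMax.getPriority()
        ? clusterMax : requested;
  }
}
{code}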
[jira] [Commented] (YARN-2003) Support processing Job priority from Submission Context in AppAttemptAddedSchedulerEvent [RM side]
[ https://issues.apache.org/jira/browse/YARN-2003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14366282#comment-14366282 ] Wangda Tan commented on YARN-2003: -- Beyond that, I suggest dividing YARN-2003/YARN-2004 as follows: YARN-2003: - Changes on the RM side - Changes to the scheduler Queue interface (may need some empty implementations in specific schedulers) - New scheduler event. YARN-2003 should be able to compile/test without YARN-2004. YARN-2004 tracks changes only on the CapacityScheduler side and needs YARN-2003 to compile/test. With this, I can simply apply the two patches to do the review. Support processing Job priority from Submission Context in AppAttemptAddedSchedulerEvent [RM side] -- Key: YARN-2003 URL: https://issues.apache.org/jira/browse/YARN-2003 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Reporter: Sunil G Assignee: Sunil G Attachments: 0001-YARN-2003.patch, 0002-YARN-2003.patch, 0003-YARN-2003.patch, 0004-YARN-2003.patch AppAttemptAddedSchedulerEvent should be able to receive the Job Priority from the Submission Context and store it. Later this can be used by the Scheduler. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3034) [Aggregator wireup] Implement RM starting its ATS writer
[ https://issues.apache.org/jira/browse/YARN-3034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14366202#comment-14366202 ] Hadoop QA commented on YARN-3034: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12705190/YARN-3034.20150318-1.patch against trunk revision 968425e. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test file. {color:red}-1 javac{color}. The patch appears to cause the build to fail. Console output: https://builds.apache.org/job/PreCommit-YARN-Build/7006//console This message is automatically generated. [Aggregator wireup] Implement RM starting its ATS writer Key: YARN-3034 URL: https://issues.apache.org/jira/browse/YARN-3034 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Reporter: Sangjin Lee Assignee: Naganarasimha G R Attachments: YARN-3034-20150312-1.patch, YARN-3034.20150205-1.patch, YARN-3034.20150316-1.patch, YARN-3034.20150318-1.patch Per design in YARN-2928, implement resource managers starting their own ATS writers. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2693) Priority Label Manager in RM to manage application priority based on configuration
[ https://issues.apache.org/jira/browse/YARN-2693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14366225#comment-14366225 ] Wangda Tan commented on YARN-2693: -- Some overall suggestions: 1) Instead of ApplicationPriorityPerQueue, a queue's priority-related fields could be stored in scheduler.Queue directly (as methods of scheduler.Queue; the implementation for different schedulers could vary, since we have different scheduler configurations). Benefits of doing this: - All other queue-specific configurations are in the schedulers' own configuration files; keeping a queue's application-priority fields stored outside the queue means you have to sync them with the queue's configuration when you do refreshQueues, etc. - Putting it in scheduler.Queue can make scheduler changes easier (they don't have to access ApplicationPriorityManager). 2) Methods of ApplicationPriorityManager: - Since we're discussing how to configure priority, I will review the ApplicationPriorityManager implementation once we close the design. - ClusterPriorities should be a range (if we start from zero, a maxPriority will be enough) - getApplicationPriorityFromQueue should not exist; all queue-related methods should be in scheduler.Queue - isPriorityExistsInCluster may not be needed; it should be something like accepted. - Can be reinitialized - Can convert between number/label Priority Label Manager in RM to manage application priority based on configuration -- Key: YARN-2693 URL: https://issues.apache.org/jira/browse/YARN-2693 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Reporter: Sunil G Assignee: Sunil G Attachments: 0001-YARN-2693.patch, 0002-YARN-2693.patch, 0003-YARN-2693.patch, 0004-YARN-2693.patch, 0005-YARN-2693.patch, 0006-YARN-2693.patch Focus of this JIRA is to have a centralized service to handle priority labels. Support operations such as * Add/Delete a priority label for a specified queue * Manage the integer mapping associated with each priority label * Support managing the default priority label of a given queue * Expose an interface to the RM to validate a priority label To have a simplified interface, the Priority Manager will support only a configuration file, in contrast with an admin CLI and REST. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2495) Allow admin to specify labels from each NM (Distributed configuration)
[ https://issues.apache.org/jira/browse/YARN-2495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14366236#comment-14366236 ] Hadoop QA commented on YARN-2495: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12705179/YARN-2495.20150318-1.patch against trunk revision 968425e. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 5 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/7003//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/7003//console This message is automatically generated. Allow admin to specify labels from each NM (Distributed configuration) --- Key: YARN-2495 URL: https://issues.apache.org/jira/browse/YARN-2495 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Reporter: Wangda Tan Assignee: Naganarasimha G R Attachments: YARN-2495.20141023-1.patch, YARN-2495.20141024-1.patch, YARN-2495.20141030-1.patch, YARN-2495.20141031-1.patch, YARN-2495.20141119-1.patch, YARN-2495.20141126-1.patch, YARN-2495.20141204-1.patch, YARN-2495.20141208-1.patch, YARN-2495.20150305-1.patch, YARN-2495.20150309-1.patch, YARN-2495.20150318-1.patch, YARN-2495_20141022.1.patch Target of this JIRA is to allow the admin to specify labels on each NM; this covers - User can set labels on each NM (by setting yarn-site.xml (YARN-2923) or using the script suggested by [~aw] (YARN-2729)) - NM will send labels to RM via the ResourceTracker API - RM will set labels in NodeLabelManager when the NM registers/updates labels -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (YARN-3363) Add localization and container launch time to ContainerMetrics at NM to show this timing information for each active container.
zhihai xu created YARN-3363: --- Summary: Add localization and container launch time to ContainerMetrics at NM to show this timing information for each active container. Key: YARN-3363 URL: https://issues.apache.org/jira/browse/YARN-3363 Project: Hadoop YARN Issue Type: Improvement Reporter: zhihai xu Assignee: zhihai xu Add localization and container launch time to ContainerMetrics at the NM to show this timing information for each active container. Currently ContainerMetrics has the container's actual memory usage (YARN-2984), actual CPU usage (YARN-3122), and resource and pid (YARN-3022). It would be better to have localization and container launch time in ContainerMetrics for each active container. -- This message was sent by Atlassian JIRA (v6.3.4#6332)