[jira] [Created] (YARN-646) Some issues in Fair Scheduler's document
Dapeng Sun created YARN-646:
-------------------------------

Summary: Some issues in Fair Scheduler's document
Key: YARN-646
URL: https://issues.apache.org/jira/browse/YARN-646
Project: Hadoop YARN
Issue Type: Bug
Components: documentation
Affects Versions: 2.0.4-alpha
Reporter: Dapeng Sun
Fix For: 2.0.5-beta

Issues found on the Fair Scheduler documentation page, http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/FairScheduler.html:

1. In the "Configuration" section, two properties are both named "yarn.scheduler.fair.minimum-allocation-mb"; the second one should be "yarn.scheduler.fair.maximum-allocation-mb".
2. In the "Allocation file format" section, the document says "The format contains three types of elements", but it then lists four types of elements.
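For reference, the corrected property pair in yarn-site.xml would presumably read as below; the values shown are illustrative, not taken from the patch:

{code:xml}
<!-- Corrected pair: the second property is the maximum, not a duplicate
     of the minimum. Values are illustrative examples. -->
<property>
  <name>yarn.scheduler.fair.minimum-allocation-mb</name>
  <value>1024</value>
</property>
<property>
  <name>yarn.scheduler.fair.maximum-allocation-mb</name>
  <value>10240</value>
</property>
{code}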
[jira] [Updated] (YARN-646) Some issues in Fair Scheduler's document
[ https://issues.apache.org/jira/browse/YARN-646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dapeng Sun updated YARN-646:
----------------------------
Attachment: YARN-646.patch
[jira] [Commented] (YARN-646) Some issues in Fair Scheduler's document
[ https://issues.apache.org/jira/browse/YARN-646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13649609#comment-13649609 ]

Dapeng Sun commented on YARN-646:
---------------------------------
Uploaded a simple patch that fixes these.
[jira] [Updated] (YARN-592) Container logs lost for the application when NM gets restarted
[ https://issues.apache.org/jira/browse/YARN-592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Devaraj K updated YARN-592:
---------------------------
Attachment: YARN-592.patch

Container logs lost for the application when NM gets restarted
---------------------------------------------------------------
Key: YARN-592
URL: https://issues.apache.org/jira/browse/YARN-592
Project: Hadoop YARN
Issue Type: Bug
Components: nodemanager
Affects Versions: 2.0.1-alpha, 2.0.3-alpha
Reporter: Devaraj K
Assignee: Devaraj K
Priority: Critical
Attachments: YARN-592.patch

While a big job is running, if the NM goes down for some reason and comes back, it performs log aggregation only for the newly launched containers and then deletes all of the application's containers. In this case we cannot get the logs, from HDFS or from the local disks, for the containers that were launched before the restart and had completed.
[jira] [Created] (YARN-647) historyServer can show container's log when aggregation is not enabled
shenhong created YARN-647:
--------------------------

Summary: historyServer can show container's log when aggregation is not enabled
Key: YARN-647
URL: https://issues.apache.org/jira/browse/YARN-647
Project: Hadoop YARN
Issue Type: Improvement
Components: documentation
Affects Versions: 2.0.4-alpha, 0.23.7
Reporter: shenhong
Environment: When yarn.log-aggregation-enable is set to false, after an MR app completes we can't view the container's log from the HistoryServer; it shows a message like: "Aggregation is not enabled. Try the nodemanager at hd13-vm1:34669"

We don't want to aggregate container logs, because aggregation puts pressure on the NameNode, but sometimes we still want to take a look at a container's log. Should the HistoryServer show container logs even if yarn.log-aggregation-enable is set to false?
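For context, the switch under discussion is a single yarn-site.xml property; a minimal sketch of the configuration being described (standard property name, illustrative placement):

{code:xml}
<!-- yarn-site.xml: with this set to false, container logs stay on the
     local NM disks, and the HistoryServer currently refuses to serve them. -->
<property>
  <name>yarn.log-aggregation-enable</name>
  <value>false</value>
</property>
{code}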
[jira] [Updated] (YARN-647) historyServer can show container's log when aggregation is not enabled
[ https://issues.apache.org/jira/browse/YARN-647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

shenhong updated YARN-647:
--------------------------
Description:
We don't want to aggregate container logs, because aggregation puts pressure on the NameNode, but sometimes we still want to take a look at a container's log. Should the HistoryServer show container logs even if yarn.log-aggregation-enable is set to false?
[jira] [Updated] (YARN-647) historyServer can show container's log when aggregation is not enabled
[ https://issues.apache.org/jira/browse/YARN-647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

shenhong updated YARN-647:
--------------------------
Description:
When yarn.log-aggregation-enable is set to false, after an MR app completes we can't view the container's log from the HistoryServer; it shows a message like: "Aggregation is not enabled. Try the nodemanager at hd13-vm1:34669"

We don't want to aggregate container logs, because aggregation puts pressure on the NameNode, but sometimes we still want to take a look at a container's log. Should the HistoryServer show container logs even if yarn.log-aggregation-enable is set to false?
[jira] [Updated] (YARN-647) historyServer can show container's log when aggregation is not enabled
[ https://issues.apache.org/jira/browse/YARN-647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

shenhong updated YARN-647:
--------------------------
Description: set to the same repro-plus-rationale text as in the previous update (was: the rationale paragraph only)
[jira] [Updated] (YARN-647) historyServer can show container's log when aggregation is not enabled
[ https://issues.apache.org/jira/browse/YARN-647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

shenhong updated YARN-647:
--------------------------
Environment:
When yarn.log-aggregation-enable is set to false, after an MR app completes we can't view the container's log from the HistoryServer; it shows a message like: "Aggregation is not enabled. Try the nodemanager at hd13-vm1:34669"
[jira] [Updated] (YARN-647) historyServer can show container's log when aggregation is not enabled
[ https://issues.apache.org/jira/browse/YARN-647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

shenhong updated YARN-647:
--------------------------
Environment:
yarn.log-aggregation-enable=false; the HistoryServer will show something like: "Aggregation is not enabled. Try the nodemanager at hd13-vm1:34669"
[jira] [Updated] (YARN-647) historyServer can show container's log when aggregation is not enabled
[ https://issues.apache.org/jira/browse/YARN-647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

shenhong updated YARN-647:
--------------------------
Environment:
yarn.log-aggregation-enable=false
Aggregation is not enabled. Try the nodemanager at hd13-vm1:34669
[jira] [Updated] (YARN-647) historyServer can't show container's log when aggregation is not enabled
[ https://issues.apache.org/jira/browse/YARN-647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

shenhong updated YARN-647:
--------------------------
Summary: historyServer can't show container's log when aggregation is not enabled  (was: historyServer can show container's log when aggregation is not enabled)
[jira] [Updated] (YARN-647) historyServer can't show container's log when aggregation is not enabled
[ https://issues.apache.org/jira/browse/YARN-647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

shenhong updated YARN-647:
--------------------------
Attachment: yarn-647.patch

Added a patch.
[jira] [Updated] (YARN-646) Some issues in Fair Scheduler's document
[ https://issues.apache.org/jira/browse/YARN-646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dapeng Sun updated YARN-646:
----------------------------
Attachment: YARN-646.patch
[jira] [Commented] (YARN-422) Add NM client library
[ https://issues.apache.org/jira/browse/YARN-422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13649809#comment-13649809 ]

Zhijie Shen commented on YARN-422:
----------------------------------
Like ContainerLauncher, I think NMClient is supposed to start when the AM starts and stop when the AM stops. IMHO, no matter how long a container keeps running, it should be stopped when the AM that started it stops. In addition, NMClient would be the only gate to access the NodeManager (I mean it's not good to use NMClient and the raw RPC proxy simultaneously), such that if NMClient doesn't stop the running containers when it stops (or the AM exits), those containers may not be stoppable. Semantically, it is not good for the AM to have already stopped while the forked containers keep running.

Add NM client library
---------------------
Key: YARN-422
URL: https://issues.apache.org/jira/browse/YARN-422
Project: Hadoop YARN
Issue Type: Sub-task
Reporter: Bikas Saha
Assignee: Zhijie Shen
Attachments: AMNMClient_Defination.txt, AMNMClient_Definition_Updated_With_Tests.txt, proposal_v1.pdf, YARN-422.1.patch, YARN-422.2.patch, YARN-422.3.patch

Create a simple wrapper over the ContainerManager protocol to hide the details of the protocol implementation.
[jira] [Commented] (YARN-646) Some issues in Fair Scheduler's document
[ https://issues.apache.org/jira/browse/YARN-646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13649849#comment-13649849 ]

Sandy Ryza commented on YARN-646:
---------------------------------
+1, thanks for fixing these.
[jira] [Commented] (YARN-18) Make locality in YARN's container assignment and task scheduling pluggable for other deployment topology
[ https://issues.apache.org/jira/browse/YARN-18?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13649862#comment-13649862 ]

Luke Lu commented on YARN-18:
-----------------------------
The latest patch is getting close. I think the naming of Abstract(App)?TopologyElementsFactory is both long and imprecise, as a ContainerRequest is not a topology element but a topology-aware object. How about we simply call it (App)?TopologyAwareFactory? This is a straightforward patch to make (hierarchical) topology pluggable. I'll commit this patch this week if there are no further objections. This is needed for YARN-19.

Make locality in YARN's container assignment and task scheduling pluggable for other deployment topology
----------------------------------------------------------------------------------------------------------
Key: YARN-18
URL: https://issues.apache.org/jira/browse/YARN-18
Project: Hadoop YARN
Issue Type: New Feature
Affects Versions: 2.0.3-alpha
Reporter: Junping Du
Assignee: Junping Du
Labels: features
Attachments: HADOOP-8474-ContainerAssignmentTaskScheduling-pluggable.patch, MAPREDUCE-4309.patch, MAPREDUCE-4309-v2.patch, MAPREDUCE-4309-v3.patch, MAPREDUCE-4309-v4.patch, MAPREDUCE-4309-v5.patch, MAPREDUCE-4309-v6.patch, MAPREDUCE-4309-v7.patch, Pluggable topologies with NodeGroup for YARN.pdf, YARN-18.patch, YARN-18-v2.patch, YARN-18-v3.1.patch, YARN-18-v3.2.patch, YARN-18-v3.patch, YARN-18-v4.1.patch, YARN-18-v4.2.patch, YARN-18-v4.3.patch, YARN-18-v4.patch, YARN-18-v5.1.patch, YARN-18-v5.patch, YARN-18-v6.1.patch, YARN-18-v6.2.patch, YARN-18-v6.patch

Several classes in YARN's container assignment and task scheduling algorithms that relate to data locality were updated to give preference to running a container at localities besides node-local and rack-local (like nodegroup-local). This proposes to make these data structures/algorithms pluggable, e.g. SchedulerNode, RMNodeImpl, etc. The inner class ScheduledRequests was made a package-level class so it would be easier to create a subclass, ScheduledRequestsWithNodeGroup.
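To make the naming discussion concrete, here is a hypothetical sketch of the factory idea under debate; the class names and method shapes are illustrative placeholders, not the actual YARN-18 patch:

{code:java}
// Hypothetical sketch of a pluggable-topology factory. Placeholder types
// stand in for the real SchedulerNode/RMNodeImpl classes.
interface TopologyAwareFactory {
  SchedulerNodeSketch createSchedulerNode(String hostName);
}

class SchedulerNodeSketch {
  protected final String hostName;
  SchedulerNodeSketch(String hostName) { this.hostName = hostName; }
  // the default topology only knows node-local and rack-local
  String locality() { return "rack-local"; }
}

// A NodeGroup-aware deployment plugs in subclasses through the factory,
// adding the extra nodegroup-local level between node and rack.
class NodeGroupSchedulerNodeSketch extends SchedulerNodeSketch {
  NodeGroupSchedulerNodeSketch(String hostName) { super(hostName); }
  @Override String locality() { return "nodegroup-local"; }
}

class NodeGroupFactory implements TopologyAwareFactory {
  public SchedulerNodeSketch createSchedulerNode(String hostName) {
    return new NodeGroupSchedulerNodeSketch(hostName);
  }
}
{code}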
[jira] [Commented] (YARN-422) Add NM client library
[ https://issues.apache.org/jira/browse/YARN-422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13649869#comment-13649869 ]

Bikas Saha commented on YARN-422:
---------------------------------
I am not contending that the behavior is wrong. I am suggesting that it should be optional, with the default set to true; if needed, it can be overridden to not do so. E.g. there is a pending change in AMRMClient to not unregister with the RM when the RM sends a reboot command. That is the correct behavior, but it will have an override option because sometimes apps may actually want to unregister (a race between reboot and app completion).
[jira] [Commented] (YARN-422) Add NM client library
[ https://issues.apache.org/jira/browse/YARN-422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13649899#comment-13649899 ]

Zhijie Shen commented on YARN-422:
----------------------------------
Well, giving users the choice to disable the feature sounds reasonable. I'll add the setting for it. Moreover, let me add some javadoc to describe this part.
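For the record, the opt-out being agreed on here surfaced in the released Hadoop 2.x client along these lines; a minimal sketch assuming that released NMClient API (the exact method name in the patch under review may differ):

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.yarn.client.api.NMClient;

public class NMClientLifecycleSketch {
  public static void main(String[] args) {
    NMClient nmClient = NMClient.createNMClient();
    nmClient.init(new Configuration());
    nmClient.start();

    // Default behavior: stop() also stops every container this client started.
    // An AM whose containers should outlive it can opt out up front:
    nmClient.cleanupRunningContainersOnStop(false);

    // ... start containers, do work ...

    nmClient.stop();  // with the opt-out, running containers are left alone
  }
}
{code}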
[jira] [Commented] (YARN-582) Restore appToken for app attempt after RM restart
[ https://issues.apache.org/jira/browse/YARN-582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13649910#comment-13649910 ]

Vinod Kumar Vavilapalli commented on YARN-582:
----------------------------------------------
bq. Alternatively, it could use Credentials so that no changes are needed for additional tokens.

Inside the RM I'd like it to be explicit, so that we don't search in a credential cache each time we want a specific token.

Restore appToken for app attempt after RM restart
-------------------------------------------------
Key: YARN-582
URL: https://issues.apache.org/jira/browse/YARN-582
Project: Hadoop YARN
Issue Type: Sub-task
Components: resourcemanager
Reporter: Bikas Saha
Assignee: Jian He
Attachments: YARN-582.1.patch, YARN-582.2.patch, YARN-582.3.patch

These need to be saved and restored on a per-app-attempt basis. This is required only when work-preserving restart is implemented for secure clusters. In non-preserving restart, app attempts are killed, so this does not matter.
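A sketch of the trade-off being discussed; Credentials.getToken is the real Hadoop API, while the field and alias names here are hypothetical:

{code:java}
import org.apache.hadoop.io.Text;
import org.apache.hadoop.security.Credentials;
import org.apache.hadoop.security.token.Token;

class AttemptTokenSketch {
  // Option A (the quoted suggestion): carry a Credentials bag; flexible for
  // additional tokens, but every access is a lookup by alias.
  Token<?> fromCredentials(Credentials creds) {
    return creds.getToken(new Text("APP_ATTEMPT_TOKEN"));  // alias is hypothetical
  }

  // Option B (what the comment argues for inside the RM): an explicit field
  // in the saved attempt state, restored directly with no cache search.
  Token<?> appAttemptToken;
}
{code}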
[jira] [Commented] (YARN-629) Make YarnRemoteException not be rooted at IOException
[ https://issues.apache.org/jira/browse/YARN-629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13649943#comment-13649943 ]

Vinod Kumar Vavilapalli commented on YARN-629:
----------------------------------------------
+1, looks good. Checking it in.

Make YarnRemoteException not be rooted at IOException
-----------------------------------------------------
Key: YARN-629
URL: https://issues.apache.org/jira/browse/YARN-629
Project: Hadoop YARN
Issue Type: Sub-task
Reporter: Xuan Gong
Assignee: Xuan Gong
Attachments: YARN-629.1.patch, YARN-629.2.patch, YARN-629.3.patch, YARN-629.4.patch

After HADOOP-9343, it should be possible for YarnException to not be rooted at IOException.
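A minimal sketch of what the re-rooting means for the class hierarchy; the class names are compressed for illustration, and the real change also touches every API signature:

{code:java}
import java.io.IOException;

// Before: rooted at IOException, so callers could not tell YARN-level
// failures apart from genuine I/O failures in a catch block.
class YarnRemoteExceptionBefore extends IOException {
  YarnRemoteExceptionBefore(String msg) { super(msg); }
}

// After (the direction of this JIRA): rooted at Exception, so methods
// declare IOException and YarnRemoteException separately.
class YarnRemoteExceptionAfter extends Exception {
  YarnRemoteExceptionAfter(String msg) { super(msg); }
}
{code}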
[jira] [Commented] (YARN-18) Make locality in YARN's container assignment and task scheduling pluggable for other deployment topology
[ https://issues.apache.org/jira/browse/YARN-18?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13649947#comment-13649947 ]

Arun C Murthy commented on YARN-18:
-----------------------------------
Junping, there are a lot of changes to digest here. Can you please provide a summary/writeup of your patch? It will help me review it. Thanks!
[jira] [Commented] (YARN-578) NodeManager should use SecureIOUtils for serving logs and intermediate outputs
[ https://issues.apache.org/jira/browse/YARN-578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13649994#comment-13649994 ]

Omkar Vinit Joshi commented on YARN-578:
----------------------------------------
Fixing both comments and throwing different exceptions for different scenarios in ContainerLogsPage. Adding a test. Verified it on a secure setup with NativeIO enabled.

NodeManager should use SecureIOUtils for serving logs and intermediate outputs
-------------------------------------------------------------------------------
Key: YARN-578
URL: https://issues.apache.org/jira/browse/YARN-578
Project: Hadoop YARN
Issue Type: Sub-task
Components: nodemanager
Reporter: Vinod Kumar Vavilapalli
Assignee: Omkar Vinit Joshi
Attachments: yarn-578-20130426.patch, YARN-578-20130506.patch

The log servlets for serving logs and the ShuffleService for serving intermediate outputs should both use SecureIOUtils to avoid symlink attacks.
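The defensive open the ticket calls for looks roughly like this; SecureIOUtils.openForRead is the real Hadoop utility, while the wrapper class and its parameters are an illustrative sketch:

{code:java}
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import org.apache.hadoop.io.SecureIOUtils;

class LogServingSketch {
  // Open a container log only if it is really owned by the expected user.
  // A symlink planted in the log dir pointing at someone else's file fails
  // the ownership check instead of being silently followed.
  FileInputStream openLog(File logFile, String containerUser) throws IOException {
    return SecureIOUtils.openForRead(logFile, containerUser, null);  // group check omitted here
  }
}
{code}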
[jira] [Updated] (YARN-578) NodeManager should use SecureIOUtils for serving logs and intermediate outputs
[ https://issues.apache.org/jira/browse/YARN-578?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Omkar Vinit Joshi updated YARN-578:
-----------------------------------
Attachment: YARN-578-20130506.patch

Adding a krb5.conf file, needed for testing the container logs page.
[jira] [Updated] (YARN-629) Make YarnRemoteException not be rooted at IOException
[ https://issues.apache.org/jira/browse/YARN-629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vinod Kumar Vavilapalli updated YARN-629:
-----------------------------------------
Attachment: YARN-629-branch-2.txt

Minor conflict on branch-2; this patch works on that branch. Compiled and ran TestNodeManagerShutDown, which was conflicting.
[jira] [Commented] (YARN-18) Make locality in YARN's container assignment and task scheduling pluggable for other deployment topology
[ https://issues.apache.org/jira/browse/YARN-18?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13650054#comment-13650054 ]

Luke Lu commented on YARN-18:
-----------------------------
Please find the summary/writeup in the attached "Pluggable topologies with NodeGroup for YARN.pdf".
[jira] [Updated] (YARN-614) Retry attempts automatically for hardware failures or YARN issues and set default app retries to 1
[ https://issues.apache.org/jira/browse/YARN-614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Riccomini updated YARN-614:
---------------------------------
Attachment: YARN-614-4.patch

Adding unit tests to the latest patch.

Retry attempts automatically for hardware failures or YARN issues and set default app retries to 1
----------------------------------------------------------------------------------------------------
Key: YARN-614
URL: https://issues.apache.org/jira/browse/YARN-614
Project: Hadoop YARN
Issue Type: Improvement
Reporter: Bikas Saha
Assignee: Chris Riccomini
Fix For: 2.0.5-beta
Attachments: YARN-614-0.patch, YARN-614-1.patch, YARN-614-2.patch, YARN-614-3.patch, YARN-614-4.patch

Attempts can fail due to a large number of user errors, and they should not be retried unnecessarily. The only reasons YARN should retry an attempt are hardware failures or YARN errors. An NM failing, a lost NM, and NM disk errors are the hardware errors that come to mind.
[jira] [Commented] (YARN-614) Retry attempts automatically for hardware failures or YARN issues and set default app retries to 1
[ https://issues.apache.org/jira/browse/YARN-614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13650086#comment-13650086 ]

Chris Riccomini commented on YARN-614:
--------------------------------------
Bikas/Vinod: any more feedback?
[jira] [Updated] (YARN-633) Change RMAdminProtocol api to throw IOException and YarnRemoteException
[ https://issues.apache.org/jira/browse/YARN-633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xuan Gong updated YARN-633:
---------------------------
Attachment: YARN-633.1.patch

1. Throw IOException and YarnRemoteException from RMAdminProtocol.
2. The related methods in RMAdmin.java already throw IOException and YarnRemoteException, so no change is needed there.
3. Change RMAdminProtocolPBServiceImpl.java to catch IOException, wrap it, and throw out ServiceException.

Change RMAdminProtocol api to throw IOException and YarnRemoteException
------------------------------------------------------------------------
Key: YARN-633
URL: https://issues.apache.org/jira/browse/YARN-633
Project: Hadoop YARN
Issue Type: Sub-task
Reporter: Xuan Gong
Assignee: Xuan Gong
Attachments: YARN-633.1.patch
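Point 3 above is the usual YARN PB-service pattern; a generic sketch with stand-in request/response types (the real method works on the generated protobuf classes):

{code:java}
import java.io.IOException;
import com.google.protobuf.ServiceException;

class RMAdminPBServiceSketch {
  // The RPC layer only lets protobuf's ServiceException through, so the
  // PB service implementation catches the declared exceptions and wraps them.
  Object refreshQueues(Object requestProto) throws ServiceException {
    try {
      return delegate(requestProto);
    } catch (IOException e) {
      throw new ServiceException(e);  // wrap and throw out, as in point 3
    }
  }

  Object delegate(Object requestProto) throws IOException {
    return requestProto;  // placeholder for the real RMAdmin call
  }
}
{code}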
[jira] [Updated] (YARN-632) Change ContainerManager api to throw IOException and YarnRemoteException
[ https://issues.apache.org/jira/browse/YARN-632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xuan Gong updated YARN-632:
---------------------------
Attachment: YARN-632.1.patch

1. Change ContainerManager, ContainerManagerPBServiceImpl, and ContainerManagerImpl to throw IOException and YarnRemoteException.
2. Fix the test changes.
3. No MR changes.

Change ContainerManager api to throw IOException and YarnRemoteException
-------------------------------------------------------------------------
Key: YARN-632
URL: https://issues.apache.org/jira/browse/YARN-632
Project: Hadoop YARN
Issue Type: Sub-task
Reporter: Xuan Gong
Assignee: Xuan Gong
Attachments: YARN-632.1.patch
[jira] [Commented] (YARN-631) Change ClientRMProtocol api to throw IOException and YarnRemoteException
[ https://issues.apache.org/jira/browse/YARN-631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13650131#comment-13650131 ]

Xuan Gong commented on YARN-631:
--------------------------------
MR changes are in MAPREDUCE-5212.

Change ClientRMProtocol api to throw IOException and YarnRemoteException
-------------------------------------------------------------------------
Key: YARN-631
URL: https://issues.apache.org/jira/browse/YARN-631
Project: Hadoop YARN
Issue Type: Sub-task
Reporter: Xuan Gong
Assignee: Xuan Gong
Attachments: YARN-631.1.patch
[jira] [Commented] (YARN-578) NodeManager should use SecureIOUtils for serving logs and intermediate outputs
[ https://issues.apache.org/jira/browse/YARN-578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13650133#comment-13650133 ]

Vinod Kumar Vavilapalli commented on YARN-578:
----------------------------------------------
Okay, I just had an enlightening experience and realized we need to fix more issues:
- LogAggregationService can ignore these permissions and upload sensitive files! Please fix this and write a test to verify that it doesn't happen.
- It seems that when logs are deleted, we are using the correct user to delete them. But can you write tests to validate this for two cases: (1) when log aggregation is enabled and (2) when it isn't?
[jira] [Updated] (YARN-578) NodeManager should use SecureIOUtils for serving and aggregating logs
[ https://issues.apache.org/jira/browse/YARN-578?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vinod Kumar Vavilapalli updated YARN-578:
-----------------------------------------
Summary: NodeManager should use SecureIOUtils for serving and aggregating logs  (was: NodeManager should use SecureIOUtils for serving logs and intermediate outputs)

This ticket is only addressing logs - fixing the title.
[jira] [Resolved] (YARN-421) Create asynchronous AM RM Client
[ https://issues.apache.org/jira/browse/YARN-421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vinod Kumar Vavilapalli resolved YARN-421.
------------------------------------------
Resolution: Duplicate
Assignee: (was: Bikas Saha)

Dup of YARN-417.

Create asynchronous AM RM Client
--------------------------------
Key: YARN-421
URL: https://issues.apache.org/jira/browse/YARN-421
Project: Hadoop YARN
Issue Type: Sub-task
Reporter: Bikas Saha

The basic AM RM client hides the protocol and does request translation for the RM protocol. Creating a version of the client that can hide the heartbeat and provide notifications about interesting events, like receiving containers, will help unburden app developers from implementing this common scenario and let them focus on their app logic.
[jira] [Commented] (YARN-346) InvalidStateTransitonException: Invalid event: INIT_CONTAINER at DONE for ContainerImpl in Node Manager
[ https://issues.apache.org/jira/browse/YARN-346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13650161#comment-13650161 ]

Vinod Kumar Vavilapalli commented on YARN-346:
----------------------------------------------
Devaraj, any more information on this? When can this happen? Is this with MR over YARN or your custom AM?

InvalidStateTransitonException: Invalid event: INIT_CONTAINER at DONE for ContainerImpl in Node Manager
---------------------------------------------------------------------------------------------------------
Key: YARN-346
URL: https://issues.apache.org/jira/browse/YARN-346
Project: Hadoop YARN
Issue Type: Bug
Components: nodemanager
Affects Versions: 2.0.1-alpha, 2.0.0-alpha, 0.23.5
Reporter: Devaraj K
Assignee: Devaraj K
Priority: Critical

{code:xml}
2013-01-16 23:55:52,067 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container: Can't handle this event at current state: Current: [DONE], eventType: [INIT_CONTAINER]
org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: INIT_CONTAINER at DONE
    at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:301)
    at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:43)
    at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:443)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerImpl.handle(ContainerImpl.java:819)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerImpl.handle(ContainerImpl.java:71)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl$ContainerEventDispatcher.handle(ContainerManagerImpl.java:504)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl$ContainerEventDispatcher.handle(ContainerManagerImpl.java:497)
    at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:126)
    at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:75)
    at java.lang.Thread.run(Thread.java:662)
2013-01-16 23:55:52,067 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container: Container container_1358353581666_1326_01_10 transitioned from DONE to null
{code}
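For readers unfamiliar with the NM state machine, the usual shape of a fix for this class of bug is to register the late-arriving event as an ignorable self-transition. A fragment-level sketch against the StateMachineFactory builder (meant to slot into ContainerImpl's transition table; whether this specific patch takes that route is not shown here):

{code:java}
// Sketch: absorb an event that can legally race in after completion, so the
// dispatcher no longer throws InvalidStateTransitonException for it.
// (java.util.EnumSet; the Set-based addTransition is an existing overload.)
stateMachineFactory = stateMachineFactory
    .addTransition(ContainerState.DONE, ContainerState.DONE,
        EnumSet.of(ContainerEventType.INIT_CONTAINER));
{code}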
[jira] [Updated] (YARN-342) RM doesn't retry token renewals
[ https://issues.apache.org/jira/browse/YARN-342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vinod Kumar Vavilapalli updated YARN-342:
-----------------------------------------
Issue Type: Sub-task  (was: Bug)
Parent: YARN-47

RM doesn't retry token renewals
-------------------------------
Key: YARN-342
URL: https://issues.apache.org/jira/browse/YARN-342
Project: Hadoop YARN
Issue Type: Sub-task
Components: resourcemanager
Affects Versions: 3.0.0, 2.0.0-alpha, 0.23.6
Reporter: Daryn Sharp

The RM stops trying to renew tokens if any exception occurs during the renewal. This should be changed to abort only if the exception is {{InvalidToken}}, to allow resilience to transient network failures, issues associated with aborted connections when the NN is overloaded, cluster upgrades, etc.
[jira] [Updated] (YARN-338) RM renews tokens even when maxDate will soon be exceeded
[ https://issues.apache.org/jira/browse/YARN-338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vinod Kumar Vavilapalli updated YARN-338:
-----------------------------------------
Issue Type: Sub-task  (was: Bug)
Parent: YARN-47

RM renews tokens even when maxDate will soon be exceeded
---------------------------------------------------------
Key: YARN-338
URL: https://issues.apache.org/jira/browse/YARN-338
Project: Hadoop YARN
Issue Type: Sub-task
Components: resourcemanager
Affects Versions: 0.23.3, 3.0.0, 2.0.0-alpha
Reporter: Daryn Sharp

The RM renews tokens 90% of the way to the next expiration. When the max lifetime is approaching, the next expiration is always the max lifetime, so the RM renews unnecessarily, more and more frequently, as that hard limit approaches. The RM should stop renewing when the last expiration matches the new expiration.
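The shrinking-interval behavior is easiest to see with numbers; a self-contained toy calculation (timestamps in arbitrary units, not the real renewer code):

{code:java}
public class RenewScheduleSketch {
  public static void main(String[] args) {
    long now = 0;
    final long maxDate = 1000;       // token's hard lifetime limit
    for (int i = 0; i < 5; i++) {
      long expiration = maxDate;     // near maxDate, renewal can't push this further
      long renewAt = now + (long) ((expiration - now) * 0.90);
      System.out.println("renewal #" + i + " scheduled at t=" + renewAt);
      now = renewAt;
    }
    // prints t=900, 990, 999, 999, 999: renewals pile up against the limit.
    // The proposed fix: stop scheduling once the new expiration equals the last one.
  }
}
{code}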
[jira] [Updated] (YARN-503) DelegationTokens will be renewed forever if multiple jobs share tokens and the first one sets JOB_CANCEL_DELEGATION_TOKEN to false
[ https://issues.apache.org/jira/browse/YARN-503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vinod Kumar Vavilapalli updated YARN-503:
-----------------------------------------
Issue Type: Sub-task  (was: Bug)
Parent: YARN-47

DelegationTokens will be renewed forever if multiple jobs share tokens and the first one sets JOB_CANCEL_DELEGATION_TOKEN to false
------------------------------------------------------------------------------------------------------------------------------------
Key: YARN-503
URL: https://issues.apache.org/jira/browse/YARN-503
Project: Hadoop YARN
Issue Type: Sub-task
Components: resourcemanager
Affects Versions: 0.23.3, 3.0.0, 2.0.0-alpha
Reporter: Siddharth Seth
Assignee: Daryn Sharp
Attachments: YARN-503.patch, YARN-503.patch

The first job/app to register a token is the one which DelegationTokenRenewer associates with that specific token. An attempt by subsequent jobs to remove/cancel these shared tokens doesn't work, since the JobId will not match. As a result, even if subsequent jobs have MRJobConfig.JOB_CANCEL_DELEGATION_TOKEN set to true, tokens will not be cancelled when those jobs complete. Tokens will eventually be removed from the RM / JT when the service that issued them considers them expired, or via an explicit cancelDelegationTokens call (not implemented yet in 23). A side effect of this is that the same delegation token ends up being renewed multiple times (a separate TimerTask for each job that uses the token). DelegationTokenRenewer could maintain a reference count/list of jobIds for shared tokens.
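A hypothetical sketch of the reference-count idea in the last sentence; all names here are illustrative, and the real renewer keys tokens differently:

{code:java}
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class SharedTokenRefCount {
  private final Map<String, Set<String>> tokenToJobs =
      new HashMap<String, Set<String>>();

  synchronized void register(String tokenId, String jobId) {
    Set<String> jobs = tokenToJobs.get(tokenId);
    if (jobs == null) {
      jobs = new HashSet<String>();
      tokenToJobs.put(tokenId, jobs);
    }
    jobs.add(jobId);   // one renewal timer per token, not per job
  }

  /** @return true only when the last job sharing the token has finished. */
  synchronized boolean jobFinished(String tokenId, String jobId) {
    Set<String> jobs = tokenToJobs.get(tokenId);
    if (jobs == null) {
      return false;
    }
    jobs.remove(jobId);
    if (jobs.isEmpty()) {
      tokenToJobs.remove(tokenId);
      return true;     // safe to cancel / stop renewing now
    }
    return false;
  }
}
{code}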
[jira] [Commented] (YARN-422) Add NM client library
[ https://issues.apache.org/jira/browse/YARN-422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13650182#comment-13650182 ]

Zhijie Shen commented on YARN-422:
----------------------------------
When I investigated using NMClient in the M/R AM (see MAPREDUCE-5203), I found one limitation of the current design. Note that NMClientAsync doesn't execute the start/stop/query RPC calls immediately when the corresponding APIs are called. Instead, an event is scheduled for a thread to pick up later. The users of NMClientAsync cannot define logic to be run immediately before the RPC calls. For example, in ContainerLauncherImpl, the container state must be checked right before the RPC calls in launch() and kill(). To be logically correct, this logic cannot be moved up to the place where the event is scheduled. Therefore, it's useful to let users define what to do immediately before the three RPC calls. I proposed to add three more APIs in CallbackHandler and to insert the hooks immediately before the three RPC calls, respectively.
[jira] [Commented] (YARN-422) Add NM client library
[ https://issues.apache.org/jira/browse/YARN-422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13650190#comment-13650190 ]

Bikas Saha commented on YARN-422:
---------------------------------
That might be going down a slippery slope. IMO, while it's a good idea to allow the MR app to use this library, it may not be a good idea for the library to adhere to every idiosyncrasy of the MR app. The MR app has its own development legacy, and it may not be possible to retro-fit a client onto it without major surgery on both sides. Thoughts?
[jira] [Commented] (YARN-422) Add NM client library
[ https://issues.apache.org/jira/browse/YARN-422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13650194#comment-13650194 ] Vinod Kumar Vavilapalli commented on YARN-422: -- bq. Is it necessary for the library to stop all containers before stopping itself? I think the callers should have the option of stopping together with killing all containers or stopping without killing all containers. bq. Note that NMClientAsync doesn't execute the start/stop/query RPC calls immediately when the corresponding APIs are called. Instead, an event is scheduled to start a thread later. We don't need more handlers. That code is written to make sure that, e.g., start doesn't happen on a container that has already been asked to stop. We should just pull those changes into your library; they aren't arbitrary code that needs plugin points. Add NM client library - Key: YARN-422 URL: https://issues.apache.org/jira/browse/YARN-422 Project: Hadoop YARN Issue Type: Sub-task Reporter: Bikas Saha Assignee: Zhijie Shen Attachments: AMNMClient_Defination.txt, AMNMClient_Definition_Updated_With_Tests.txt, proposal_v1.pdf, YARN-422.1.patch, YARN-422.2.patch, YARN-422.3.patch Create a simple wrapper over the ContainerManager protocol to hide the details of the protocol implementation. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Assigned] (YARN-623) NodeManagers on RM web-app don't have diagnostic information
[ https://issues.apache.org/jira/browse/YARN-623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mayank Bansal reassigned YARN-623: -- Assignee: Mayank Bansal NodeManagers on RM web-app don't have diagnostic information Key: YARN-623 URL: https://issues.apache.org/jira/browse/YARN-623 Project: Hadoop YARN Issue Type: Bug Reporter: Vinod Kumar Vavilapalli Assignee: Mayank Bansal Labels: usability If the RM for some reason asks NMs to shut down or reboot, it will be very useful to show that information on the UI so that operators can know directly, instead of logging into machines and looking for logs. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-590) Add an optional message to RegisterNodeManagerResponse as to why NM is being asked to resync or shutdown
[ https://issues.apache.org/jira/browse/YARN-590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13650207#comment-13650207 ] Mayank Bansal commented on YARN-590: Taking it over from Vinod. Thanks, Mayank Add an optional message to RegisterNodeManagerResponse as to why NM is being asked to resync or shutdown --- Key: YARN-590 URL: https://issues.apache.org/jira/browse/YARN-590 Project: Hadoop YARN Issue Type: Improvement Reporter: Vinod Kumar Vavilapalli Assignee: Mayank Bansal We should log such a message in the NM itself. This helps in debugging issues on the NM directly, instead of distributed debugging between the RM and NM when such an action is received from the RM. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Assigned] (YARN-590) Add an optional message to RegisterNodeManagerResponse as to why NM is being asked to resync or shutdown
[ https://issues.apache.org/jira/browse/YARN-590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mayank Bansal reassigned YARN-590: -- Assignee: Mayank Bansal (was: Vinod Kumar Vavilapalli) Add an optional message to RegisterNodeManagerResponse as to why NM is being asked to resync or shutdown --- Key: YARN-590 URL: https://issues.apache.org/jira/browse/YARN-590 Project: Hadoop YARN Issue Type: Improvement Reporter: Vinod Kumar Vavilapalli Assignee: Mayank Bansal We should log such a message in the NM itself. This helps in debugging issues on the NM directly, instead of distributed debugging between the RM and NM when such an action is received from the RM. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Assigned] (YARN-563) Add application type to ApplicationReport
[ https://issues.apache.org/jira/browse/YARN-563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mayank Bansal reassigned YARN-563: -- Assignee: Mayank Bansal Add application type to ApplicationReport -- Key: YARN-563 URL: https://issues.apache.org/jira/browse/YARN-563 Project: Hadoop YARN Issue Type: Sub-task Reporter: Thomas Weise Assignee: Mayank Bansal This field is needed to distinguish different types of applications (app master implementations). For example, we may run applications of type XYZ in a cluster alongside MR and would like to filter applications by type. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
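A short sketch of the intended use of the field proposed above. The getApplicationType() getter is the addition this issue asks for and does not exist yet; the Report interface and the type string "XYZ" are stand-ins for ApplicationReport and a real application type:
{code:java}
import java.util.ArrayList;
import java.util.List;

public class AppTypeFilter {
  // Stand-in for ApplicationReport; only the proposed getter is shown.
  interface Report {
    String getApplicationType();
  }

  // Keep only the reports whose type matches, e.g. filter "XYZ" apps out
  // of a cluster that also runs MR.
  public static List<Report> filterByType(List<Report> all, String type) {
    List<Report> matched = new ArrayList<Report>();
    for (Report r : all) {
      if (type.equals(r.getApplicationType())) {
        matched.add(r);
      }
    }
    return matched;
  }
}
{code}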
[jira] [Updated] (YARN-582) Restore appToken for app attempt after RM restart
[ https://issues.apache.org/jira/browse/YARN-582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jian He updated YARN-582: - Attachment: YARN-582.4.patch The new patch: 1. covers restoring the client token 2. moves ApplicationAttemptStateDataPBImpl and ApplicationStateDataPBImpl to a new package, resourcemanager.recovery.records.impl.pb 3. addresses previous comments Restore appToken for app attempt after RM restart - Key: YARN-582 URL: https://issues.apache.org/jira/browse/YARN-582 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Reporter: Bikas Saha Assignee: Jian He Attachments: YARN-582.1.patch, YARN-582.2.patch, YARN-582.3.patch, YARN-582.4.patch These need to be saved and restored on a per app attempt basis. This is required only when work preserving restart is implemented for secure clusters. In non-preserving restart app attempts are killed and so this does not matter. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-633) Change RMAdminProtocol api to throw IOException and YarnRemoteException
[ https://issues.apache.org/jira/browse/YARN-633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13650220#comment-13650220 ] Vinod Kumar Vavilapalli commented on YARN-633: -- +1, trivial patch. Let's see what Jenkins says.. Change RMAdminProtocol api to throw IOException and YarnRemoteException --- Key: YARN-633 URL: https://issues.apache.org/jira/browse/YARN-633 Project: Hadoop YARN Issue Type: Sub-task Reporter: Xuan Gong Assignee: Xuan Gong Attachments: YARN-633.1.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (YARN-582) Restore appToken for app attempt after RM restart
[ https://issues.apache.org/jira/browse/YARN-582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jian He updated YARN-582: - Attachment: YARN-582.5.patch Restore appToken for app attempt after RM restart - Key: YARN-582 URL: https://issues.apache.org/jira/browse/YARN-582 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Reporter: Bikas Saha Assignee: Jian He Attachments: YARN-582.1.patch, YARN-582.2.patch, YARN-582.3.patch, YARN-582.4.patch, YARN-582.5.patch These need to be saved and restored on a per app attempt basis. This is required only when work preserving restart is implemented for secure clusters. In non-preserving restart app attempts are killed and so this does not matter. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-632) Change ContainerManager api to throw IOException and YarnRemoteException
[ https://issues.apache.org/jira/browse/YARN-632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13650222#comment-13650222 ] Vinod Kumar Vavilapalli commented on YARN-632: -- Simple patch, +1. Will commit it if Jenkins blesses it too.. Change ContainerManager api to throw IOException and YarnRemoteException Key: YARN-632 URL: https://issues.apache.org/jira/browse/YARN-632 Project: Hadoop YARN Issue Type: Sub-task Reporter: Xuan Gong Assignee: Xuan Gong Attachments: YARN-632.1.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-633) Change RMAdminProtocol api to throw IOException and YarnRemoteException
[ https://issues.apache.org/jira/browse/YARN-633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13650228#comment-13650228 ] Hadoop QA commented on YARN-633: {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12581950/YARN-633.1.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/860//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/860//console This message is automatically generated. Change RMAdminProtocol api to throw IOException and YarnRemoteException --- Key: YARN-633 URL: https://issues.apache.org/jira/browse/YARN-633 Project: Hadoop YARN Issue Type: Sub-task Reporter: Xuan Gong Assignee: Xuan Gong Attachments: YARN-633.1.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-646) Some issues in Fair Scheduler's document
[ https://issues.apache.org/jira/browse/YARN-646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13650231#comment-13650231 ] Hadoop QA commented on YARN-646: {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12581897/YARN-646.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+0 tests included{color}. The patch appears to be a documentation patch that doesn't require tests. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/861//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/861//console This message is automatically generated. Some issues in Fair Scheduler's document Key: YARN-646 URL: https://issues.apache.org/jira/browse/YARN-646 Project: Hadoop YARN Issue Type: Bug Components: documentation Affects Versions: 2.0.4-alpha Reporter: Dapeng Sun Fix For: 2.0.5-beta Attachments: YARN-646.patch, YARN-646.patch Issues are found in the doc page for Fair Scheduler http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/FairScheduler.html: 1.In the section “Configuration”, It contains two properties named “yarn.scheduler.fair.minimum-allocation-mb”, the second one should be “yarn.scheduler.fair.maximum-allocation-mb” 2.In the section “Allocation file format”, the document tells “ The format contains three types of elements”, but it lists four types of elements following that. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-633) Change RMAdminProtocol api to throw IOException and YarnRemoteException
[ https://issues.apache.org/jira/browse/YARN-633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13650232#comment-13650232 ] Xuan Gong commented on YARN-633: Let the RMAdminProtocol API throw IOException and YarnRemoteException. No new test cases are needed. Change RMAdminProtocol api to throw IOException and YarnRemoteException --- Key: YARN-633 URL: https://issues.apache.org/jira/browse/YARN-633 Project: Hadoop YARN Issue Type: Sub-task Reporter: Xuan Gong Assignee: Xuan Gong Attachments: YARN-633.1.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
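The shape of the signature change under review, sketched with one representative method. The request/response/exception types below are stand-ins, not the real protocol records; the patch is authoritative:
{code:java}
import java.io.IOException;

// Sketch only: each protocol method now declares IOException in addition to
// YarnRemoteException, so implementations no longer have to wrap I/O errors.
public interface RMAdminProtocolSketch {
  class RefreshQueuesRequest {}                 // stand-in record
  class RefreshQueuesResponse {}                // stand-in record
  class YarnRemoteException extends Exception {} // stand-in; real class differs

  RefreshQueuesResponse refreshQueues(RefreshQueuesRequest request)
      throws YarnRemoteException, IOException;  // previously: YarnRemoteException only
}
{code}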
[jira] [Commented] (YARN-614) Retry attempts automatically for hardware failures or YARN issues and set default app retries to 1
[ https://issues.apache.org/jira/browse/YARN-614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13650240#comment-13650240 ] Hadoop QA commented on YARN-614: {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12581949/YARN-614-4.patch against trunk revision . {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-YARN-Build/867//console This message is automatically generated. Retry attempts automatically for hardware failures or YARN issues and set default app retries to 1 -- Key: YARN-614 URL: https://issues.apache.org/jira/browse/YARN-614 Project: Hadoop YARN Issue Type: Improvement Reporter: Bikas Saha Assignee: Chris Riccomini Fix For: 2.0.5-beta Attachments: YARN-614-0.patch, YARN-614-1.patch, YARN-614-2.patch, YARN-614-3.patch, YARN-614-4.patch Attempts can fail due to a large number of user errors and they should not be retried unnecessarily. The only reason YARN should retry an attempt is when the hardware fails or YARN has an error. NM failing, lost NM and NM disk errors are the hardware errors that come to mind. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
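A minimal sketch of the retry policy described in the YARN-614 summary above: only failures attributable to hardware or YARN itself trigger another attempt. The exit codes and names here are made up for illustration and are not the actual patch:
{code:java}
public class AttemptRetryPolicy {
  static final int EXIT_NM_LOST = -100;       // hypothetical code: NM lost
  static final int EXIT_DISKS_FAILED = -101;  // hypothetical code: NM disk failure

  // Retry only system failures; user errors are never retried, and the
  // default maximum is 1 attempt per this issue.
  public static boolean shouldRetry(int exitStatus, int attemptsSoFar,
      int maxAttempts) {
    boolean systemFailure =
        exitStatus == EXIT_NM_LOST || exitStatus == EXIT_DISKS_FAILED;
    return systemFailure && attemptsSoFar < maxAttempts;
  }
}
{code}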
[jira] [Updated] (YARN-647) historyServer can't show container's log when aggregation is not enabled
[ https://issues.apache.org/jira/browse/YARN-647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] shenhong updated YARN-647: -- Description: When yarn.log-aggregation-enable is set to false, after an MR app completes, we can't view the container's logs from the HistoryServer; it shows a message like: Aggregation is not enabled. Try the nodemanager at hd13-vm1:34669 Since we don't want to aggregate the container's logs (it would put pressure on the NameNode), but we sometimes still want to take a look at them, should we show the container's logs through the HistoryServer even if yarn.log-aggregation-enable is set to false? was: When yarn.log-aggregation-enable was set to false, after an MR app completed, we couldn't view the container's logs from the HistoryServer; it showed a message like: Aggregation is not enabled. Try the nodemanager at hd13-vm1:34669 Since we don't want to aggregate the container's logs (it would put pressure on the NameNode), but we sometimes still want to take a look at them, should we show the container's logs through the HistoryServer even if yarn.log-aggregation-enable was set to false? historyServer can't show container's log when aggregation is not enabled Key: YARN-647 URL: https://issues.apache.org/jira/browse/YARN-647 Project: Hadoop YARN Issue Type: Improvement Components: documentation Affects Versions: 0.23.7, 2.0.4-alpha Environment: yarn.log-aggregation-enable=false, HistoryServer will show like this: Aggregation is not enabled. Try the nodemanager at hd13-vm1:34669 Reporter: shenhong Attachments: yarn-647.patch When yarn.log-aggregation-enable is set to false, after an MR app completes, we can't view the container's logs from the HistoryServer; it shows a message like: Aggregation is not enabled. Try the nodemanager at hd13-vm1:34669 Since we don't want to aggregate the container's logs (it would put pressure on the NameNode), but we sometimes still want to take a look at them, should we show the container's logs through the HistoryServer even if yarn.log-aggregation-enable is set to false? -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
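A sketch of the check the issue describes. The property name yarn.log-aggregation-enable comes from the issue itself; everything else (the class, the use of plain java.util.Properties as a stand-in for a Hadoop Configuration) is illustrative:
{code:java}
import java.util.Properties;

public class LogAggregationCheck {
  // When the flag is false, today's HistoryServer only prints the
  // "Aggregation is not enabled" message; the issue asks for a fallback
  // to the NM's local logs instead.
  public static String containerLogSource(Properties conf, String nmAddress) {
    boolean aggregationEnabled = Boolean.parseBoolean(
        conf.getProperty("yarn.log-aggregation-enable", "false"));
    if (aggregationEnabled) {
      return "aggregated logs on HDFS";
    }
    return "Aggregation is not enabled. Try the nodemanager at " + nmAddress;
  }
}
{code}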
[jira] [Commented] (YARN-18) Make locality in YARN's container assignment and task scheduling pluggable for other deployment topology
[ https://issues.apache.org/jira/browse/YARN-18?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13650362#comment-13650362 ] Hadoop QA commented on YARN-18: --- {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12581846/YARN-18-v6.2.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 4 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/871//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/871//console This message is automatically generated. Make locality in YARN's container assignment and task scheduling pluggable for other deployment topology - Key: YARN-18 URL: https://issues.apache.org/jira/browse/YARN-18 Project: Hadoop YARN Issue Type: New Feature Affects Versions: 2.0.3-alpha Reporter: Junping Du Assignee: Junping Du Labels: features Attachments: HADOOP-8474-ContainerAssignmentTaskScheduling-pluggable.patch, MAPREDUCE-4309.patch, MAPREDUCE-4309-v2.patch, MAPREDUCE-4309-v3.patch, MAPREDUCE-4309-v4.patch, MAPREDUCE-4309-v5.patch, MAPREDUCE-4309-v6.patch, MAPREDUCE-4309-v7.patch, Pluggable topologies with NodeGroup for YARN.pdf, YARN-18.patch, YARN-18-v2.patch, YARN-18-v3.1.patch, YARN-18-v3.2.patch, YARN-18-v3.patch, YARN-18-v4.1.patch, YARN-18-v4.2.patch, YARN-18-v4.3.patch, YARN-18-v4.patch, YARN-18-v5.1.patch, YARN-18-v5.patch, YARN-18-v6.1.patch, YARN-18-v6.2.patch, YARN-18-v6.patch There are several classes in YARN’s container assignment and task scheduling algorithms that relate to data locality and were updated to give preference to running a container at other locality levels besides node-local and rack-local (like nodegroup-local). This proposes to make these data structures/algorithms pluggable, like: SchedulerNode, RMNodeImpl, etc. The inner class ScheduledRequests was made a package-level class so it would be easier to create a subclass, ScheduledRequestsWithNodeGroup. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
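A sketch, not the actual patch, of the kind of pluggability the YARN-18 description above proposes: locality levels beyond node-local and rack-local, resolved through a replaceable topology component. All names here are hypothetical:
{code:java}
// Hypothetical pluggable locality resolver. A default implementation would
// only distinguish NODE_LOCAL / RACK_LOCAL / OFF_SWITCH; a NodeGroup-aware
// subclass would also recognize NODEGROUP_LOCAL, much as the issue describes
// subclassing ScheduledRequests into ScheduledRequestsWithNodeGroup.
public interface LocalityResolver {
  // Ordered from most to least preferred.
  enum Locality { NODE_LOCAL, NODEGROUP_LOCAL, RACK_LOCAL, OFF_SWITCH }

  Locality resolve(String requestedLocation, String candidateNode);
}
{code}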
[jira] [Commented] (YARN-326) Add multi-resource scheduling to the fair scheduler
[ https://issues.apache.org/jira/browse/YARN-326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13650369#comment-13650369 ] Hadoop QA commented on YARN-326: {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12581638/YARN-326-2.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 5 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/873//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/873//console This message is automatically generated. Add multi-resource scheduling to the fair scheduler --- Key: YARN-326 URL: https://issues.apache.org/jira/browse/YARN-326 Project: Hadoop YARN Issue Type: New Feature Components: scheduler Affects Versions: 2.0.2-alpha Reporter: Sandy Ryza Assignee: Sandy Ryza Attachments: FairSchedulerDRFDesignDoc-1.pdf, FairSchedulerDRFDesignDoc.pdf, YARN-326-1.patch, YARN-326-1.patch, YARN-326-2.patch, YARN-326.patch, YARN-326.patch With YARN-2 in, the capacity scheduler has the ability to schedule based on multiple resources, using dominant resource fairness. The fair scheduler should be able to do multiple resource scheduling as well, also using dominant resource fairness. More details to come on how the corner cases with fair scheduler configs such as min and max resources will be handled. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-422) Add NM client library
[ https://issues.apache.org/jira/browse/YARN-422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13650370#comment-13650370 ] Hadoop QA commented on YARN-422: {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12581791/YARN-422.3.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files. {color:red}-1 javac{color}. The patch appears to cause the build to fail. Console output: https://builds.apache.org/job/PreCommit-YARN-Build/874//console This message is automatically generated. Add NM client library - Key: YARN-422 URL: https://issues.apache.org/jira/browse/YARN-422 Project: Hadoop YARN Issue Type: Sub-task Reporter: Bikas Saha Assignee: Zhijie Shen Attachments: AMNMClient_Defination.txt, AMNMClient_Definition_Updated_With_Tests.txt, proposal_v1.pdf, YARN-422.1.patch, YARN-422.2.patch, YARN-422.3.patch Create a simple wrapper over the ContainerManager protocol to hide the details of the protocol implementation. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-507) Add interface visibility and stability annotations to FS interfaces/classes
[ https://issues.apache.org/jira/browse/YARN-507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13650373#comment-13650373 ] Hadoop QA commented on YARN-507: {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12581734/yarn-507.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:red}-1 javac{color}. The patch appears to cause the build to fail. Console output: https://builds.apache.org/job/PreCommit-YARN-Build/875//console This message is automatically generated. Add interface visibility and stability annotations to FS interfaces/classes --- Key: YARN-507 URL: https://issues.apache.org/jira/browse/YARN-507 Project: Hadoop YARN Issue Type: Bug Components: scheduler Affects Versions: 2.0.3-alpha Reporter: Karthik Kambatla Assignee: Karthik Kambatla Priority: Minor Labels: scheduler Attachments: yarn-507.patch Many of FS classes/interfaces are missing annotations on visibility and stability. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-617) In unsecure mode, AM can fake resource requirements
[ https://issues.apache.org/jira/browse/YARN-617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13650378#comment-13650378 ] Vinod Kumar Vavilapalli commented on YARN-617: -- bq. Been reviewing YARN-582, so this will conflict with that patch, mainly the ones in AMLauncher. Linking tickets.. Actually, looking back, none of the AMLauncher changes should even be there - let's focus only on the containerToken stuff in this ticket. In unsecure mode, AM can fake resource requirements - Key: YARN-617 URL: https://issues.apache.org/jira/browse/YARN-617 Project: Hadoop YARN Issue Type: Sub-task Reporter: Vinod Kumar Vavilapalli Assignee: Omkar Vinit Joshi Priority: Minor Attachments: YARN-617.20130501.1.patch, YARN-617.20130501.patch, YARN-617.20130502.patch Without security, it is impossible to completely avoid AMs faking resources. We can at least make it as difficult as possible by using the same container tokens and the RM-NM shared key mechanism over the unauthenticated RM-NM channel. At a minimum, this will avoid accidental bugs in AMs in unsecure mode. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-582) Restore appToken for app attempt after RM restart
[ https://issues.apache.org/jira/browse/YARN-582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13650390#comment-13650390 ] Hadoop QA commented on YARN-582: {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12581994/YARN-582.5.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/876//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/876//console This message is automatically generated. Restore appToken for app attempt after RM restart - Key: YARN-582 URL: https://issues.apache.org/jira/browse/YARN-582 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Reporter: Bikas Saha Assignee: Jian He Attachments: YARN-582.1.patch, YARN-582.2.patch, YARN-582.3.patch, YARN-582.4.patch, YARN-582.5.patch These need to be saved and restored on a per app attempt basis. This is required only when work preserving restart is implemented for secure clusters. In non-preserving restart app attempts are killed and so this does not matter. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-614) Retry attempts automatically for hardware failures or YARN issues and set default app retries to 1
[ https://issues.apache.org/jira/browse/YARN-614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13650432#comment-13650432 ] Hadoop QA commented on YARN-614: {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12581998/YARN-614-5.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:red}-1 javac{color}. The applied patch generated 1367 javac compiler warnings (more than the trunk's current 1365 warnings). {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/877//testReport/ Javac warnings: https://builds.apache.org/job/PreCommit-YARN-Build/877//artifact/trunk/patchprocess/diffJavacWarnings.txt Console output: https://builds.apache.org/job/PreCommit-YARN-Build/877//console This message is automatically generated. Retry attempts automatically for hardware failures or YARN issues and set default app retries to 1 -- Key: YARN-614 URL: https://issues.apache.org/jira/browse/YARN-614 Project: Hadoop YARN Issue Type: Improvement Reporter: Bikas Saha Assignee: Chris Riccomini Fix For: 2.0.5-beta Attachments: YARN-614-0.patch, YARN-614-1.patch, YARN-614-2.patch, YARN-614-3.patch, YARN-614-4.patch, YARN-614-5.patch Attempts can fail due to a large number of user errors and they should not be retried unnecessarily. The only reason YARN should retry an attempt is when the hardware fails or YARN has an error. NM failing, lost NM and NM disk errors are the hardware errors that come to mind. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (YARN-18) Make locality in YARN's container assignment and task scheduling pluggable for other deployment topology
[ https://issues.apache.org/jira/browse/YARN-18?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated YARN-18: --- Attachment: YARN-18-v6.3.patch Minor changes to address Luke's recent comments. Make locality in YARN's container assignment and task scheduling pluggable for other deployment topology - Key: YARN-18 URL: https://issues.apache.org/jira/browse/YARN-18 Project: Hadoop YARN Issue Type: New Feature Affects Versions: 2.0.3-alpha Reporter: Junping Du Assignee: Junping Du Labels: features Attachments: HADOOP-8474-ContainerAssignmentTaskScheduling-pluggable.patch, MAPREDUCE-4309.patch, MAPREDUCE-4309-v2.patch, MAPREDUCE-4309-v3.patch, MAPREDUCE-4309-v4.patch, MAPREDUCE-4309-v5.patch, MAPREDUCE-4309-v6.patch, MAPREDUCE-4309-v7.patch, Pluggable topologies with NodeGroup for YARN.pdf, YARN-18.patch, YARN-18-v2.patch, YARN-18-v3.1.patch, YARN-18-v3.2.patch, YARN-18-v3.patch, YARN-18-v4.1.patch, YARN-18-v4.2.patch, YARN-18-v4.3.patch, YARN-18-v4.patch, YARN-18-v5.1.patch, YARN-18-v5.patch, YARN-18-v6.1.patch, YARN-18-v6.2.patch, YARN-18-v6.3.patch, YARN-18-v6.patch There are several classes in YARN’s container assignment and task scheduling algorithms that relate to data locality and were updated to give preference to running a container at other locality levels besides node-local and rack-local (like nodegroup-local). This proposes to make these data structures/algorithms pluggable, like: SchedulerNode, RMNodeImpl, etc. The inner class ScheduledRequests was made a package-level class so it would be easier to create a subclass, ScheduledRequestsWithNodeGroup. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (YARN-649) Make container logs available over HTTP in plain text
Sandy Ryza created YARN-649: --- Summary: Make container logs available over HTTP in plain text Key: YARN-649 URL: https://issues.apache.org/jira/browse/YARN-649 Project: Hadoop YARN Issue Type: Improvement Components: nodemanager Affects Versions: 2.0.4-alpha Reporter: Sandy Ryza Assignee: Sandy Ryza -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-18) Make locality in YARN's container assignment and task scheduling pluggable for other deployment topology
[ https://issues.apache.org/jira/browse/YARN-18?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13650474#comment-13650474 ] Hadoop QA commented on YARN-18: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12582025/YARN-18-v6.3.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 4 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: org.apache.hadoop.yarn.server.resourcemanager.TestApplicationMasterLauncher org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebApp org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestCapacityScheduler org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestQueueParsing org.apache.hadoop.yarn.server.resourcemanager.TestResourceManager org.apache.hadoop.yarn.server.resourcemanager.TestResourceTrackerService org.apache.hadoop.yarn.server.resourcemanager.TestApplicationCleanup org.apache.hadoop.yarn.server.resourcemanager.TestClientRMService org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesCapacitySched org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesNodes org.apache.hadoop.yarn.server.resourcemanager.applicationsmanager.TestAMRMRPCNodeUpdates org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestApplicationLimits org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.TestFifoScheduler org.apache.hadoop.yarn.server.resourcemanager.TestRM org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestLeafQueue org.apache.hadoop.yarn.server.resourcemanager.security.TestClientTokens org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestParentQueue org.apache.hadoop.yarn.server.resourcemanager.security.TestApplicationTokens org.apache.hadoop.yarn.server.resourcemanager.applicationsmanager.TestAMRMRPCResponseId org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFSSchedulerApp org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairScheduler org.apache.hadoop.yarn.server.resourcemanager.TestApplicationACLs org.apache.hadoop.yarn.server.resourcemanager.TestAMAuthorization org.apache.hadoop.yarn.server.resourcemanager.webapp.TestNodesPage {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/878//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/878//console This message is automatically generated. 
Make locality in YARN's container assignment and task scheduling pluggable for other deployment topology - Key: YARN-18 URL: https://issues.apache.org/jira/browse/YARN-18 Project: Hadoop YARN Issue Type: New Feature Affects Versions: 2.0.3-alpha Reporter: Junping Du Assignee: Junping Du Labels: features Attachments: HADOOP-8474-ContainerAssignmentTaskScheduling-pluggable.patch, MAPREDUCE-4309.patch, MAPREDUCE-4309-v2.patch, MAPREDUCE-4309-v3.patch, MAPREDUCE-4309-v4.patch, MAPREDUCE-4309-v5.patch, MAPREDUCE-4309-v6.patch, MAPREDUCE-4309-v7.patch, Pluggable topologies with NodeGroup for YARN.pdf, YARN-18.patch, YARN-18-v2.patch, YARN-18-v3.1.patch, YARN-18-v3.2.patch, YARN-18-v3.patch, YARN-18-v4.1.patch, YARN-18-v4.2.patch, YARN-18-v4.3.patch,
[jira] [Updated] (YARN-649) Make container logs available over HTTP in plain text
[ https://issues.apache.org/jira/browse/YARN-649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sandy Ryza updated YARN-649: Attachment: YARN-649.patch Make container logs available over HTTP in plain text - Key: YARN-649 URL: https://issues.apache.org/jira/browse/YARN-649 Project: Hadoop YARN Issue Type: Improvement Components: nodemanager Affects Versions: 2.0.4-alpha Reporter: Sandy Ryza Assignee: Sandy Ryza Attachments: YARN-649.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (YARN-649) Make container logs available over HTTP in plain text
[ https://issues.apache.org/jira/browse/YARN-649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sandy Ryza updated YARN-649: Description: It would be good to make container logs available over the REST API for MAPREDUCE-4362 and so that they can be accessed programmatically in general. Make container logs available over HTTP in plain text - Key: YARN-649 URL: https://issues.apache.org/jira/browse/YARN-649 Project: Hadoop YARN Issue Type: Improvement Components: nodemanager Affects Versions: 2.0.4-alpha Reporter: Sandy Ryza Assignee: Sandy Ryza Attachments: YARN-649.patch It would be good to make container logs available over the REST API for MAPREDUCE-4362 and so that they can be accessed programmatically in general. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-649) Make container logs available over HTTP in plain text
[ https://issues.apache.org/jira/browse/YARN-649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13650479#comment-13650479 ] Sandy Ryza commented on YARN-649: - I'm not sure about the best way to make this secure, so any pointers there would be helpful. Make container logs available over HTTP in plain text - Key: YARN-649 URL: https://issues.apache.org/jira/browse/YARN-649 Project: Hadoop YARN Issue Type: Improvement Components: nodemanager Affects Versions: 2.0.4-alpha Reporter: Sandy Ryza Assignee: Sandy Ryza Attachments: YARN-649.patch It would be good to make container logs available over the REST API for MAPREDUCE-4362 and so that they can be accessed programmatically in general. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
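A sketch of the kind of programmatic access this issue would enable. The URL path below is hypothetical; the actual NM web-service route is whatever the YARN-649 patch defines:
{code:java}
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URL;

public class ContainerLogFetcher {
  // Fetch one container log file as plain text over HTTP.
  // The "/node/containerlogs/..." path is an assumed example, not the
  // confirmed endpoint.
  public static String fetchLogs(String nmHttpAddress, String containerId)
      throws IOException {
    URL url = new URL("http://" + nmHttpAddress
        + "/node/containerlogs/" + containerId + "/stdout");
    StringBuilder out = new StringBuilder();
    BufferedReader in = new BufferedReader(
        new InputStreamReader(url.openStream(), "UTF-8"));
    try {
      String line;
      while ((line = in.readLine()) != null) {
        out.append(line).append('\n');
      }
    } finally {
      in.close();
    }
    return out.toString();
  }
}
{code}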
[jira] [Updated] (YARN-507) Add interface visibility and stability annotations to FS interfaces/classes
[ https://issues.apache.org/jira/browse/YARN-507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla updated YARN-507: -- Attachment: yarn-507.patch Looks like I uploaded the wrong patch earlier. Uploading the right one now. Add interface visibility and stability annotations to FS interfaces/classes --- Key: YARN-507 URL: https://issues.apache.org/jira/browse/YARN-507 Project: Hadoop YARN Issue Type: Bug Components: scheduler Affects Versions: 2.0.3-alpha Reporter: Karthik Kambatla Assignee: Karthik Kambatla Priority: Minor Labels: scheduler Attachments: yarn-507.patch Many of FS classes/interfaces are missing annotations on visibility and stability. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (YARN-507) Add interface visibility and stability annotations to FS interfaces/classes
[ https://issues.apache.org/jira/browse/YARN-507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla updated YARN-507: -- Attachment: (was: yarn-507.patch) Add interface visibility and stability annotations to FS interfaces/classes --- Key: YARN-507 URL: https://issues.apache.org/jira/browse/YARN-507 Project: Hadoop YARN Issue Type: Bug Components: scheduler Affects Versions: 2.0.3-alpha Reporter: Karthik Kambatla Assignee: Karthik Kambatla Priority: Minor Labels: scheduler Attachments: yarn-507.patch Many of FS classes/interfaces are missing annotations on visibility and stability. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-649) Make container logs available over HTTP in plain text
[ https://issues.apache.org/jira/browse/YARN-649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13650492#comment-13650492 ] Hadoop QA commented on YARN-649: {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12582033/YARN-649.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 7 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/879//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/879//console This message is automatically generated. Make container logs available over HTTP in plain text - Key: YARN-649 URL: https://issues.apache.org/jira/browse/YARN-649 Project: Hadoop YARN Issue Type: Improvement Components: nodemanager Affects Versions: 2.0.4-alpha Reporter: Sandy Ryza Assignee: Sandy Ryza Attachments: YARN-649.patch It would be good to make container logs available over the REST API for MAPREDUCE-4362 and so that they can be accessed programatically in general. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-507) Add interface visibility and stability annotations to FS interfaces/classes
[ https://issues.apache.org/jira/browse/YARN-507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13650493#comment-13650493 ] Hadoop QA commented on YARN-507: {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12582034/yarn-507.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/880//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/880//console This message is automatically generated. Add interface visibility and stability annotations to FS interfaces/classes --- Key: YARN-507 URL: https://issues.apache.org/jira/browse/YARN-507 Project: Hadoop YARN Issue Type: Bug Components: scheduler Affects Versions: 2.0.3-alpha Reporter: Karthik Kambatla Assignee: Karthik Kambatla Priority: Minor Labels: scheduler Attachments: yarn-507.patch Many of FS classes/interfaces are missing annotations on visibility and stability. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-507) Add interface visibility and stability annotations to FS interfaces/classes
[ https://issues.apache.org/jira/browse/YARN-507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13650510#comment-13650510 ] Karthik Kambatla commented on YARN-507: --- No tests because the patch concerns only annotations. Ran {{TestFairScheduler}} that uses {{FSSchedulerApp#getCurrentReservation()}} locally for sanity and QA ran other tests. Add interface visibility and stability annotations to FS interfaces/classes --- Key: YARN-507 URL: https://issues.apache.org/jira/browse/YARN-507 Project: Hadoop YARN Issue Type: Bug Components: scheduler Affects Versions: 2.0.3-alpha Reporter: Karthik Kambatla Assignee: Karthik Kambatla Priority: Minor Labels: scheduler Attachments: yarn-507.patch Many of FS classes/interfaces are missing annotations on visibility and stability. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
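For readers unfamiliar with these annotations, here is the kind of change the YARN-507 patch makes. The annotations are Hadoop's standard ones; the class below is a made-up example, not a class from the patch:
{code:java}
import org.apache.hadoop.classification.InterfaceAudience.Private;
import org.apache.hadoop.classification.InterfaceStability.Unstable;

@Private   // internal to YARN, not for external consumers
@Unstable  // no compatibility guarantees across releases
public class ExampleFairSchedulerInternal {
  // implementation details that external code must not depend on
}
{code}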
[jira] [Commented] (YARN-45) Scheduler feedback to AM to release containers
[ https://issues.apache.org/jira/browse/YARN-45?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13650527#comment-13650527 ] Carlo Curino commented on YARN-45: -- bq. Would be great if you could add a version number to your patches. Sorry, we weren't sure of the current convention. {quote} - PreemptionMessage.strict should perhaps be named strictContract explicitly. You did name the setters and the getters verbosely, which is good. - You should mark all the API getters and setters to be synchronized. There are similar locking bugs in other existing records too, but we are tracking them elsewhere. - PreemptionContainer.getId() - Javadoc should refer to containers instead of Resource? - PreemptionContract.getContainers() - the Javadoc referring to ResourceManager ("may also include a @link PreemptionContract that, if satisfied, may replace these") doesn't make sense to me. {quote} Fixed all of these; the last one was a copy/paste of an older version of the code. Thanks for catching these. [~bikassaha]: we took another attempt at the javadoc, but it's probably still not sufficient. We opened YARN-XXX to track documentation of this feature in the AM how-to, which we'll address presently. (thanks everyone for the great feedback!) Scheduler feedback to AM to release containers -- Key: YARN-45 URL: https://issues.apache.org/jira/browse/YARN-45 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Reporter: Chris Douglas Assignee: Carlo Curino Attachments: YARN-45.patch, YARN-45.patch, YARN-45.patch, YARN-45.patch, YARN-45.patch, YARN-45.patch, YARN-45_summary_of_alternatives.pdf The ResourceManager strikes a balance between cluster utilization and strict enforcement of resource invariants in the cluster. Individual allocations of containers must be reclaimed, or reserved, to restore the global invariants when cluster load shifts. In some cases, the ApplicationMaster can respond to fluctuations in resource availability without losing the work already completed by that task (MAPREDUCE-4584). Supplying it with this information would be helpful for overall cluster utilization [1]. To this end, we want to establish a protocol for the RM to ask the AM to release containers. [1] http://research.yahoo.com/files/yl-2012-003.pdf -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
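A rough sketch of the records discussed in the review above, based only on the names mentioned in this thread (strictContract, PreemptionContract, PreemptionContainer); the attached patch is authoritative for the real API:
{code:java}
import java.util.Set;

// Illustrative shape only; not the committed YARN-45 interfaces.
public interface PreemptionMessageSketch {
  interface PreemptionContainer {
    String getId();  // the container the RM may reclaim
  }

  interface PreemptionContract {
    // Containers that, if released, would satisfy the contract.
    Set<PreemptionContainer> getContainers();
  }

  // Non-negotiable part: these containers will be reclaimed regardless.
  PreemptionContract getStrictContract();

  // Negotiable part: the AM may instead yield equivalent resources.
  PreemptionContract getContract();
}
{code}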
[jira] [Updated] (YARN-45) Scheduler feedback to AM to release containers
[ https://issues.apache.org/jira/browse/YARN-45?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carlo Curino updated YARN-45: - Attachment: YARN-45.1.patch Scheduler feedback to AM to release containers -- Key: YARN-45 URL: https://issues.apache.org/jira/browse/YARN-45 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Reporter: Chris Douglas Assignee: Carlo Curino Attachments: YARN-45.1.patch, YARN-45.patch, YARN-45.patch, YARN-45.patch, YARN-45.patch, YARN-45.patch, YARN-45.patch, YARN-45_summary_of_alternatives.pdf The ResourceManager strikes a balance between cluster utilization and strict enforcement of resource invariants in the cluster. Individual allocations of containers must be reclaimed- or reserved- to restore the global invariants when cluster load shifts. In some cases, the ApplicationMaster can respond to fluctuations in resource availability without losing the work already completed by that task (MAPREDUCE-4584). Supplying it with this information would be helpful for overall cluster utilization [1]. To this end, we want to establish a protocol for the RM to ask the AM to release containers. [1] http://research.yahoo.com/files/yl-2012-003.pdf -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-569) CapacityScheduler: support for preemption (using a capacity monitor)
[ https://issues.apache.org/jira/browse/YARN-569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13650541#comment-13650541 ] Carlo Curino commented on YARN-569: --- Hi Bikas, I noticed that your patch and ours share a common architectural style, i.e., the preemption policy runs in a separate thread on a timer. Moreover, they also seem to mostly agree on the I/O to/from the policy: we both grab state from the CapacityScheduler (e.g., the root of the queues) as an input, and both trigger actions that affect the CapacityScheduler. In our design we tried to put the actions behind an event handler, but I think the ideas are very similar. In fact, I would guess that a good portion of your patch could be placed behind the ScheduleEditPolicy interface we defined. As I mentioned in some of our conversations, this is nice because the ScheduleEditPolicy API, I think, can be used also for other purposes (e.g., for a deadline monitor, or an I/O-starvation monitor, etc.). Basically, it allows implementing monitors that focus on specific (even orthogonal) properties of the schedule, which can observe the cluster state through the CapacityScheduler viewpoint and try to affect it somehow (via events in our design). As an example, imagine a deadline monitor trying to affect jobs' completion times by tweaking the capacity of the queues, or the ordering of jobs in a queue, etc. While I am not sure this API will see a broad public :-) it would be nice to agree on it. As for the specifics of what you do with all the enforcement stuff, I haven't read the code carefully enough to follow the details. Actually, if you have time to write a high-level summary of it and post it here, it would be useful to orient us through your patch. While I think it would be too convoluted to try to merge the two approaches, I would like to see whether, other than the ScheduleEditPolicy, there is more we can factor out to make your version of the policy easier to write. I know it wouldn't be hard to evolve this later on, as the code is rather isolated, but if we can do something now that makes it easier, I think it is worth considering. CapacityScheduler: support for preemption (using a capacity monitor) Key: YARN-569 URL: https://issues.apache.org/jira/browse/YARN-569 Project: Hadoop YARN Issue Type: Sub-task Components: capacityscheduler Reporter: Carlo Curino Assignee: Carlo Curino Attachments: 3queues.pdf, CapScheduler_with_preemption.pdf, preemption.2.patch, YARN-569.patch, YARN-569.patch There is a tension between the fast-paced reactive role of the CapacityScheduler, which needs to respond quickly to applications' resource requests and node updates, and the more introspective, time-based considerations needed to observe and correct for capacity balance. To this purpose, instead of hacking the delicate mechanisms of the CapacityScheduler directly, we opted to add support for preemption by means of a Capacity Monitor, which can optionally be run as a separate service (much like the NMLivelinessMonitor). 
The capacity monitor (similarly to equivalent functionality in the fair scheduler) runs on intervals (e.g., every 3 seconds), observes the state of the assignment of resources to queues from the capacity scheduler, performs off-line computation to determine whether preemption is needed and how best to edit the current schedule to improve capacity, and generates events that produce four possible actions: # Container de-reservations # Resource-based preemptions # Container-based preemptions # Container killing The actions listed above are progressively more costly, and it is up to the policy to use them as desired to achieve the rebalancing goals. Note that due to the lag in the effect of these actions, the policy should operate at the macroscopic level (e.g., preempt tens of containers from a queue) and not try to tightly and consistently micromanage container allocations. - Preemption policy (ProportionalCapacityPreemptionPolicy): - Preemption policies are by design pluggable; in the following we present an initial policy (ProportionalCapacityPreemptionPolicy) we have been experimenting with. The ProportionalCapacityPreemptionPolicy behaves as follows: # it gathers from the scheduler the state of the queues, in particular their current capacity, guaranteed capacity and pending requests (*) # if there are pending requests from queues that are under capacity, it computes a new ideal balanced state (**) # it computes the set of preemptions needed to repair the current schedule and achieve capacity balance (accounting for natural completion rates, and respecting
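A condensed sketch of the monitor-on-a-timer architecture described above: an edit policy invoked periodically, reading scheduler state and emitting corrective events. The interface and method names follow the thread loosely (ScheduleEditPolicy, editSchedule) and are not the actual patch:
{code:java}
import java.util.Timer;
import java.util.TimerTask;

public class CapacityMonitorSketch {
  // Hypothetical policy interface, per the ScheduleEditPolicy idea above.
  interface ScheduleEditPolicy {
    // Observe scheduler state and, if queues are out of balance, issue
    // de-reservations, preemptions, or kills through the scheduler's
    // event handler.
    void editSchedule();
  }

  // Run the policy every intervalMs milliseconds (e.g., 3000), off the
  // scheduler's fast path, on a daemon timer thread.
  public static Timer start(final ScheduleEditPolicy policy, long intervalMs) {
    Timer timer = new Timer("capacity-monitor", true /* daemon */);
    timer.schedule(new TimerTask() {
      @Override
      public void run() {
        policy.editSchedule();
      }
    }, intervalMs, intervalMs);
    return timer;
  }
}
{code}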