[jira] [Updated] (YARN-1027) Implement RMHAServiceProtocol
[ https://issues.apache.org/jira/browse/YARN-1027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] nemon lou updated YARN-1027: Assignee: Karthik Kambatla (was: nemon lou) Implement RMHAServiceProtocol - Key: YARN-1027 URL: https://issues.apache.org/jira/browse/YARN-1027 Project: Hadoop YARN Issue Type: Sub-task Reporter: Bikas Saha Assignee: Karthik Kambatla Implement the existing HAServiceProtocol from Hadoop Common. This protocol is the single point of interaction between the RM and HA clients/services. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
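The HAServiceProtocol interface in Hadoop Common defines monitorHealth(), transitionToActive()/transitionToStandby(), and getServiceStatus(). A minimal sketch of what an RM-side implementation might look like follows; the class name RMHAServiceImpl and the comments about which services get started or stopped are illustrative assumptions, not the committed patch.

{code}
import java.io.IOException;

import org.apache.hadoop.ha.HAServiceProtocol;
import org.apache.hadoop.ha.HAServiceStatus;
import org.apache.hadoop.ha.HealthCheckFailedException;

// Hypothetical sketch of implementing HAServiceProtocol for the RM.
public class RMHAServiceImpl implements HAServiceProtocol {
  private volatile HAServiceState state = HAServiceState.INITIALIZING;

  @Override
  public synchronized void transitionToActive(StateChangeRequestInfo reqInfo)
      throws IOException {
    if (state == HAServiceState.ACTIVE) {
      return; // already active, nothing to do
    }
    // start the RM's "active" services (schedulers, client/AM/NM RPC servers, ...)
    state = HAServiceState.ACTIVE;
  }

  @Override
  public synchronized void transitionToStandby(StateChangeRequestInfo reqInfo)
      throws IOException {
    if (state == HAServiceState.STANDBY) {
      return; // already standby
    }
    // stop the "active" services, keeping only what a standby RM needs
    state = HAServiceState.STANDBY;
  }

  @Override
  public void monitorHealth() throws HealthCheckFailedException {
    // throw HealthCheckFailedException if essential RM services are unhealthy
  }

  @Override
  public HAServiceStatus getServiceStatus() {
    HAServiceStatus status = new HAServiceStatus(state);
    if (state == HAServiceState.ACTIVE || state == HAServiceState.STANDBY) {
      status.setReadyToBecomeActive();
    } else {
      status.setNotReadyToBecomeActive("RM is still initializing");
    }
    return status;
  }
}
{code}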
[jira] [Created] (YARN-1033) Expose RM active/standby state to web UI and metrics
nemon lou created YARN-1033: --- Summary: Expose RM active/standby state to web UI and metrics Key: YARN-1033 URL: https://issues.apache.org/jira/browse/YARN-1033 Project: Hadoop YARN Issue Type: Sub-task Affects Versions: 2.1.0-beta Reporter: nemon lou Both the active and the standby RM shall expose its web server and show its current state (active or standby) on the web page. Cluster metrics also need this state for monitoring. RM web services shall refuse client requests unless the query is for the RM state. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (YARN-1033) Expose RM active/standby state to web UI and metrics
[ https://issues.apache.org/jira/browse/YARN-1033?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] nemon lou updated YARN-1033: Assignee: nemon lou Expose RM active/standby state to web UI and metrics Key: YARN-1033 URL: https://issues.apache.org/jira/browse/YARN-1033 Project: Hadoop YARN Issue Type: Sub-task Affects Versions: 2.1.0-beta Reporter: nemon lou Assignee: nemon lou Both the active and the standby RM shall expose its web server and show its current state (active or standby) on the web page. Cluster metrics also need this state for monitoring. RM web services shall refuse client requests unless the query is for the RM state. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (YARN-1033) Expose RM active/standby state to web UI and metrics
[ https://issues.apache.org/jira/browse/YARN-1033?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] nemon lou updated YARN-1033: Description: Both the active and the standby RM shall expose its web server and show its current state (active or standby) on the web page. Cluster metrics also need this state for monitoring. Standby RM web services shall refuse client requests unless the query is for the RM state. was: Both the active and the standby RM shall expose its web server and show its current state (active or standby) on the web page. Cluster metrics also need this state for monitoring. RM web services shall refuse client requests unless the query is for the RM state. Expose RM active/standby state to web UI and metrics Key: YARN-1033 URL: https://issues.apache.org/jira/browse/YARN-1033 Project: Hadoop YARN Issue Type: Sub-task Affects Versions: 2.1.0-beta Reporter: nemon lou Assignee: nemon lou Both the active and the standby RM shall expose its web server and show its current state (active or standby) on the web page. Cluster metrics also need this state for monitoring. Standby RM web services shall refuse client requests unless the query is for the RM state. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-1024) Define a virtual core unambiguously
[ https://issues.apache.org/jira/browse/YARN-1024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13730622#comment-13730622 ] Junping Du commented on YARN-1024: -- I would also prefer #1 for scheduling resources, as #2 is only meaningful for charging/billing, as [~philip] mentioned above. For #2, a simple calculation like ECU (released in 2006/2007, but unchanged over 7 years, which goes against Moore's law :)) has two commonly questioned scenarios: - Assigning multiple slow p-cores (4 x 1G) to a single-threaded task (1 x 4G) that asked for one fast core (mapped to multiple vcores) cannot help performance and wastes CPU: the unused cores still consume timer interrupts, the idle loop costs resources, and maintaining a consistent memory view among multiple vCPUs consumes resources as well. All of this is unnecessary. Another case is that it is possible for the OS CPU scheduler to migrate a single-threaded workload among multiple vCPUs, thereby losing cache locality. - Assigning a single faster p-core (1 x 4G) to a multi-threaded task asking for multiple slow cores (4 x 1G) will cause performance issues, as Steve mentioned above and in YARN-972: too much overhead from process context switches and cache misses. #1 sounds more reasonable; 1 vcore does not have to be 1 pcore, but could be mapped to 1 vCPU under virtualization and can be overcommitted later (with a configured ratio) by the virtualization platform. Define a virtual core unambiguously --- Key: YARN-1024 URL: https://issues.apache.org/jira/browse/YARN-1024 Project: Hadoop YARN Issue Type: Improvement Reporter: Arun C Murthy Assignee: Arun C Murthy We need to clearly define the meaning of a virtual core unambiguously so that it's easy to migrate applications between clusters. For example, here is Amazon EC2's definition of an ECU: http://aws.amazon.com/ec2/faqs/#What_is_an_EC2_Compute_Unit_and_why_did_you_introduce_it Essentially we need to clearly define a YARN Virtual Core (YVC). Equivalently, we can use ECU itself: *One EC2 Compute Unit provides the equivalent CPU capacity of a 1.0-1.2 GHz 2007 Opteron or 2007 Xeon processor.* -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-1031) JQuery UI components reference external css in branch-23
[ https://issues.apache.org/jira/browse/YARN-1031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13730708#comment-13730708 ] Hudson commented on YARN-1031: -- SUCCESS: Integrated in Hadoop-Hdfs-0.23-Build #691 (See [https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/691/]) YARN-1031. JQuery UI components reference external css in branch-23 (jeagles) (jeagles: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1510775) * /hadoop/common/branches/branch-0.23/hadoop-yarn-project/CHANGES.txt * /hadoop/common/branches/branch-0.23/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/webapp/view/JQueryUI.java * /hadoop/common/branches/branch-0.23/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/webapps/static/jquery * /hadoop/common/branches/branch-0.23/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/webapps/static/jquery/themes-1.8.16 * /hadoop/common/branches/branch-0.23/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/webapps/static/jquery/themes-1.8.16/base * /hadoop/common/branches/branch-0.23/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/webapps/static/jquery/themes-1.8.16/base/jquery-ui.css JQuery UI components reference external css in branch-23 Key: YARN-1031 URL: https://issues.apache.org/jira/browse/YARN-1031 Project: Hadoop YARN Issue Type: Bug Affects Versions: 0.23.9 Reporter: Jonathan Eagles Assignee: Jonathan Eagles Fix For: 0.23.10 Attachments: YARN-1031-branch-0.23.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (YARN-696) Enable multiple states to be specified in Resource Manager apps REST call
[ https://issues.apache.org/jira/browse/YARN-696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Trevor Lorimer updated YARN-696: Attachment: (was: YARN-696.diff) Enable multiple states to be specified in Resource Manager apps REST call Key: YARN-696 URL: https://issues.apache.org/jira/browse/YARN-696 Project: Hadoop YARN Issue Type: Improvement Components: resourcemanager Affects Versions: 2.0.4-alpha Reporter: Trevor Lorimer Assignee: Trevor Lorimer Priority: Trivial Within the YARN Resource Manager REST API, the GET call which returns all Applications can be filtered by a single State query parameter (http://<rm http address:port>/ws/v1/cluster/apps). There are 8 possible states (New, Submitted, Accepted, Running, Finishing, Finished, Failed, Killed). If no state parameter is specified, all states are returned; however, if a subset of states is required, then multiple REST calls are needed (a maximum of 7). The proposal is to be able to specify multiple states in a single REST call. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
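To make the proposal concrete, a single request such as http://<rm http address:port>/ws/v1/cluster/apps?states=ACCEPTED,RUNNING could return applications in several states at once. A small sketch of the server-side parsing is below; the parameter name "states" and the helper class are assumptions for illustration, not the attached patch.

{code}
import java.util.EnumSet;

import org.apache.hadoop.yarn.api.records.YarnApplicationState;

// Hypothetical helper: turn "accepted,running" into the set of states to filter on.
public final class AppStateFilter {
  private AppStateFilter() {
  }

  public static EnumSet<YarnApplicationState> parse(String statesParam) {
    if (statesParam == null || statesParam.isEmpty()) {
      // no filter requested: keep the current behavior and return every state
      return EnumSet.allOf(YarnApplicationState.class);
    }
    EnumSet<YarnApplicationState> states =
        EnumSet.noneOf(YarnApplicationState.class);
    for (String s : statesParam.split(",")) {
      // throws IllegalArgumentException for an unknown state name
      states.add(YarnApplicationState.valueOf(s.trim().toUpperCase()));
    }
    return states;
  }
}
{code}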
[jira] [Updated] (YARN-696) Enable multiple states to be specified in Resource Manager apps REST call
[ https://issues.apache.org/jira/browse/YARN-696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Trevor Lorimer updated YARN-696: Attachment: YARN-696.diff Enable multiple states to be specified in Resource Manager apps REST call Key: YARN-696 URL: https://issues.apache.org/jira/browse/YARN-696 Project: Hadoop YARN Issue Type: Improvement Components: resourcemanager Affects Versions: 2.0.4-alpha Reporter: Trevor Lorimer Assignee: Trevor Lorimer Priority: Trivial Attachments: YARN-696.diff Within the YARN Resource Manager REST API, the GET call which returns all Applications can be filtered by a single State query parameter (http://<rm http address:port>/ws/v1/cluster/apps). There are 8 possible states (New, Submitted, Accepted, Running, Finishing, Finished, Failed, Killed). If no state parameter is specified, all states are returned; however, if a subset of states is required, then multiple REST calls are needed (a maximum of 7). The proposal is to be able to specify multiple states in a single REST call. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-696) Enable multiple states to be specified in Resource Manager apps REST call
[ https://issues.apache.org/jira/browse/YARN-696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13730822#comment-13730822 ] Hadoop QA commented on YARN-696: {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12596352/YARN-696.diff against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/1659//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/1659//console This message is automatically generated. Enable multiple states to be specified in Resource Manager apps REST call Key: YARN-696 URL: https://issues.apache.org/jira/browse/YARN-696 Project: Hadoop YARN Issue Type: Improvement Components: resourcemanager Affects Versions: 2.0.4-alpha Reporter: Trevor Lorimer Assignee: Trevor Lorimer Priority: Trivial Attachments: YARN-696.diff Within the YARN Resource Manager REST API, the GET call which returns all Applications can be filtered by a single State query parameter (http://<rm http address:port>/ws/v1/cluster/apps). There are 8 possible states (New, Submitted, Accepted, Running, Finishing, Finished, Failed, Killed). If no state parameter is specified, all states are returned; however, if a subset of states is required, then multiple REST calls are needed (a maximum of 7). The proposal is to be able to specify multiple states in a single REST call. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-90) NodeManager should identify failed disks becoming good again
[ https://issues.apache.org/jira/browse/YARN-90?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13730850#comment-13730850 ] Ravi Prakash commented on YARN-90: -- Do we know what we need to do for this JIRA? I can see that in DirectoryCollection we need to be able to remove entries from failedDirs, and to recognize this fact in the LocalDirsHandler service. Would anything else need to be done? NodeManager should identify failed disks becoming good again - Key: YARN-90 URL: https://issues.apache.org/jira/browse/YARN-90 Project: Hadoop YARN Issue Type: Sub-task Components: nodemanager Reporter: Ravi Gummadi MAPREDUCE-3121 makes NodeManager identify disk failures. But once a disk goes down, it is marked as failed forever. To reuse that disk (after it becomes good), NodeManager needs a restart. This JIRA is to improve NodeManager to reuse good disks (which could have gone bad some time back). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
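As a rough sketch of the direction described (the class and method names below, such as checkFailedDirs, are hypothetical and do not match DirectoryCollection's actual fields), the idea is to periodically re-test the directories in failedDirs and move any that have recovered back to the good list, with LocalDirsHandlerService picking up the change:

{code}
import java.io.File;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of re-admitting recovered disks; not the actual NM code.
class DirHealthChecker {
  private final List<String> localDirs = new ArrayList<String>();
  private final List<String> failedDirs = new ArrayList<String>();

  /** Re-test every failed dir and move the recovered ones back to the good list. */
  synchronized boolean checkFailedDirs() {
    boolean recoveredAny = false;
    List<String> stillFailed = new ArrayList<String>();
    for (String dir : failedDirs) {
      File f = new File(dir);
      // a very simple health test: the directory exists and is readable/writable again
      if (f.isDirectory() && f.canRead() && f.canWrite()) {
        localDirs.add(dir);
        recoveredAny = true;
      } else {
        stillFailed.add(dir);
      }
    }
    failedDirs.clear();
    failedDirs.addAll(stillFailed);
    // LocalDirsHandlerService would be told about recovered dirs so new containers can use them
    return recoveredAny;
  }
}
{code}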
[jira] [Commented] (YARN-1032) NPE in RackResolve
[ https://issues.apache.org/jira/browse/YARN-1032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13730895#comment-13730895 ] Hadoop QA commented on YARN-1032: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12596360/YARN-1032.2.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/1660//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/1660//console This message is automatically generated. NPE in RackResolve -- Key: YARN-1032 URL: https://issues.apache.org/jira/browse/YARN-1032 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.0.5-alpha Environment: linux Reporter: Lohit Vijayarenu Priority: Minor Attachments: YARN-1032.1.patch, YARN-1032.2.patch We found a case where our rack resolve script was not returning rack due to problem with resolving host address. This exception was see in RackResolver.java as NPE, ultimately caught in RMContainerAllocator. {noformat} 2013-08-01 07:11:37,708 ERROR [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: ERROR IN CONTACTING RM. java.lang.NullPointerException at org.apache.hadoop.yarn.util.RackResolver.coreResolve(RackResolver.java:99) at org.apache.hadoop.yarn.util.RackResolver.resolve(RackResolver.java:92) at org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator$ScheduledRequests.assignMapsWithLocality(RMContainerAllocator.java:1039) at org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator$ScheduledRequests.assignContainers(RMContainerAllocator.java:925) at org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator$ScheduledRequests.assign(RMContainerAllocator.java:861) at org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator$ScheduledRequests.access$400(RMContainerAllocator.java:681) at org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.heartbeat(RMContainerAllocator.java:219) at org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator$1.run(RMCommunicator.java:243) at java.lang.Thread.run(Thread.java:722) {noformat} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
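The NPE comes from dereferencing the list returned by the topology mapping when the script fails to resolve a host. A guard along the following lines (sketched here; the exact shape of RackResolver.coreResolve differs) would fall back to the default rack instead:

{code}
import java.util.Collections;
import java.util.List;

import org.apache.hadoop.net.DNSToSwitchMapping;
import org.apache.hadoop.net.NetworkTopology;
import org.apache.hadoop.net.Node;
import org.apache.hadoop.net.NodeBase;

// Illustrative null-safe resolution; not the committed YARN-1032 patch.
final class SafeRackResolve {
  static Node resolve(DNSToSwitchMapping dnsToSwitchMapping, String hostName) {
    List<String> rNameList =
        dnsToSwitchMapping.resolve(Collections.singletonList(hostName));
    String rName;
    if (rNameList == null || rNameList.isEmpty() || rNameList.get(0) == null) {
      // the rack script returned nothing for this host; don't NPE, use the default rack
      rName = NetworkTopology.DEFAULT_RACK;
    } else {
      rName = rNameList.get(0);
    }
    return new NodeBase(hostName, rName);
  }
}
{code}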
[jira] [Commented] (YARN-1024) Define a virtual core unambiguously
[ https://issues.apache.org/jira/browse/YARN-1024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13730961#comment-13730961 ] Eli Collins commented on YARN-1024: --- bq. vcores are optional anyway (only used in DRF) Sandy corrected me offline that while this is true for the CS, it is not true for the FS, which by default (w/o DRF) will not schedule more containers' worth of vcores than the configured vcores (which seems like it could lead to under-utilization, given that the default resource calculator only uses memory and not every container needs a whole core). By default the # of vcores is the # of cores on the machine, and MR asks for containers w/ 1 vcore, so we effectively have vcore=pcore today as the default (reinforced by the decision to remove the notion of pcore in YARN-782). Define a virtual core unambiguously --- Key: YARN-1024 URL: https://issues.apache.org/jira/browse/YARN-1024 Project: Hadoop YARN Issue Type: Improvement Reporter: Arun C Murthy Assignee: Arun C Murthy We need to clearly define the meaning of a virtual core unambiguously so that it's easy to migrate applications between clusters. For example, here is Amazon EC2's definition of an ECU: http://aws.amazon.com/ec2/faqs/#What_is_an_EC2_Compute_Unit_and_why_did_you_introduce_it Essentially we need to clearly define a YARN Virtual Core (YVC). Equivalently, we can use ECU itself: *One EC2 Compute Unit provides the equivalent CPU capacity of a 1.0-1.2 GHz 2007 Opteron or 2007 Xeon processor.* -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (YARN-1034) Remove experimental in the Fair Scheduler documentation
Sandy Ryza created YARN-1034: Summary: Remove experimental in the Fair Scheduler documentation Key: YARN-1034 URL: https://issues.apache.org/jira/browse/YARN-1034 Project: Hadoop YARN Issue Type: Task Components: documentation, scheduler Affects Versions: 2.1.0-beta Environment: The YARN Fair Scheduler is largely stable now, and should no longer be declared experimental. Reporter: Sandy Ryza Assignee: Karthik Kambatla -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-291) Dynamic resource configuration
[ https://issues.apache.org/jira/browse/YARN-291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13731037#comment-13731037 ] Alejandro Abdelnur commented on YARN-291: - Are we talking about an admin call to the RM that would set a resource correction on a per-node basis, with the RM adjusting the NM-reported resource capacity based on that correction? This would not require changes in the NMs. And potentially the correction could be applied on the node update event before it reaches the scheduler impl, thus staying transparent to the scheduler impl. And if we want to persist these corrections, this could be done by the RM itself. If I got things right, I'm OK with the approach. Dynamic resource configuration -- Key: YARN-291 URL: https://issues.apache.org/jira/browse/YARN-291 Project: Hadoop YARN Issue Type: New Feature Components: nodemanager, scheduler Reporter: Junping Du Assignee: Junping Du Labels: features Attachments: Elastic Resources for YARN-v0.2.pdf, YARN-291-AddClientRMProtocolToSetNodeResource-03.patch, YARN-291-all-v1.patch, YARN-291-core-HeartBeatAndScheduler-01.patch, YARN-291-JMXInterfaceOnNM-02.patch, YARN-291-OnlyUpdateWhenResourceChange-01-fix.patch, YARN-291-YARNClientCommandline-04.patch The current Hadoop YARN resource management logic assumes per-node resources are static during the lifetime of the NM process. Allowing run-time configuration of per-node resources will give us finer-grained resource elasticity. This allows Hadoop workloads to coexist with other workloads on the same hardware efficiently, whether or not the environment is virtualized. More background and design details can be found in the attached proposal. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
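To make the approach concrete, here is a hypothetical illustration of "apply a per-node correction inside the RM before the scheduler sees the node"; none of the names below come from the attached patches.

{code}
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

import org.apache.hadoop.yarn.api.records.NodeId;
import org.apache.hadoop.yarn.api.records.Resource;

// Hypothetical sketch: an admin-populated table of per-node resource overrides that
// the RM consults when processing a node update, so schedulers need no changes.
class NodeResourceOverrides {
  private final Map<NodeId, Resource> overrides =
      new ConcurrentHashMap<NodeId, Resource>();

  /** Called from an admin RPC/CLI to change what the RM treats as the node's capacity. */
  void setOverride(NodeId nodeId, Resource newCapacity) {
    overrides.put(nodeId, newCapacity);
  }

  /** Applied to the NM-reported capability before the scheduler sees the node update. */
  Resource effectiveCapacity(NodeId nodeId, Resource reported) {
    Resource override = overrides.get(nodeId);
    return override != null ? override : reported;
  }
}
{code}

Persisting such an overrides table in the RM, as suggested above, would keep the corrections across RM restarts without involving the NMs.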
[jira] [Assigned] (YARN-160) nodemanagers should obtain cpu/memory values from underlying OS
[ https://issues.apache.org/jira/browse/YARN-160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alejandro Abdelnur reassigned YARN-160: --- Assignee: (was: Alejandro Abdelnur) I have my hands full at the moment; I won't be able to take this one on for a while. Making it unassigned in case somebody wants to take a stab at it. nodemanagers should obtain cpu/memory values from underlying OS --- Key: YARN-160 URL: https://issues.apache.org/jira/browse/YARN-160 Project: Hadoop YARN Issue Type: Improvement Components: nodemanager Affects Versions: 2.0.3-alpha Reporter: Alejandro Abdelnur Fix For: 2.1.0-beta As mentioned in YARN-2 *NM memory and CPU configs*: currently these values come from the NM's config; we should be able to obtain them from the OS (i.e., in the case of Linux, from /proc/meminfo and /proc/cpuinfo). As this is highly OS dependent, we should have an interface that obtains this information. In addition, implementations of this interface should be able to specify a mem/cpu offset (the amount of mem/cpu not to be made available as a YARN resource); this would allow reserving mem/cpu for the OS and other services outside of YARN containers. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
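A sketch of the kind of interface described above (the names NodeResourceCalculator, getAvailableMemoryMB, etc. are made up for illustration; the offset parameters are the configurable reservation mentioned in the description):

{code}
// Hypothetical interface for OS-derived node resources; not an existing YARN API.
interface NodeResourceCalculator {
  /** Physical memory in MB, minus a configured offset reserved for the OS/other services. */
  int getAvailableMemoryMB();

  /** Number of cores, minus a configured offset reserved for the OS/other services. */
  int getAvailableVCores();
}

// A Linux implementation would parse /proc/meminfo and /proc/cpuinfo, for example:
class LinuxNodeResourceCalculator implements NodeResourceCalculator {
  private final int memOffsetMB;
  private final int cpuOffset;

  LinuxNodeResourceCalculator(int memOffsetMB, int cpuOffset) {
    this.memOffsetMB = memOffsetMB;
    this.cpuOffset = cpuOffset;
  }

  @Override
  public int getAvailableMemoryMB() {
    // MemTotal is reported in kB in /proc/meminfo
    long memTotalKb = readMemTotalKb("/proc/meminfo");
    return (int) Math.max(0, memTotalKb / 1024 - memOffsetMB);
  }

  @Override
  public int getAvailableVCores() {
    return Math.max(0, Runtime.getRuntime().availableProcessors() - cpuOffset);
  }

  private long readMemTotalKb(String path) {
    try (java.io.BufferedReader r =
        new java.io.BufferedReader(new java.io.FileReader(path))) {
      String line;
      while ((line = r.readLine()) != null) {
        if (line.startsWith("MemTotal:")) {
          // e.g. "MemTotal:       16384516 kB" -> 16384516
          return Long.parseLong(line.replaceAll("[^0-9]", ""));
        }
      }
    } catch (java.io.IOException e) {
      // fall through to a conservative default
    }
    return 0;
  }
}
{code}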
[jira] [Commented] (YARN-1019) YarnConfiguration validation for local disk path and http addresses.
[ https://issues.apache.org/jira/browse/YARN-1019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13731053#comment-13731053 ] Joseph Kniest commented on YARN-1019: - Hi, new to yarn, where do I look in the code base for this? YarnConfiguration validation for local disk path and http addresses. Key: YARN-1019 URL: https://issues.apache.org/jira/browse/YARN-1019 Project: Hadoop YARN Issue Type: Improvement Reporter: Omkar Vinit Joshi Priority: Minor Labels: newbie Today we are not validating certain configuration parameters set in yarn-site.xml. 1) Configurations related to paths... such as local-dirs, log-dirs.. Our NM crashes during startup if they are set to relative paths rather than absolute paths. To avoid such failures we can enforce checks (absolute paths) before startup . i.e. before we actually startup...( i.e. directory handler creating directories). 2) Also for all the parameters using hostname:port unless we are ok with default port. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-1019) YarnConfiguration validation for local disk path and http addresses.
[ https://issues.apache.org/jira/browse/YARN-1019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13731060#comment-13731060 ] Omkar Vinit Joshi commented on YARN-1019: - Hi, welcome to the YARN group. You can probably get started from here: [Checkout Code|http://wiki.apache.org/hadoop/HowToContribute]. Subscribe to the user/dev mailing lists and ask questions there (general questions such as how to check out the code or issues running it); here we usually discuss problems related to the current issue. To get started, run YARN and a simple MapReduce program. Once you are familiar with that, you can take up one of the tickets marked as newbie and start working on it. YarnConfiguration validation for local disk path and http addresses. Key: YARN-1019 URL: https://issues.apache.org/jira/browse/YARN-1019 Project: Hadoop YARN Issue Type: Improvement Reporter: Omkar Vinit Joshi Priority: Minor Labels: newbie Today we are not validating certain configuration parameters set in yarn-site.xml. 1) Configurations related to paths... such as local-dirs, log-dirs.. Our NM crashes during startup if they are set to relative paths rather than absolute paths. To avoid such failures we can enforce checks (absolute paths) before startup . i.e. before we actually startup...( i.e. directory handler creating directories). 2) Also for all the parameters using hostname:port unless we are ok with default port. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-1035) NPE when trying to create an error message response of RPC
[ https://issues.apache.org/jira/browse/YARN-1035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13731063#comment-13731063 ] Steve Loughran commented on YARN-1035: -- {code} 8]] INFO DataNode.clienttrace (BlockSender.java:sendBlock(695)) - src: /127.0.0.1:58247, dest: /127.0.0.1:58308, bytes: 5439, op: HDFS_READ, cliID: DFSClient_NONMAPREDUCE_-539248485_697, offset: 0, srvID: DS-502087106-10.11.3.237-58247-1375813762260, blockid: BP-384257351-10.11.3.237-1375813760919:blk_1073741832_1008, duration: 293000 2013-08-06 11:29:30,802 [IPC Server handler 1 on 58224] INFO localizer.LocalizedResource (LocalizedResource.java:handle(196)) - Resource hdfs://localhost:58246/user/stevel/.hoya/cluster/TestLiveRegionService/generated/hbase-env.sh transitioned from DOWNLOADING to LOCALIZED 2013-08-06 11:29:30,802 [AsyncDispatcher event handler] INFO container.Container (ContainerImpl.java:handle(860)) - Container container_1375813755119_0001_01_02 transitioned from LOCALIZING to LOCALIZED 2013-08-06 11:29:30,921 [AsyncDispatcher event handler] INFO container.Container (ContainerImpl.java:handle(860)) - Container container_1375813755119_0001_01_02 transitioned from LOCALIZED to RUNNING 2013-08-06 11:29:31,140 [ContainersLauncher #0] INFO nodemanager.DefaultContainerExecutor (DefaultContainerExecutor.java:launchContainer(189)) - launchContainer: [nice, -n, 0, bash, -c, /Users/stevel/Projects/Hortonworks/Projects/hoya/target/TestLiveRegionService/TestLiveRegionService-localDir-nm-0_0/usercache/stevel/appcache/application_1375813755119_0001/container_1375813755119_0001_01_02/default_container_executor.sh] 2013-08-06 11:29:31,169 [ProcessThread(sid:0 cport:-1):] INFO server.PrepRequestProcessor (PrepRequestProcessor.java:pRequest(627)) - Got user-level KeeperException when processing sessionid:0x14054e3f67f0001 type:delete cxid:0x13 zxid:0xc txntype:-1 reqpath:n/a Error Path:/yarnapps_hoya_stevel_TestLiveRegionService/backup-masters/10.11.3.237,58296,1375813768541 Error:KeeperErrorCode = NoNode for /yarnapps_hoya_stevel_TestLiveRegionService/backup-masters/10.11.3.237,58296,1375813768541 2013-08-06 11:29:31,713 [Socket Reader #1 for port 58246] INFO ipc.Server (Server.java:doRead(800)) - IPC Server listener on 58246: readAndProcess from client 127.0.0.1 threw exception [java.lang.NullPointerException] java.lang.NullPointerException at org.apache.hadoop.ipc.protobuf.RpcHeaderProtos$RpcResponseHeaderProto$Builder.setErrorMsg(RpcHeaderProtos.java:1843) at org.apache.hadoop.ipc.Server.setupResponse(Server.java:2330) at org.apache.hadoop.ipc.Server.access$2900(Server.java:121) at org.apache.hadoop.ipc.Server$Connection.doSaslReply(Server.java:1430) at org.apache.hadoop.ipc.Server$Connection.initializeAuthContext(Server.java:1548) at org.apache.hadoop.ipc.Server$Connection.readAndProcess(Server.java:1507) at org.apache.hadoop.ipc.Server$Listener.doRead(Server.java:791) at org.apache.hadoop.ipc.Server$Listener$Reader.doRunLoop(Server.java:590) at org.apache.hadoop.ipc.Server$Listener$Reader.run(Server.java:565) 2013-08-06 11:29:31,729 [Socket Reader #1 for port 58246] INFO ipc.Server (Server.java:doRead(800)) - IPC Server listener on 58246: readAndProcess from client 127.0.0.1 threw exception [java.lang.NullPointerException] java.lang.NullPointerException at org.apache.hadoop.ipc.protobuf.RpcHeaderProtos$RpcResponseHeaderProto$Builder.setErrorMsg(RpcHeaderProtos.java:1843) at org.apache.hadoop.ipc.Server.setupResponse(Server.java:2330) at 
org.apache.hadoop.ipc.Server.access$2900(Server.java:121) at org.apache.hadoop.ipc.Server$Connection.doSaslReply(Server.java:1430) at org.apache.hadoop.ipc.Server$Connection.initializeAuthContext(Server.java:1548) at org.apache.hadoop.ipc.Server$Connection.readAndProcess(Server.java:1507) at org.apache.hadoop.ipc.Server$Listener.doRead(Server.java:791) at org.apache.hadoop.ipc.Server$Listener$Reader.doRunLoop(Server.java:590) at org.apache.hadoop.ipc.Server$Listener$Reader.run(Server.java:565) 2013-08-06 11:29:32,070 [ProcessThread(sid:0 cport:-1):] INFO server.PrepRequestProcessor (PrepRequestProcessor.java:pRequest2Txn(476)) - Processed session termination for sessionid: 0x14054e3f67f0001 {code} NPE when trying to create an error message response of RPC -- Key: YARN-1035 URL: https://issues.apache.org/jira/browse/YARN-1035 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.1.1-beta Reporter: Steve Loughran I'm seeing an NPE which is raised when the server is trying to create an error response to send back to the caller and there is no error text. The root
[jira] [Created] (YARN-1035) NPE when trying to create an error message response of RPC
Steve Loughran created YARN-1035: Summary: NPE when trying to create an error message response of RPC Key: YARN-1035 URL: https://issues.apache.org/jira/browse/YARN-1035 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.1.1-beta Reporter: Steve Loughran I'm seeing an NPE which is raised when the server is trying to create an error response to send back to the caller and there is no error text. The root cause is probably somewhere in SASL, but sending something back to the caller would seem preferable to NPE-ing server-side. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-1035) NPE when trying to create an error message response of RPC
[ https://issues.apache.org/jira/browse/YARN-1035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13731064#comment-13731064 ] Steve Loughran commented on YARN-1035: -- Looking up the stack, it's in {code} private void doSaslReply(Exception ioe) throws IOException { setupResponse(authFailedResponse, authFailedCall, RpcStatusProto.FATAL, RpcErrorCodeProto.FATAL_UNAUTHORIZED, null, ioe.getClass().getName(), ioe.getLocalizedMessage()); responder.doRespond(authFailedCall); } {code} This code assumes that the {{ioe.getLocalizedMessage()}} always returns a non-null string. Some exceptions do return null. For a robust response, {{ioe.toString()}} should be used. NPE when trying to create an error message response of RPC -- Key: YARN-1035 URL: https://issues.apache.org/jira/browse/YARN-1035 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.1.1-beta Reporter: Steve Loughran I'm seeing an NPE which is raised when the server is trying to create an error response to send back to the caller and there is no error text. The root cause is probably somewhere in SASL, but sending something back to the caller would seem preferable to NPE-ing server-side. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
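A minimal illustration of the defensive change being suggested (the helper below is made up for the sketch; the real fix would go in Server.setupResponse / doSaslReply):

{code}
// Sketch only: prefer toString(), which is never null, over getLocalizedMessage().
final class ErrorText {
  private ErrorText() {
  }

  static String of(Throwable t) {
    String msg = t.getLocalizedMessage();
    return msg != null ? msg : t.toString();
  }
}
{code}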
[jira] [Commented] (YARN-1019) YarnConfiguration validation for local disk path and http addresses.
[ https://issues.apache.org/jira/browse/YARN-1019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13731106#comment-13731106 ] Joseph Kniest commented on YARN-1019: - Thanks, I've done all that: built the latest from source and kicked off a sample MapReduce job. Now I'm looking for where this is handled in the code. YarnConfiguration validation for local disk path and http addresses. Key: YARN-1019 URL: https://issues.apache.org/jira/browse/YARN-1019 Project: Hadoop YARN Issue Type: Improvement Reporter: Omkar Vinit Joshi Priority: Minor Labels: newbie Today we are not validating certain configuration parameters set in yarn-site.xml. 1) Configurations related to paths... such as local-dirs, log-dirs.. Our NM crashes during startup if they are set to relative paths rather than absolute paths. To avoid such failures we can enforce checks (absolute paths) before startup . i.e. before we actually startup...( i.e. directory handler creating directories). 2) Also for all the parameters using hostname:port unless we are ok with default port. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (YARN-985) Nodemanager should log where a resource was localized
[ https://issues.apache.org/jira/browse/YARN-985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ravi Prakash updated YARN-985: -- Attachment: YARN-985.branch-0.23.patch For branch-0.23 Nodemanager should log where a resource was localized - Key: YARN-985 URL: https://issues.apache.org/jira/browse/YARN-985 Project: Hadoop YARN Issue Type: Improvement Components: nodemanager Affects Versions: 3.0.0, 2.0.4-alpha, 0.23.9 Reporter: Ravi Prakash Assignee: Ravi Prakash Attachments: YARN-985.branch-0.23.patch, YARN-985.patch, YARN-985.patch When a resource is localized, we should log WHERE on the local disk it was localized. This helps in debugging afterwards (e.g. if the disk was to go bad). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (YARN-985) Nodemanager should log where a resource was localized
[ https://issues.apache.org/jira/browse/YARN-985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ravi Prakash updated YARN-985: -- Attachment: YARN-985.patch This is for trunk. I've incorporated Omkar's suggestion now Nodemanager should log where a resource was localized - Key: YARN-985 URL: https://issues.apache.org/jira/browse/YARN-985 Project: Hadoop YARN Issue Type: Improvement Components: nodemanager Affects Versions: 3.0.0, 2.0.4-alpha, 0.23.9 Reporter: Ravi Prakash Assignee: Ravi Prakash Attachments: YARN-985.branch-0.23.patch, YARN-985.patch, YARN-985.patch When a resource is localized, we should log WHERE on the local disk it was localized. This helps in debugging afterwards (e.g. if the disk was to go bad). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-1019) YarnConfiguration validation for local disk path and http addresses.
[ https://issues.apache.org/jira/browse/YARN-1019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13731120#comment-13731120 ] Omkar Vinit Joshi commented on YARN-1019: - Start with YarnConfiguration.java, track all the places where it is used, and add the path-related and host:port checks. Once done, upload a patch; someone will take a look at it. Make sure your patch file follows a format like jira-number-date-in-yyyy-mm-dd.number.patch; it will help reviewers. Also make sure your code is well formatted and your changes are as minimal as possible. You are set then. Start contributing!! YarnConfiguration validation for local disk path and http addresses. Key: YARN-1019 URL: https://issues.apache.org/jira/browse/YARN-1019 Project: Hadoop YARN Issue Type: Improvement Reporter: Omkar Vinit Joshi Priority: Minor Labels: newbie Today we are not validating certain configuration parameters set in yarn-site.xml. 1) Configurations related to paths... such as local-dirs, log-dirs.. Our NM crashes during startup if they are set to relative paths rather than absolute paths. To avoid such failures we can enforce checks (absolute paths) before startup . i.e. before we actually startup...( i.e. directory handler creating directories). 2) Also for all the parameters using hostname:port unless we are ok with default port. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
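For orientation, the two checks being asked for could look roughly like this (the class and method names are assumptions, not existing YarnConfiguration methods; the real patch would hook such checks into NM/RM startup):

{code}
import java.io.File;

import org.apache.hadoop.conf.Configuration;

// Hypothetical validation helpers sketched for YARN-1019.
final class YarnConfigValidator {
  private YarnConfigValidator() {
  }

  /** 1) Path-type keys (local-dirs, log-dirs, ...) must contain absolute paths. */
  static void validateAbsolutePaths(Configuration conf, String key) {
    for (String dir : conf.getTrimmedStrings(key)) {
      if (!new File(dir).isAbsolute()) {
        throw new IllegalArgumentException(
            key + " must contain absolute paths, but got: " + dir);
      }
    }
  }

  /** 2) Address-type keys should be host:port unless relying on a known default port. */
  static void validateHostPort(Configuration conf, String key, int defaultPort) {
    String addr = conf.get(key);
    if (addr != null && !addr.contains(":") && defaultPort <= 0) {
      throw new IllegalArgumentException(
          key + " should be of the form host:port, but got: " + addr);
    }
  }
}
{code}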
[jira] [Commented] (YARN-985) Nodemanager should log where a resource was localized
[ https://issues.apache.org/jira/browse/YARN-985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13731125#comment-13731125 ] Omkar Vinit Joshi commented on YARN-985: +1 Nodemanager should log where a resource was localized - Key: YARN-985 URL: https://issues.apache.org/jira/browse/YARN-985 Project: Hadoop YARN Issue Type: Improvement Components: nodemanager Affects Versions: 3.0.0, 2.0.4-alpha, 0.23.9 Reporter: Ravi Prakash Assignee: Ravi Prakash Attachments: YARN-985.branch-0.23.patch, YARN-985.patch, YARN-985.patch When a resource is localized, we should log WHERE on the local disk it was localized. This helps in debugging afterwards (e.g. if the disk was to go bad). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-1008) MiniYARNCluster with multiple nodemanagers, all nodes have same key for allocations
[ https://issues.apache.org/jira/browse/YARN-1008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13731136#comment-13731136 ] Alejandro Abdelnur commented on YARN-1008: -- [~vinodkv], I don't think the change should go beyond the minicluster, for the following reason: in a real cluster there is one NM per node. That said, maybe what we should do is let AMs specify a HOST:PORT (which typically will be the DN HOST:PORT); in the case of the minicluster, we would need a mapping from DN HOST:PORT to NM HOST:PORT when processing the resource request. We should also support HOST:PORT directly, without the mapping, for cases where MiniHDFS is not there. [~ojoshi], multiple NMs register with their nodeIds, which contain HOST:PORT, so you do have multiple nodes in the minicluster. But the scheduler logic, in all schedulers, uses node.getHost() to do the scheduling; that is why you see it working fine: all nodes report the same host. The problem is that you have no control over which NM you get. The challenge is how we get this to work nicely in minicluster and real setups without disruption. MiniYARNCluster with multiple nodemanagers, all nodes have same key for allocations --- Key: YARN-1008 URL: https://issues.apache.org/jira/browse/YARN-1008 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Affects Versions: 2.1.0-beta Reporter: Alejandro Abdelnur Assignee: Alejandro Abdelnur While the NMs are keyed using the NodeId, the allocation is done based on the hostname. This makes the different nodes indistinguishable to the scheduler. There should be an option to enable host:port instead of just port for allocations. The nodes reported to the AM should report the 'key' (host or host:port). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-985) Nodemanager should log where a resource was localized
[ https://issues.apache.org/jira/browse/YARN-985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13731166#comment-13731166 ] Jonathan Eagles commented on YARN-985: -- Looks like we are all happy. Putting this in. Thanks, everybody. Nodemanager should log where a resource was localized - Key: YARN-985 URL: https://issues.apache.org/jira/browse/YARN-985 Project: Hadoop YARN Issue Type: Improvement Components: nodemanager Affects Versions: 3.0.0, 2.0.4-alpha, 0.23.9 Reporter: Ravi Prakash Assignee: Ravi Prakash Attachments: YARN-985.branch-0.23.patch, YARN-985.patch, YARN-985.patch When a resource is localized, we should log WHERE on the local disk it was localized. This helps in debugging afterwards (e.g. if the disk was to go bad). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-985) Nodemanager should log where a resource was localized
[ https://issues.apache.org/jira/browse/YARN-985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13731212#comment-13731212 ] Hudson commented on YARN-985: - SUCCESS: Integrated in Hadoop-trunk-Commit #4221 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/4221/]) YARN-985. Nodemanager should log where a resource was localized (Ravi Prakash via jeagles) (jeagles: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1511100) * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/LocalResourcesTrackerImpl.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/LocalizedResource.java Nodemanager should log where a resource was localized - Key: YARN-985 URL: https://issues.apache.org/jira/browse/YARN-985 Project: Hadoop YARN Issue Type: Improvement Components: nodemanager Affects Versions: 3.0.0, 2.0.4-alpha, 0.23.9 Reporter: Ravi Prakash Assignee: Ravi Prakash Fix For: 3.0.0, 2.3.0, 0.23.10 Attachments: YARN-985.branch-0.23.patch, YARN-985.patch, YARN-985.patch When a resource is localized, we should log WHERE on the local disk it was localized. This helps in debugging afterwards (e.g. if the disk was to go bad). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-1004) yarn.scheduler.minimum|maximum|increment-allocation-mb should have scheduler
[ https://issues.apache.org/jira/browse/YARN-1004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13731249#comment-13731249 ] Alejandro Abdelnur commented on YARN-1004: -- bq. Isn't it simpler for FS to ignore the existing configs? It is simpler, but it is not correct. it will create confusion due to misconfigurations when moving from one scheduler to another (either way). yarn.scheduler.minimum|maximum|increment-allocation-mb should have scheduler Key: YARN-1004 URL: https://issues.apache.org/jira/browse/YARN-1004 Project: Hadoop YARN Issue Type: Improvement Components: scheduler Affects Versions: 2.1.0-beta Reporter: Sandy Ryza Priority: Blocker Attachments: YARN-1004.patch As yarn.scheduler.minimum-allocation-mb is now a scheduler-specific configuration, and functions differently for the Fair and Capacity schedulers, it would be less confusing for the config names to include the scheduler names, i.e. yarn.scheduler.fair.minimum-allocation-mb, yarn.scheduler.capacity.minimum-allocation-mb, and yarn.scheduler.fifo.minimum-allocation-mb. The same goes for yarn.scheduler.increment-allocation-mb, which only exists for the Fair Scheduler, and yarn.scheduler.maximum-allocation-mb, for consistency. If we wish to preserve backwards compatibility, we can deprecate the old configs to the new ones. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (YARN-589) Expose a REST API for monitoring the fair scheduler
[ https://issues.apache.org/jira/browse/YARN-589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sandy Ryza updated YARN-589: Attachment: YARN-589-2.patch Expose a REST API for monitoring the fair scheduler --- Key: YARN-589 URL: https://issues.apache.org/jira/browse/YARN-589 Project: Hadoop YARN Issue Type: Improvement Components: scheduler Affects Versions: 2.0.3-alpha Reporter: Sandy Ryza Assignee: Sandy Ryza Attachments: fairscheduler.xml, YARN-589-1.patch, YARN-589-2.patch, YARN-589.patch The fair scheduler should have an HTTP interface that exposes information such as applications per queue, fair shares, demands, current allocations. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (YARN-1021) Yarn Scheduler Load Simulator
[ https://issues.apache.org/jira/browse/YARN-1021?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei Yan updated YARN-1021: -- Attachment: YARN-1021-images.tar.gz YARN-1021-demo.tar.gz YARN-1021.pdf YARN-1021.pdf: simulator documentation. YARN-1021-demo.tar.gz: configuration (for YARN) and data used for a demo running. YARN-1021-images.tar.gz: images used by simulator site document. Yarn Scheduler Load Simulator - Key: YARN-1021 URL: https://issues.apache.org/jira/browse/YARN-1021 Project: Hadoop YARN Issue Type: New Feature Components: scheduler Reporter: Wei Yan Assignee: Wei Yan Attachments: YARN-1021-demo.tar.gz, YARN-1021-images.tar.gz, YARN-1021.pdf The Yarn Scheduler is a fertile area of interest with different implementations, e.g., Fifo, Capacity and Fair schedulers. Meanwhile, several optimizations are also made to improve scheduler performance for different scenarios and workload. Each scheduler algorithm has its own set of features, and drives scheduling decisions by many factors, such as fairness, capacity guarantee, resource availability, etc. It is very important to evaluate a scheduler algorithm very well before we deploy it in a production cluster. Unfortunately, currently it is non-trivial to evaluate a scheduling algorithm. Evaluating in a real cluster is always time and cost consuming, and it is also very hard to find a large-enough cluster. Hence, a simulator which can predict how well a scheduler algorithm for some specific workload would be quite useful. We want to build a Scheduler Load Simulator to simulate large-scale Yarn clusters and application loads in a single machine. This would be invaluable in furthering Yarn by providing a tool for researchers and developers to prototype new scheduler features and predict their behavior and performance with reasonable amount of confidence, there-by aiding rapid innovation. The simulator will exercise the real Yarn ResourceManager removing the network factor by simulating NodeManagers and ApplicationMasters via handling and dispatching NM/AMs heartbeat events from within the same JVM. To keep tracking of scheduler behavior and performance, a scheduler wrapper will wrap the real scheduler. The simulator will produce real time metrics while executing, including: * Resource usages for whole cluster and each queue, which can be utilized to configure cluster and queue's capacity. * The detailed application execution trace (recorded in relation to simulated time), which can be analyzed to understand/validate the scheduler behavior (individual jobs turn around time, throughput, fairness, capacity guarantee, etc). * Several key metrics of scheduler algorithm, such as time cost of each scheduler operation (allocate, handle, etc), which can be utilized by Hadoop developers to find the code spots and scalability limits. The simulator will provide real time charts showing the behavior of the scheduler and its performance. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (YARN-1021) Yarn Scheduler Load Simulator
[ https://issues.apache.org/jira/browse/YARN-1021?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei Yan updated YARN-1021: -- Description: The Yarn Scheduler is a fertile area of interest with different implementations, e.g., Fifo, Capacity and Fair schedulers. Meanwhile, several optimizations are also made to improve scheduler performance for different scenarios and workload. Each scheduler algorithm has its own set of features, and drives scheduling decisions by many factors, such as fairness, capacity guarantee, resource availability, etc. It is very important to evaluate a scheduler algorithm very well before we deploy it in a production cluster. Unfortunately, currently it is non-trivial to evaluate a scheduling algorithm. Evaluating in a real cluster is always time and cost consuming, and it is also very hard to find a large-enough cluster. Hence, a simulator which can predict how well a scheduler algorithm for some specific workload would be quite useful. We want to build a Scheduler Load Simulator to simulate large-scale Yarn clusters and application loads in a single machine. This would be invaluable in furthering Yarn by providing a tool for researchers and developers to prototype new scheduler features and predict their behavior and performance with reasonable amount of confidence, there-by aiding rapid innovation. The simulator will exercise the real Yarn ResourceManager removing the network factor by simulating NodeManagers and ApplicationMasters via handling and dispatching NM/AMs heartbeat events from within the same JVM. To keep tracking of scheduler behavior and performance, a scheduler wrapper will wrap the real scheduler. The simulator will produce real time metrics while executing, including: * Resource usages for whole cluster and each queue, which can be utilized to configure cluster and queue's capacity. * The detailed application execution trace (recorded in relation to simulated time), which can be analyzed to understand/validate the scheduler behavior (individual jobs turn around time, throughput, fairness, capacity guarantee, etc). * Several key metrics of scheduler algorithm, such as time cost of each scheduler operation (allocate, handle, etc), which can be utilized by Hadoop developers to find the code spots and scalability limits. The simulator will provide real time charts showing the behavior of the scheduler and its performance. A short demo is available http://www.youtube.com/watch?v=6thLi8q0qLE, showing how to use simulator to simulate Fair Scheduler and Capacity Scheduler. was: The Yarn Scheduler is a fertile area of interest with different implementations, e.g., Fifo, Capacity and Fair schedulers. Meanwhile, several optimizations are also made to improve scheduler performance for different scenarios and workload. Each scheduler algorithm has its own set of features, and drives scheduling decisions by many factors, such as fairness, capacity guarantee, resource availability, etc. It is very important to evaluate a scheduler algorithm very well before we deploy it in a production cluster. Unfortunately, currently it is non-trivial to evaluate a scheduling algorithm. Evaluating in a real cluster is always time and cost consuming, and it is also very hard to find a large-enough cluster. Hence, a simulator which can predict how well a scheduler algorithm for some specific workload would be quite useful. We want to build a Scheduler Load Simulator to simulate large-scale Yarn clusters and application loads in a single machine. 
This would be invaluable in furthering Yarn by providing a tool for researchers and developers to prototype new scheduler features and predict their behavior and performance with reasonable amount of confidence, there-by aiding rapid innovation. The simulator will exercise the real Yarn ResourceManager removing the network factor by simulating NodeManagers and ApplicationMasters via handling and dispatching NM/AMs heartbeat events from within the same JVM. To keep tracking of scheduler behavior and performance, a scheduler wrapper will wrap the real scheduler. The simulator will produce real time metrics while executing, including: * Resource usages for whole cluster and each queue, which can be utilized to configure cluster and queue's capacity. * The detailed application execution trace (recorded in relation to simulated time), which can be analyzed to understand/validate the scheduler behavior (individual jobs turn around time, throughput, fairness, capacity guarantee, etc). * Several key metrics of scheduler algorithm, such as time cost of each scheduler operation (allocate, handle, etc), which can be utilized by Hadoop developers to find the code spots and scalability limits. The simulator will provide real time charts showing the behavior of the scheduler and its performance. Yarn Scheduler Load Simulator
[jira] [Commented] (YARN-589) Expose a REST API for monitoring the fair scheduler
[ https://issues.apache.org/jira/browse/YARN-589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13731436#comment-13731436 ] Hadoop QA commented on YARN-589: {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12596445/YARN-589-2.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/1662//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/1662//console This message is automatically generated. Expose a REST API for monitoring the fair scheduler --- Key: YARN-589 URL: https://issues.apache.org/jira/browse/YARN-589 Project: Hadoop YARN Issue Type: Improvement Components: scheduler Affects Versions: 2.0.3-alpha Reporter: Sandy Ryza Assignee: Sandy Ryza Attachments: fairscheduler.xml, YARN-589-1.patch, YARN-589-2.patch, YARN-589.patch The fair scheduler should have an HTTP interface that exposes information such as applications per queue, fair shares, demands, current allocations. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-1019) YarnConfiguration validation for local disk path and http addresses.
[ https://issues.apache.org/jira/browse/YARN-1019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13731462#comment-13731462 ] Joseph Kniest commented on YARN-1019: - Ok, so for this module YarnConfiguration: do other portions of the codebase access it for config info like directories, and do I need to find all those places? How does that information get passed to this object? Ultimately, we want to find where this object gets instantiated and ensure that it doesn't get relative paths, correct? What exactly do we want for item 2 of this issue? I'm confused about that one. YarnConfiguration validation for local disk path and http addresses. Key: YARN-1019 URL: https://issues.apache.org/jira/browse/YARN-1019 Project: Hadoop YARN Issue Type: Improvement Reporter: Omkar Vinit Joshi Priority: Minor Labels: newbie Today we are not validating certain configuration parameters set in yarn-site.xml. 1) Configurations related to paths, such as local-dirs and log-dirs: our NM crashes during startup if they are set to relative paths rather than absolute paths. To avoid such failures we can enforce checks (absolute paths) before startup, i.e. before the directory handler actually creates the directories. 2) Similarly, validate all parameters of the form hostname:port, unless we are ok with the default port. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (YARN-1036) Distributed Cache gives inconsistent result if cache files get deleted from task tracker
Ravi Prakash created YARN-1036: -- Summary: Distributed Cache gives inconsistent result if cache files get deleted from task tracker Key: YARN-1036 URL: https://issues.apache.org/jira/browse/YARN-1036 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Affects Versions: 0.23.9 Reporter: Ravi Prakash Assignee: Ravi Prakash This is a JIRA to backport MAPREDUCE-4342. I had to open a new JIRA because that one had been closed. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (YARN-1036) Distributed Cache gives inconsistent result if cache files get deleted from task tracker
[ https://issues.apache.org/jira/browse/YARN-1036?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ravi Prakash updated YARN-1036: --- Attachment: YARN-1036.branch-0.23.patch This is exactly the same patch as MAPREDUCE-4342. Distributed Cache gives inconsistent result if cache files get deleted from task tracker - Key: YARN-1036 URL: https://issues.apache.org/jira/browse/YARN-1036 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Affects Versions: 0.23.9 Reporter: Ravi Prakash Assignee: Ravi Prakash Attachments: YARN-1036.branch-0.23.patch This is a JIRA to backport MAPREDUCE-4342. I had to open a new JIRA because that one had been closed. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Assigned] (YARN-1010) FairScheduler: decouple container scheduling from nodemanager heartbeats
[ https://issues.apache.org/jira/browse/YARN-1010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei Yan reassigned YARN-1010: - Assignee: Wei Yan (was: Alejandro Abdelnur) FairScheduler: decouple container scheduling from nodemanager heartbeats Key: YARN-1010 URL: https://issues.apache.org/jira/browse/YARN-1010 Project: Hadoop YARN Issue Type: Improvement Components: scheduler Affects Versions: 2.1.0-beta Reporter: Alejandro Abdelnur Assignee: Wei Yan Priority: Critical Currently, scheduling for a node is done when that node heartbeats. For large clusters where the heartbeat interval is set to several seconds, this delays scheduling of incoming allocations significantly. We could instead have a continuous loop that scans all nodes and does scheduling; if resources are available, AMs will get the allocation in the next heartbeat after the one that placed the request. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
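As a hedged sketch of the continuous-scheduling idea described in YARN-1010 above (not the actual FairScheduler patch), the loop below scans every registered node on a fixed interval and asks the scheduler to place pending requests. Node, Scheduler, and attemptScheduling are placeholder names invented for this example, not real YARN classes.
{code}
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Illustrative only: a background loop that attempts scheduling on every node
// instead of waiting for that node's heartbeat.
public class ContinuousSchedulingLoop implements Runnable {
  interface Node { }                      // placeholder for a scheduler node
  interface Scheduler {                   // placeholder for the real scheduler
    void attemptScheduling(Node node);    // hypothetical per-node scheduling hook
  }

  private final List<Node> nodes = new CopyOnWriteArrayList<>();
  private final Scheduler scheduler;
  private final long intervalMs;
  private volatile boolean running = true;

  ContinuousSchedulingLoop(Scheduler scheduler, long intervalMs) {
    this.scheduler = scheduler;
    this.intervalMs = intervalMs;
  }

  public void addNode(Node node) { nodes.add(node); }
  public void stop() { running = false; }

  @Override
  public void run() {
    while (running) {
      // Scan all nodes and try to place pending requests, independent of heartbeats.
      for (Node node : nodes) {
        scheduler.attemptScheduling(node);
      }
      try {
        Thread.sleep(intervalMs);         // short pause between scan passes
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        running = false;
      }
    }
  }
}
{code}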
[jira] [Commented] (YARN-1036) Distributed Cache gives inconsistent result if cache files get deleted from task tracker
[ https://issues.apache.org/jira/browse/YARN-1036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13731470#comment-13731470 ] Hadoop QA commented on YARN-1036: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12596459/YARN-1036.branch-0.23.patch against trunk revision . {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-YARN-Build/1664//console This message is automatically generated. Distributed Cache gives inconsistent result if cache files get deleted from task tracker - Key: YARN-1036 URL: https://issues.apache.org/jira/browse/YARN-1036 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Affects Versions: 0.23.9 Reporter: Ravi Prakash Assignee: Ravi Prakash Attachments: YARN-1036.branch-0.23.patch This is a JIRA to backport MAPREDUCE-4342. I had to open a new JIRA because that one had been closed. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-1021) Yarn Scheduler Load Simulator
[ https://issues.apache.org/jira/browse/YARN-1021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13731474#comment-13731474 ] Hadoop QA commented on YARN-1021: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12596449/YARN-1021.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 4 new or modified test files. {color:red}-1 javac{color}. The applied patch generated 1163 javac compiler warnings (more than the trunk's current 1147 warnings). {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:red}-1 findbugs{color}. The patch appears to introduce 28 new Findbugs (version 1.3.9) warnings. {color:red}-1 release audit{color}. The applied patch generated 7 release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-assemblies hadoop-tools/hadoop-sls hadoop-tools/hadoop-tools-dist. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/1663//testReport/ Release audit warnings: https://builds.apache.org/job/PreCommit-YARN-Build/1663//artifact/trunk/patchprocess/patchReleaseAuditProblems.txt Findbugs warnings: https://builds.apache.org/job/PreCommit-YARN-Build/1663//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-sls.html Javac warnings: https://builds.apache.org/job/PreCommit-YARN-Build/1663//artifact/trunk/patchprocess/diffJavacWarnings.txt Console output: https://builds.apache.org/job/PreCommit-YARN-Build/1663//console This message is automatically generated. Yarn Scheduler Load Simulator - Key: YARN-1021 URL: https://issues.apache.org/jira/browse/YARN-1021 Project: Hadoop YARN Issue Type: New Feature Components: scheduler Reporter: Wei Yan Assignee: Wei Yan Attachments: YARN-1021-demo.tar.gz, YARN-1021-images.tar.gz, YARN-1021.patch, YARN-1021.pdf The Yarn Scheduler is a fertile area of interest with different implementations, e.g., the Fifo, Capacity and Fair schedulers. Meanwhile, several optimizations are also made to improve scheduler performance for different scenarios and workloads. Each scheduler algorithm has its own set of features, and drives scheduling decisions by many factors, such as fairness, capacity guarantee, resource availability, etc. It is very important to evaluate a scheduler algorithm thoroughly before we deploy it in a production cluster. Unfortunately, it is currently non-trivial to evaluate a scheduling algorithm: evaluating in a real cluster is always time- and cost-consuming, and it is also very hard to find a large-enough cluster. Hence, a simulator which can predict how well a scheduler algorithm works for some specific workload would be quite useful. We want to build a Scheduler Load Simulator to simulate large-scale Yarn clusters and application loads on a single machine. This would be invaluable in furthering Yarn by providing a tool for researchers and developers to prototype new scheduler features and predict their behavior and performance with a reasonable amount of confidence, thereby aiding rapid innovation.
The simulator will exercise the real Yarn ResourceManager while removing the network factor, by simulating NodeManagers and ApplicationMasters and handling and dispatching NM/AM heartbeat events from within the same JVM. To keep track of scheduler behavior and performance, a scheduler wrapper will wrap the real scheduler. The simulator will produce real-time metrics while executing, including: * Resource usage for the whole cluster and each queue, which can be used to configure cluster and queue capacity. * The detailed application execution trace (recorded in relation to simulated time), which can be analyzed to understand/validate the scheduler behavior (individual jobs' turnaround time, throughput, fairness, capacity guarantee, etc.). * Several key metrics of the scheduler algorithm, such as the time cost of each scheduler operation (allocate, handle, etc.), which can be used by Hadoop developers to find hot spots and scalability limits. The simulator will provide real-time charts showing the behavior of the scheduler and its performance. A short demo is available at http://www.youtube.com/watch?v=6thLi8q0qLE, showing how to use the simulator to simulate the Fair Scheduler and the Capacity Scheduler. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
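As a rough illustration of the "scheduler wrapper" mentioned in the YARN-1021 description above, the sketch below delegates a scheduler operation to the real implementation while recording its time cost; the class and method names are invented for this example and are not the actual SLS code.
{code}
import java.util.concurrent.atomic.AtomicLong;

// Sketch of the wrapper idea: run the real operation, accumulate simple
// latency metrics. The Operation interface stands in for real scheduler
// calls such as allocate() or handle().
public class TimedSchedulerWrapper {
  public interface Operation<T> {
    T invoke();
  }

  private final AtomicLong allocateCount = new AtomicLong();
  private final AtomicLong allocateTotalNanos = new AtomicLong();

  // Times one scheduler operation and accumulates the metrics.
  public <T> T timeAllocate(Operation<T> op) {
    long start = System.nanoTime();
    try {
      return op.invoke();
    } finally {
      allocateCount.incrementAndGet();
      allocateTotalNanos.addAndGet(System.nanoTime() - start);
    }
  }

  // Average allocate latency in milliseconds, one of the real-time metrics.
  public double avgAllocateMillis() {
    long n = allocateCount.get();
    return n == 0 ? 0.0 : (allocateTotalNanos.get() / 1e6) / n;
  }
}
{code}
The same pattern can wrap handle() and other operations, feeding the averages into the real-time charts the description mentions.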
[jira] [Commented] (YARN-1036) Distributed Cache gives inconsistent result if cache files get deleted from task tracker
[ https://issues.apache.org/jira/browse/YARN-1036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13731473#comment-13731473 ] Omkar Vinit Joshi commented on YARN-1036: - Thanks [~raviprak]. Probably we need to isolate the logic for the LOCALIZED and REQUEST scenarios? Thoughts? {code} + if (rsrc != null && (!isResourcePresent(rsrc))) { + LOG.info("Resource " + rsrc.getLocalPath() + " is missing, localizing it again"); + localrsrc.remove(req); + rsrc = null; + } {code} This code does not need to be executed when a resource is getting LOCALIZED; in trunk we have isolated the two cases. Since branch-0.23 doesn't have anything like LocalCacheDirectoryManager, it probably makes sense to just keep the break and do nothing when it is LOCALIZED? {code} case LOCALIZED: break; case REQUEST: + if (rsrc != null && (!isResourcePresent(rsrc))) { + LOG.info("Resource " + rsrc.getLocalPath() + " is missing, localizing it again"); + localrsrc.remove(req); + rsrc = null; + } {code} I didn't review the test code. Distributed Cache gives inconsistent result if cache files get deleted from task tracker - Key: YARN-1036 URL: https://issues.apache.org/jira/browse/YARN-1036 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Affects Versions: 0.23.9 Reporter: Ravi Prakash Assignee: Ravi Prakash Attachments: YARN-1036.branch-0.23.patch This is a JIRA to backport MAPREDUCE-4342. I had to open a new JIRA because that one had been closed. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
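For readers unfamiliar with the isResourcePresent call referenced in the YARN-1036 snippets above, the following minimal sketch shows what such a check typically boils down to: verify that the previously localized file still exists on local disk before reusing the cache entry. This is an assumption about the helper's intent, not the branch-0.23 code itself.
{code}
import java.io.File;
import org.apache.hadoop.fs.Path;

// Hedged sketch: a missing localized file means the resource must be
// localized again, which is what the patch above triggers.
public final class ResourcePresenceCheck {
  private ResourcePresenceCheck() { }

  public static boolean isResourcePresent(Path localPath) {
    if (localPath == null) {
      return false;                     // never localized, nothing to verify
    }
    File onDisk = new File(localPath.toUri().getPath());
    return onDisk.exists();             // missing file => localize again
  }
}
{code}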
[jira] [Updated] (YARN-353) Add Zookeeper-based store implementation for RMStateStore
[ https://issues.apache.org/jira/browse/YARN-353?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla updated YARN-353: -- Attachment: YARN-353.11.patch Manually inspected the fields findbugs is complaining about - I don't see any particular issues or additional need for synchronization. Uploading a patch that adds exclusions for the two fields in question. Add Zookeeper-based store implementation for RMStateStore - Key: YARN-353 URL: https://issues.apache.org/jira/browse/YARN-353 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Reporter: Hitesh Shah Assignee: Bikas Saha Attachments: YARN-353.10.patch, YARN-353.11.patch, YARN-353.1.patch, YARN-353.2.patch, YARN-353.3.patch, YARN-353.4.patch, YARN-353.5.patch, YARN-353.6.patch, YARN-353.7.patch, YARN-353.8.patch, YARN-353.9.patch Add a store that writes RM state data to ZK -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-353) Add Zookeeper-based store implementation for RMStateStore
[ https://issues.apache.org/jira/browse/YARN-353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13731480#comment-13731480 ] Karthik Kambatla commented on YARN-353: --- YARN-353.11.patch is the patch with findbugs exclusions. Add Zookeeper-based store implementation for RMStateStore - Key: YARN-353 URL: https://issues.apache.org/jira/browse/YARN-353 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Reporter: Hitesh Shah Assignee: Bikas Saha Attachments: YARN-353.10.patch, YARN-353.11.patch, YARN-353.1.patch, YARN-353.2.patch, YARN-353.3.patch, YARN-353.4.patch, YARN-353.5.patch, YARN-353.6.patch, YARN-353.7.patch, YARN-353.8.patch, YARN-353.9.patch Add a store that writes RM state data to ZK -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-353) Add Zookeeper-based store implementation for RMStateStore
[ https://issues.apache.org/jira/browse/YARN-353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13731528#comment-13731528 ] Hadoop QA commented on YARN-353: {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12596465/YARN-353.11.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-common-project/hadoop-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/1665//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/1665//console This message is automatically generated. Add Zookeeper-based store implementation for RMStateStore - Key: YARN-353 URL: https://issues.apache.org/jira/browse/YARN-353 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Reporter: Hitesh Shah Assignee: Bikas Saha Attachments: YARN-353.10.patch, YARN-353.11.patch, YARN-353.1.patch, YARN-353.2.patch, YARN-353.3.patch, YARN-353.4.patch, YARN-353.5.patch, YARN-353.6.patch, YARN-353.7.patch, YARN-353.8.patch, YARN-353.9.patch Add a store that writes RM state data to ZK -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-1019) YarnConfiguration validation for local disk path and http addresses.
[ https://issues.apache.org/jira/browse/YARN-1019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13731551#comment-13731551 ] Omkar Vinit Joshi commented on YARN-1019: - [~josephkniest] To give you more insight into how it is used (general configuration reading in Hadoop): bq. Ok, so for this module YarnConfiguration: do other portions of the codebase access it for config info like directories, and do I need to find all those places? Probably not. If you are using Eclipse for Hadoop development, just open the call hierarchy for the variable under consideration, say YarnConfiguration#RM_ADDRESS. You will see where it is used, which narrows your search. You can ignore places where it is used inside test code; you don't need to validate test code, but you will have to add a unit test case later to verify your changes. bq. How does that information get passed to this object? You probably don't need to worry about this. You can trace the {code} new YarnConfiguration() {code} call. It reads from the configuration files: yarn-site.xml for YARN, hdfs-site.xml for HDFS, core-site.xml for core. bq. Ultimately, we want to find where this object gets instantiated and ensure that it doesn't get relative paths, correct? Yes, for all the places where we read file paths we need to ensure this. Make sure it is not OS specific, i.e. it works on Windows/Linux/Mac. bq. What exactly do we want for item 2 of this issue? I'm confused about that one. When we expect, for example, RM_ADDRESS, we expect it to be host:port; just validate that. Finally, once you have made the changes, create a patch and upload it via More Actions -- Attach Files and then Submit Patch. YarnConfiguration validation for local disk path and http addresses. Key: YARN-1019 URL: https://issues.apache.org/jira/browse/YARN-1019 Project: Hadoop YARN Issue Type: Improvement Reporter: Omkar Vinit Joshi Priority: Minor Labels: newbie Today we are not validating certain configuration parameters set in yarn-site.xml. 1) Configurations related to paths, such as local-dirs and log-dirs: our NM crashes during startup if they are set to relative paths rather than absolute paths. To avoid such failures we can enforce checks (absolute paths) before startup, i.e. before the directory handler actually creates the directories. 2) Similarly, validate all parameters of the form hostname:port, unless we are ok with the default port. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
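To make the YARN-1019 guidance above concrete, here is a minimal sketch of the two startup-time checks being discussed: absolute-path validation for local-dirs and host:port validation for an address such as RM_ADDRESS. The class and method names are illustrative only; the eventual patch may hook validation in differently.
{code}
import java.io.File;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.net.NetUtils;
import org.apache.hadoop.yarn.conf.YarnConfiguration;

// Sketch only: validate a loaded configuration before the daemons start.
public class YarnConfigSanityCheck {
  public static void validate(Configuration conf) {
    // 1) Path-type settings such as local-dirs/log-dirs must be absolute.
    for (String dir : conf.getTrimmedStrings(YarnConfiguration.NM_LOCAL_DIRS)) {
      if (!new File(dir).isAbsolute()) {
        throw new IllegalArgumentException(
            YarnConfiguration.NM_LOCAL_DIRS + " must use absolute paths: " + dir);
      }
    }
    // 2) host:port settings must parse to a usable socket address.
    String rmAddress = conf.get(YarnConfiguration.RM_ADDRESS,
        YarnConfiguration.DEFAULT_RM_ADDRESS);
    try {
      NetUtils.createSocketAddr(rmAddress);
    } catch (IllegalArgumentException e) {
      throw new IllegalArgumentException(
          YarnConfiguration.RM_ADDRESS + " is not a valid host:port: " + rmAddress, e);
    }
  }

  public static void main(String[] args) {
    validate(new YarnConfiguration());    // reads yarn-site.xml as described above
  }
}
{code}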
[jira] [Created] (YARN-1037) Create a helper function to create a local resource object given a path to file
Hitesh Shah created YARN-1037: - Summary: Create a helper function to create a local resource object given a path to file Key: YARN-1037 URL: https://issues.apache.org/jira/browse/YARN-1037 Project: Hadoop YARN Issue Type: Bug Reporter: Hitesh Shah A helper function that, given either a qualified or non-qualified path, constructs a local resource object. It should be available in one of the client library layers for developers to write against. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
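One plausible shape for the YARN-1037 helper described above, sketched under the assumption that it qualifies the path against the default FileSystem and fills in a LocalResource record; the class and method names are invented here, and a real client-library version would likely let callers choose the resource type and visibility.
{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.yarn.api.records.LocalResource;
import org.apache.hadoop.yarn.api.records.LocalResourceType;
import org.apache.hadoop.yarn.api.records.LocalResourceVisibility;
import org.apache.hadoop.yarn.util.ConverterUtils;
import org.apache.hadoop.yarn.util.Records;

// Sketch of a helper that turns a (possibly non-qualified) path into a
// LocalResource suitable for a container launch context.
public final class LocalResourceHelper {
  private LocalResourceHelper() { }

  public static LocalResource fromPath(Configuration conf, Path file)
      throws java.io.IOException {
    FileSystem fs = FileSystem.get(conf);
    Path qualified = fs.makeQualified(file);     // handles non-qualified input
    FileStatus stat = fs.getFileStatus(qualified);

    LocalResource rsrc = Records.newRecord(LocalResource.class);
    rsrc.setResource(ConverterUtils.getYarnUrlFromPath(qualified));
    rsrc.setSize(stat.getLen());
    rsrc.setTimestamp(stat.getModificationTime());
    rsrc.setType(LocalResourceType.FILE);
    rsrc.setVisibility(LocalResourceVisibility.APPLICATION);
    return rsrc;
  }
}
{code}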
[jira] [Commented] (YARN-899) Get queue administration ACLs working
[ https://issues.apache.org/jira/browse/YARN-899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13731602#comment-13731602 ] Xuan Gong commented on YARN-899: Create a QueueACLsManager to store the mapping from ApplicationId to CSQueue. Whenever users try to get an application report, list applications, or kill applications through the command line, web service, or UI, the QueueACLsManager will check the user's permission. Get queue administration ACLs working - Key: YARN-899 URL: https://issues.apache.org/jira/browse/YARN-899 Project: Hadoop YARN Issue Type: Bug Components: scheduler Affects Versions: 2.1.0-beta Reporter: Sandy Ryza Assignee: Xuan Gong Attachments: YARN-899.1.patch The Capacity Scheduler documents the yarn.scheduler.capacity.root.queue-path.acl_administer_queue config option for controlling who can administer a queue, but it is not hooked up to anything. The Fair Scheduler could make use of a similar option as well. This is a feature-parity regression from MR1. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
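As a hedged sketch of the QueueACLsManager idea in the comment above (not the YARN-899 patch itself), the class below remembers which queue each application belongs to and consults a scheduler-provided ACL check before report/list/kill requests are served; the AclChecker interface is a stand-in for the real scheduler's queue ACL check, which the actual patch would call instead.
{code}
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import org.apache.hadoop.security.UserGroupInformation;
import org.apache.hadoop.yarn.api.records.ApplicationId;
import org.apache.hadoop.yarn.api.records.QueueACL;

// Sketch only: track app -> queue and delegate ACL decisions to the scheduler.
public class QueueACLsManagerSketch {
  public interface AclChecker {
    boolean checkAccess(UserGroupInformation user, QueueACL acl, String queue);
  }

  private final Map<ApplicationId, String> appToQueue = new ConcurrentHashMap<>();
  private final AclChecker scheduler;

  public QueueACLsManagerSketch(AclChecker scheduler) {
    this.scheduler = scheduler;
  }

  // Record the queue at submission time.
  public void addApplication(ApplicationId appId, String queue) {
    appToQueue.put(appId, queue);
  }

  // Called from getApplicationReport/list/kill paths before answering.
  public boolean checkAccess(UserGroupInformation caller, QueueACL acl,
      ApplicationId appId) {
    String queue = appToQueue.get(appId);
    // Unknown application: deny here and let the caller surface an error.
    return queue != null && scheduler.checkAccess(caller, acl, queue);
  }
}
{code}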
[jira] [Updated] (YARN-899) Get queue administration ACLs working
[ https://issues.apache.org/jira/browse/YARN-899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuan Gong updated YARN-899: --- Attachment: YARN-899.1.patch Get queue administration ACLs working - Key: YARN-899 URL: https://issues.apache.org/jira/browse/YARN-899 Project: Hadoop YARN Issue Type: Bug Components: scheduler Affects Versions: 2.1.0-beta Reporter: Sandy Ryza Assignee: Xuan Gong Attachments: YARN-899.1.patch The Capacity Scheduler documents the yarn.scheduler.capacity.root.queue-path.acl_administer_queue config option for controlling who can administer a queue, but it is not hooked up to anything. The Fair Scheduler could make use of a similar option as well. This is a feature-parity regression from MR1. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira