[jira] [Created] (YARN-1023) [YARN-321] Weservices REST API's support for Application History
Devaraj K created YARN-1023: --- Summary: [YARN-321] Weservices REST API's support for Application History Key: YARN-1023 URL: https://issues.apache.org/jira/browse/YARN-1023 Project: Hadoop YARN Issue Type: Sub-task Affects Versions: YARN-321 Reporter: Devaraj K Assignee: Devaraj K -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-994) HeartBeat thread in AMRMClientAsync does not handle runtime exception correctly
[ https://issues.apache.org/jira/browse/YARN-994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13728847#comment-13728847 ] Hudson commented on YARN-994: - SUCCESS: Integrated in Hadoop-Yarn-trunk #291 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/291/]) YARN-994. HeartBeat thread in AMRMClientAsync does not handle runtime exception correctly (Xuan Gong via bikas) (bikas: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1510070) * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/api/async/impl/AMRMClientAsyncImpl.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/api/impl/AMRMClientImpl.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/api/async/impl/TestAMRMClientAsync.java HeartBeat thread in AMRMClientAsync does not handle runtime exception correctly --- Key: YARN-994 URL: https://issues.apache.org/jira/browse/YARN-994 Project: Hadoop YARN Issue Type: Bug Reporter: Xuan Gong Assignee: Xuan Gong Attachments: YARN-994.1.patch, YARN-994.2.patch, YARN-994.3.patch YARN-654 performs sanity checks for parameters of public methods in AMRMClient. Those may create runtime exception. Currently, heartBeat thread in AMRMClientAsync only captures IOException and YarnException, and will not handle Runtime Exception properly. Possible solution can be: heartbeat thread will catch throwable and notify the callbackhandler thread via existing savedException -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-994) HeartBeat thread in AMRMClientAsync does not handle runtime exception correctly
[ https://issues.apache.org/jira/browse/YARN-994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13728892#comment-13728892 ] Hudson commented on YARN-994: - SUCCESS: Integrated in Hadoop-Mapreduce-trunk #1508 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1508/]) YARN-994. HeartBeat thread in AMRMClientAsync does not handle runtime exception correctly (Xuan Gong via bikas) (bikas: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1510070) * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/api/async/impl/AMRMClientAsyncImpl.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/api/impl/AMRMClientImpl.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/api/async/impl/TestAMRMClientAsync.java HeartBeat thread in AMRMClientAsync does not handle runtime exception correctly --- Key: YARN-994 URL: https://issues.apache.org/jira/browse/YARN-994 Project: Hadoop YARN Issue Type: Bug Reporter: Xuan Gong Assignee: Xuan Gong Attachments: YARN-994.1.patch, YARN-994.2.patch, YARN-994.3.patch YARN-654 performs sanity checks for parameters of public methods in AMRMClient. Those may create runtime exception. Currently, heartBeat thread in AMRMClientAsync only captures IOException and YarnException, and will not handle Runtime Exception properly. Possible solution can be: heartbeat thread will catch throwable and notify the callbackhandler thread via existing savedException -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (YARN-1024) Define a virtual core unambigiously
Arun C Murthy created YARN-1024: --- Summary: Define a virtual core unambigiously Key: YARN-1024 URL: https://issues.apache.org/jira/browse/YARN-1024 Project: Hadoop YARN Issue Type: Improvement Reporter: Arun C Murthy Assignee: Arun C Murthy We need to clearly define the meaning of a virtual core unambiguously so that it's easy to migrate applications between clusters. For e.g. here is Amazon EC2 definition of ECU: http://aws.amazon.com/ec2/faqs/#What_is_an_EC2_Compute_Unit_and_why_did_you_introduce_it We can use ECU itself: *One EC2 Compute Unit provides the equivalent CPU capacity of a 1.0-1.2 GHz 2007 Opteron or 2007 Xeon processor.* -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (YARN-1025) NodeManager does not propagate java.library.path to launched child containers on Windows
Chris Nauroth created YARN-1025: --- Summary: NodeManager does not propagate java.library.path to launched child containers on Windows Key: YARN-1025 URL: https://issues.apache.org/jira/browse/YARN-1025 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Affects Versions: 3.0.0, 2.1.1-beta Reporter: Chris Nauroth Neither the NodeManager process itself nor the child container processes that it launches have the correct setting for java.library.path on Windows. This prevents the processes from loading native code from hadoop.dll. The native code is required for correct functioning on Windows (not optional), so this ultimately can cause failures in MapReduce jobs. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-1024) Define a virtual core unambigiously
[ https://issues.apache.org/jira/browse/YARN-1024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13728952#comment-13728952 ] Arun C Murthy commented on YARN-1024: - We need to push on YARN-160 and normalize to YARN Virtual Core (YVC) or ECU itself. Define a virtual core unambigiously --- Key: YARN-1024 URL: https://issues.apache.org/jira/browse/YARN-1024 Project: Hadoop YARN Issue Type: Improvement Reporter: Arun C Murthy Assignee: Arun C Murthy We need to clearly define the meaning of a virtual core unambiguously so that it's easy to migrate applications between clusters. For e.g. here is Amazon EC2 definition of ECU: http://aws.amazon.com/ec2/faqs/#What_is_an_EC2_Compute_Unit_and_why_did_you_introduce_it We can use ECU itself: *One EC2 Compute Unit provides the equivalent CPU capacity of a 1.0-1.2 GHz 2007 Opteron or 2007 Xeon processor.* -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (YARN-1024) Define a virtual core unambigiously
[ https://issues.apache.org/jira/browse/YARN-1024?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated YARN-1024: Description: We need to clearly define the meaning of a virtual core unambiguously so that it's easy to migrate applications between clusters. For e.g. here is Amazon EC2 definition of ECU: http://aws.amazon.com/ec2/faqs/#What_is_an_EC2_Compute_Unit_and_why_did_you_introduce_it Essentially we need to clearly define a YARN Virtual Core (YVC). Equivalently, we can use ECU itself: *One EC2 Compute Unit provides the equivalent CPU capacity of a 1.0-1.2 GHz 2007 Opteron or 2007 Xeon processor.* was: We need to clearly define the meaning of a virtual core unambiguously so that it's easy to migrate applications between clusters. For e.g. here is Amazon EC2 definition of ECU: http://aws.amazon.com/ec2/faqs/#What_is_an_EC2_Compute_Unit_and_why_did_you_introduce_it We can use ECU itself: *One EC2 Compute Unit provides the equivalent CPU capacity of a 1.0-1.2 GHz 2007 Opteron or 2007 Xeon processor.* Define a virtual core unambigiously --- Key: YARN-1024 URL: https://issues.apache.org/jira/browse/YARN-1024 Project: Hadoop YARN Issue Type: Improvement Reporter: Arun C Murthy Assignee: Arun C Murthy We need to clearly define the meaning of a virtual core unambiguously so that it's easy to migrate applications between clusters. For e.g. here is Amazon EC2 definition of ECU: http://aws.amazon.com/ec2/faqs/#What_is_an_EC2_Compute_Unit_and_why_did_you_introduce_it Essentially we need to clearly define a YARN Virtual Core (YVC). Equivalently, we can use ECU itself: *One EC2 Compute Unit provides the equivalent CPU capacity of a 1.0-1.2 GHz 2007 Opteron or 2007 Xeon processor.* -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-160) nodemanagers should obtain cpu/memory values from underlying OS
[ https://issues.apache.org/jira/browse/YARN-160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13728954#comment-13728954 ] Arun C Murthy commented on YARN-160: I'd like to push to get this done asap. [~t.st.clair] thanks for pointer to hwloc. In your opinion, should we directly use hwloc in YARN rather than invent our own? What has been your experience with hwloc? It does have an appropriate license (BSD - http://www.open-mpi.org/projects/hwloc/license.php) ... nodemanagers should obtain cpu/memory values from underlying OS --- Key: YARN-160 URL: https://issues.apache.org/jira/browse/YARN-160 Project: Hadoop YARN Issue Type: Improvement Components: nodemanager Affects Versions: 2.0.3-alpha Reporter: Alejandro Abdelnur Assignee: Alejandro Abdelnur Fix For: 2.1.0-beta As mentioned in YARN-2 *NM memory and CPU configs* Currently these values are coming from the config of the NM, we should be able to obtain those values from the OS (ie, in the case of Linux from /proc/meminfo /proc/cpuinfo). As this is highly OS dependent we should have an interface that obtains this information. In addition implementations of this interface should be able to specify a mem/cpu offset (amount of mem/cpu not to be avail as YARN resource), this would allow to reserve mem/cpu for the OS and other services outside of YARN containers. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-1025) NodeManager does not propagate java.library.path to launched child containers on Windows
[ https://issues.apache.org/jira/browse/YARN-1025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13728957#comment-13728957 ] Chris Nauroth commented on YARN-1025: - As a workaround, you can set the PATH environment variable to point to the directory containing hadoop.dll before launching NodeManager. The environment variable will propagate to child container processes, so this works. I have a question on how this is expected to work on Linux. I see code in the {{yarn}} shell script and {{LinuxContainerExecutor}} that propagates java.library.path to children, but I also see checks in {{YARNRunner#warnForJavaLibPath}} that warn if a caller is attempting to control java.library.path and state that the caller should set LD_LIBRARY_PATH instead. What is the expected way for this to work on Linux? If the expectation is for the caller to set LD_LIBRARY_PATH, then this issue is likely a Won't Fix. Setting PATH on Windows is roughly equivalent to setting LD_LIBRARY_PATH on Linux. I'm already using this technique successfully as a workaround. If the expectation is that the shell scripts set up java.library.path, and NodeManager propagates that to its children, then we do have a bug on Windows and need to make some code changes. The first part is changing {{yarn.cmd}} to set java.library.path. That's a small easy change that gets NodeManager to load hadoop.dll. To cover the child processes, we'll need additional code changes, probably in {{DefaultContainerExecutor}}. NodeManager does not propagate java.library.path to launched child containers on Windows Key: YARN-1025 URL: https://issues.apache.org/jira/browse/YARN-1025 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Affects Versions: 3.0.0, 2.1.1-beta Reporter: Chris Nauroth Neither the NodeManager process itself nor the child container processes that it launches have the correct setting for java.library.path on Windows. This prevents the processes from loading native code from hadoop.dll. The native code is required for correct functioning on Windows (not optional), so this ultimately can cause failures in MapReduce jobs. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-353) Add Zookeeper-based store implementation for RMStateStore
[ https://issues.apache.org/jira/browse/YARN-353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13729051#comment-13729051 ] Jian He commented on YARN-353: -- The findbug -1 might be related to the test case is non-synchronously referencing 'zkClient' and 'zkSessionTimeout', which may not be an issue. Add Zookeeper-based store implementation for RMStateStore - Key: YARN-353 URL: https://issues.apache.org/jira/browse/YARN-353 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Reporter: Hitesh Shah Assignee: Bikas Saha Attachments: YARN-353.10.patch, YARN-353.1.patch, YARN-353.2.patch, YARN-353.3.patch, YARN-353.4.patch, YARN-353.5.patch, YARN-353.6.patch, YARN-353.7.patch, YARN-353.8.patch, YARN-353.9.patch Add store that write RM state data to ZK -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (YARN-1027) Implement RMHAServiceProtocol
[ https://issues.apache.org/jira/browse/YARN-1027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] nemon lou updated YARN-1027: Assignee: nemon lou Implement RMHAServiceProtocol - Key: YARN-1027 URL: https://issues.apache.org/jira/browse/YARN-1027 Project: Hadoop YARN Issue Type: Sub-task Reporter: Bikas Saha Assignee: nemon lou Implement existing HAServiceProtocol from Hadoop common. This protocol is the single point of interaction between the RM and HA clients/services. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira