[jira] [Commented] (YARN-1025) ResourceManager and NodeManager do not load native libraries on Windows.
[ https://issues.apache.org/jira/browse/YARN-1025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13762770#comment-13762770 ]

Chuan Liu commented on YARN-1025:
---------------------------------

+1. Thanks for the patch, Chris! Just to add some of my observations: we don't need to set this for the mapred, hdfs, and hadoop cmd script files because they all use the HADOOP_OPTS environment variable, which is already set to include JAVA_LIBRARY_PATH in hadoop-config.cmd. This patch also matches the Linux behavior -- we explicitly set JAVA_LIBRARY_PATH in the Linux yarn shell script as well.

                 Key: YARN-1025
                 URL: https://issues.apache.org/jira/browse/YARN-1025
             Project: Hadoop YARN
          Issue Type: Bug
          Components: nodemanager, resourcemanager
    Affects Versions: 3.0.0, 2.1.1-beta
            Reporter: Chris Nauroth
         Attachments: YARN-1025.1.patch

ResourceManager and NodeManager do not have the correct setting for java.library.path when launched on Windows. This prevents the processes from loading native code from hadoop.dll. The native code is required for correct functioning on Windows (not optional), so this ultimately can cause failures.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira
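The fix itself is a script change (yarn.cmd gains JAVA_LIBRARY_PATH, mirroring the Linux yarn script). As an illustration of the failure mode only, here is a small standalone sketch, not part of the patch and with an invented helper name, showing how a JVM behaves when a required native library cannot be found on java.library.path:

```java
public final class NativeCheck {
    // Returns true if the named native library can be loaded from
    // java.library.path, false otherwise (no error escapes the call).
    public static boolean tryLoad(String lib) {
        try {
            System.loadLibrary(lib);
            return true;
        } catch (UnsatisfiedLinkError e) {
            // This is the symptom when yarn.cmd does not extend
            // java.library.path to cover the directory holding hadoop.dll.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("java.library.path = "
                + System.getProperty("java.library.path"));
        System.out.println("native lib loadable: " + tryLoad("hadoop"));
    }
}
```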
[jira] [Commented] (YARN-1042) add ability to specify affinity/anti-affinity in container requests
[ https://issues.apache.org/jira/browse/YARN-1042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13762779#comment-13762779 ]

Junping Du commented on YARN-1042:
----------------------------------

Attaching a patch to showcase the above proposal. This is only a demo patch and doesn't include any unit tests so far. I think there are several open questions to settle before we move on to the next step:

1. Is the affinity/anti-affinity rule bidirectional or not? If A.affinity(B) is true, is B.affinity(A) always true? I guess not, since A may prefer a list of nodes, which makes the relationship non-symmetric. That is also how we can distinguish "A prefers to live with B and C" from "A prefers to live with B or C", isn't it?
2. Which rule takes priority when an affinity rule conflicts with an anti-affinity rule? In the demo patch the affinity rule has higher priority, but I am not sure this is right in real cases. Do we want to make it configurable, or simply let a rule added later override an earlier conflicting one?
3. Currently, affinity/anti-affinity is only considered at the node level; do we want to expand it to other levels, e.g. the rack level, in the future?
4. The API currently adds a list of taskIds as affinity/anti-affinity tasks. Is that easy to consume from the application perspective?
5. The affinity/anti-affinity rules are *must*-conform rules in the current implementation, which may cause tasks to starve for a long time; should we consider a more relaxed rule?

Comments welcome. Thanks!

                 Key: YARN-1042
                 URL: https://issues.apache.org/jira/browse/YARN-1042
             Project: Hadoop YARN
          Issue Type: New Feature
          Components: resourcemanager
    Affects Versions: 3.0.0
            Reporter: Steve Loughran
            Assignee: Junping Du
         Attachments: YARN-1042-demo.patch

Container requests to the AM should be able to request anti-affinity to ensure that things like Region Servers don't come up on the same failure zones. Similarly, you may want to specify affinity to the same host or rack without specifying which specific host/rack. Example: bringing up a small Giraph cluster in a large YARN cluster would benefit from having the processes in the same rack purely for bandwidth reasons.
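Junping's open questions can be made concrete with a toy model. The sketch below is purely illustrative and is not the API in YARN-1042-demo.patch: the relation is directional (question 1: each task keeps its own preferences, so A.affinity(B) says nothing about B), and conflicts resolve by letting the later rule win, one of the options raised in question 2.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical per-task placement preferences; all names are illustrative.
public final class PlacementRules {
    private final Set<String> affine = new HashSet<>();
    private final Set<String> antiAffine = new HashSet<>();

    // Later rules override earlier conflicting ones: adding an affinity
    // for a task drops any prior anti-affinity for the same task.
    public void addAffinity(String taskId) {
        antiAffine.remove(taskId);
        affine.add(taskId);
    }

    public void addAntiAffinity(String taskId) {
        affine.remove(taskId);
        antiAffine.add(taskId);
    }

    // Directional: this object records only what *this* task prefers;
    // the peer task keeps its own PlacementRules instance.
    public boolean prefersToColocateWith(String taskId) {
        return affine.contains(taskId);
    }

    public boolean avoids(String taskId) {
        return antiAffine.contains(taskId);
    }
}
```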
[jira] [Commented] (YARN-1042) add ability to specify affinity/anti-affinity in container requests
[ https://issues.apache.org/jira/browse/YARN-1042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13762928#comment-13762928 ]

Junping Du commented on YARN-1042:
----------------------------------

BTW, it seems the effort is more on the application side. Do we think it is better to move this to the MAPREDUCE project?
[jira] [Commented] (YARN-292) ResourceManager throws ArrayIndexOutOfBoundsException while handling CONTAINER_ALLOCATED for application attempt
[ https://issues.apache.org/jira/browse/YARN-292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13762944#comment-13762944 ]

Hudson commented on YARN-292:
-----------------------------

SUCCESS: Integrated in Hadoop-Yarn-trunk #328 (See https://builds.apache.org/job/Hadoop-Yarn-trunk/328/)

YARN-292. Fixed FifoScheduler and FairScheduler to make their applications data structures thread safe to avoid RM crashing with ArrayIndexOutOfBoundsException. Contributed by Zhijie Shen. (vinodkv: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1521328)

* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/RMAppAttemptImpl.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fifo/FifoScheduler.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestCapacityScheduler.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestFairScheduler.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fifo/TestFifoScheduler.java

                 Key: YARN-292
                 URL: https://issues.apache.org/jira/browse/YARN-292
             Project: Hadoop YARN
          Issue Type: Sub-task
          Components: resourcemanager
    Affects Versions: 2.0.1-alpha
            Reporter: Devaraj K
            Assignee: Zhijie Shen
             Fix For: 2.1.1-beta
         Attachments: ArrayIndexOutOfBoundsException.log, YARN-292.1.patch, YARN-292.2.patch, YARN-292.3.patch, YARN-292.4.patch

{code:xml}
2012-12-26 08:41:15,030 ERROR org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler: Calling allocate on removed or non existant application appattempt_1356385141279_49525_01
2012-12-26 08:41:15,031 ERROR org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error in handling event type CONTAINER_ALLOCATED for applicationAttempt application_1356385141279_49525
java.lang.ArrayIndexOutOfBoundsException: 0
	at java.util.Arrays$ArrayList.get(Arrays.java:3381)
	at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl$AMContainerAllocatedTransition.transition(RMAppAttemptImpl.java:655)
	at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl$AMContainerAllocatedTransition.transition(RMAppAttemptImpl.java:644)
	at org.apache.hadoop.yarn.state.StateMachineFactory$SingleInternalArc.doTransition(StateMachineFactory.java:357)
	at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:298)
	at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:43)
	at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:443)
	at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:490)
	at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:80)
	at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:433)
	at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:414)
	at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:126)
	at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:75)
	at java.lang.Thread.run(Thread.java:662)
{code}
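The trace pinpoints java.util.Arrays$ArrayList.get throwing at index 0: the scheduler handed back an empty allocation for an attempt it had already removed, and the transition code indexed into it unguarded. The committed fix makes the schedulers' applications data structures thread safe; the sketch below only reproduces the symptom and shows a defensive read (illustrative names, not the patch itself):

```java
import java.util.Arrays;
import java.util.List;

public final class AllocationFetch {
    // Mimics the failing pattern: the scheduler returned an empty
    // allocation list and the caller did get(0) unconditionally.
    public static String firstContainerUnsafe(List<String> allocated) {
        return allocated.get(0); // ArrayIndexOutOfBoundsException if empty
    }

    // Defensive variant: treat an empty allocation as "nothing allocated yet".
    public static String firstContainerOrNull(List<String> allocated) {
        return allocated.isEmpty() ? null : allocated.get(0);
    }

    public static void main(String[] args) {
        // Arrays.asList over an empty array yields the same Arrays$ArrayList
        // class seen in the stack trace above.
        List<String> empty = Arrays.asList(new String[0]);
        try {
            firstContainerUnsafe(empty);
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("reproduced: " + e);
        }
        System.out.println("defensive read: " + firstContainerOrNull(empty));
    }
}
```

Note the guard only masks the symptom; the real fix addressed the race that let a removed attempt be asked to allocate at all.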
[jira] [Commented] (YARN-1152) Invalid key to HMAC computation error when getting application report for completed app attempt
[ https://issues.apache.org/jira/browse/YARN-1152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13762941#comment-13762941 ]

Hudson commented on YARN-1152:
------------------------------

SUCCESS: Integrated in Hadoop-Yarn-trunk #328 (See https://builds.apache.org/job/Hadoop-Yarn-trunk/328/)

YARN-1152. Fixed a bug in ResourceManager that was causing clients to get invalid client token key errors when an application is about to finish. Contributed by Jason Lowe. (vinodkv: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1521292)

* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/RMAppImpl.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/RMAppAttempt.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/RMAppAttemptImpl.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/TestRMAppTransitions.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/TestRMAppAttemptTransitions.java

                 Key: YARN-1152
                 URL: https://issues.apache.org/jira/browse/YARN-1152
             Project: Hadoop YARN
          Issue Type: Bug
          Components: resourcemanager
    Affects Versions: 2.1.1-beta
            Reporter: Jason Lowe
            Assignee: Jason Lowe
            Priority: Blocker
             Fix For: 2.1.1-beta
         Attachments: YARN-1152-2.txt, YARN-1152.txt

On a secure cluster, an "invalid key to HMAC computation" error is thrown when trying to get an application report for an application with an attempt that has unregistered.
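For context on the error message: a client token is essentially an HMAC over a token identifier, so verification fails once the master key it was signed with has been discarded or rotated. The standalone sketch below uses plain javax.crypto, not Hadoop's token code, and all names are illustrative:

```java
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

// Minimal token-style HMAC signing/verification; illustrates why losing
// the original master key makes a previously issued token unverifiable.
public final class TokenHmac {
    public static byte[] sign(byte[] key, byte[] identifier) {
        try {
            Mac mac = Mac.getInstance("HmacSHA1");
            mac.init(new SecretKeySpec(key, "HmacSHA1"));
            return mac.doFinal(identifier);
        } catch (GeneralSecurityException e) {
            // Roughly the class of failure the JIRA title describes.
            throw new IllegalStateException("invalid key to HMAC computation", e);
        }
    }

    // Constant-time comparison of the stored password against a recomputation.
    public static boolean verify(byte[] key, byte[] identifier, byte[] password) {
        return MessageDigest.isEqual(sign(key, identifier), password);
    }
}
```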
[jira] [Commented] (YARN-1144) Unmanaged AMs registering a tracking URI should not be proxy-fied
[ https://issues.apache.org/jira/browse/YARN-1144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13762943#comment-13762943 ]

Hudson commented on YARN-1144:
------------------------------

SUCCESS: Integrated in Hadoop-Yarn-trunk #328 (See https://builds.apache.org/job/Hadoop-Yarn-trunk/328/)

YARN-1144. Unmanaged AMs registering a tracking URI should not be proxy-fied. (tucu) (tucu: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1521039)

* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/RMAppAttemptImpl.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/TestRMAppAttemptImpl.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/TestRMAppAttemptTransitions.java

                 Key: YARN-1144
                 URL: https://issues.apache.org/jira/browse/YARN-1144
             Project: Hadoop YARN
          Issue Type: Bug
          Components: resourcemanager
    Affects Versions: 2.1.0-beta
            Reporter: Alejandro Abdelnur
            Assignee: Alejandro Abdelnur
            Priority: Critical
             Fix For: 2.1.1-beta
         Attachments: YARN-1144.patch, YARN-1144.patch, YARN-1144.patch

Unmanaged AMs do not run in the cluster; their tracking URL should not be proxy-fied.
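The rule the fix encodes is simple: only AMs that run inside the cluster get their tracking URL rewritten to go through the RM web proxy; an unmanaged AM's URL is returned untouched. A hedged sketch of that decision, where the method name and the proxy URL shape are illustrative and not YARN's actual proxy scheme:

```java
// Hypothetical decision helper mirroring the YARN-1144 behavior; the
// "/proxy/<appId>" shape is a simplification for illustration only.
public final class TrackingUrl {
    public static String resolve(boolean unmanagedAM, String trackingUrl,
                                 String proxyBase, String appId) {
        if (unmanagedAM) {
            return trackingUrl;               // unmanaged: do not proxy-fy
        }
        return proxyBase + "/proxy/" + appId; // managed: route via web proxy
    }
}
```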
[jira] [Commented] (YARN-1049) ContainerExistStatus should define a status for preempted containers
[ https://issues.apache.org/jira/browse/YARN-1049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13762940#comment-13762940 ]

Hudson commented on YARN-1049:
------------------------------

SUCCESS: Integrated in Hadoop-Yarn-trunk #328 (See https://builds.apache.org/job/Hadoop-Yarn-trunk/328/)

YARN-1049. ContainerExistStatus should define a status for preempted containers. (tucu) (tucu: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1521036)

* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/records/ContainerExitStatus.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/SchedulerUtils.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/TestSchedulerUtils.java

                 Key: YARN-1049
                 URL: https://issues.apache.org/jira/browse/YARN-1049
             Project: Hadoop YARN
          Issue Type: Bug
          Components: api
    Affects Versions: 2.1.0-beta
            Reporter: Alejandro Abdelnur
            Assignee: Alejandro Abdelnur
            Priority: Blocker
             Fix For: 2.1.1-beta
         Attachments: YARN-1049.patch

With the current behavior it is impossible to determine whether a container has been preempted or lost due to an NM crash. Adding a PREEMPTED exit status (-102) will help an AM determine that a container has been preempted.

Note the change of scope from the original summary/description: the original scope proposed API/behavior changes. Because we are past 2.1.0-beta, I'm reducing the scope of this JIRA.
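A sketch of how an AM might consume the new status. PREEMPTED = -102 comes from this JIRA; the class and method names below are illustrative, not the ContainerExitStatus API itself:

```java
// Illustrative exit-status handling; only the -102 value is from YARN-1049.
public final class ExitCodes {
    public static final int SUCCESS = 0;
    public static final int PREEMPTED = -102; // added by YARN-1049

    // With a distinct status, an AM can reschedule preempted work
    // without counting it toward the container-failure limit.
    public static boolean shouldCountAsFailure(int exitStatus) {
        return exitStatus != SUCCESS && exitStatus != PREEMPTED;
    }
}
```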
[jira] [Commented] (YARN-910) Allow auxiliary services to listen for container starts and completions
[ https://issues.apache.org/jira/browse/YARN-910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13762942#comment-13762942 ]

Hudson commented on YARN-910:
-----------------------------

SUCCESS: Integrated in Hadoop-Yarn-trunk #328 (See https://builds.apache.org/job/Hadoop-Yarn-trunk/328/)

YARN-910. Augmented auxiliary services to listen for container starts and completions in addition to application events. Contributed by Alejandro Abdelnur. (vinodkv: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1521298)

* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/server/api/AuxiliaryService.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/server/api/ContainerContext.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/server/api/ContainerInitializationContext.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/server/api/ContainerTerminationContext.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/AuxServices.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/AuxServicesEvent.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/AuxServicesEventType.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/container/ContainerImpl.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/TestAuxServices.java

                 Key: YARN-910
                 URL: https://issues.apache.org/jira/browse/YARN-910
             Project: Hadoop YARN
          Issue Type: Improvement
          Components: nodemanager
    Affects Versions: 2.1.0-beta
            Reporter: Sandy Ryza
            Assignee: Alejandro Abdelnur
             Fix For: 2.3.0
         Attachments: YARN-910.patch, YARN-910.patch, YARN-910.patch, YARN-910.patch

Making container start and completion events available to auxiliary services would allow them to be resource-aware. The auxiliary service would be able to notify a co-located service that is opportunistically using free capacity of allocation changes.
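A self-contained analogue of what this enables: a component that is told about container starts and completions so it can track how much capacity is in use. This mirrors the shape of the new hooks (per-container initialize/stop callbacks) but is not the Hadoop AuxiliaryService API; all names are illustrative.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Toy auxiliary-service-like tracker; container events arrive from the NM,
// so the counter is kept thread safe.
public final class CapacityTracker {
    private final AtomicInteger running = new AtomicInteger();

    public void containerStarted(String containerId) {
        running.incrementAndGet();
    }

    public void containerStopped(String containerId) {
        running.decrementAndGet();
    }

    // A co-located, opportunistic service could poll this to decide
    // how much free capacity it may borrow.
    public int runningContainers() {
        return running.get();
    }
}
```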
[jira] [Updated] (YARN-910) Allow auxiliary services to listen for container starts and completions
[ https://issues.apache.org/jira/browse/YARN-910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alejandro Abdelnur updated YARN-910:
------------------------------------

    Fix Version/s: 2.1.1-beta
                   (was: 2.3.0)

Committed to branch-2.1-beta and changed fix-version.
[jira] [Commented] (YARN-910) Allow auxiliary services to listen for container starts and completions
[ https://issues.apache.org/jira/browse/YARN-910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13763040#comment-13763040 ]

Hudson commented on YARN-910:
-----------------------------

SUCCESS: Integrated in Hadoop-Hdfs-trunk #1518 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk/1518/)

YARN-910. Augmented auxiliary services to listen for container starts and completions in addition to application events. Contributed by Alejandro Abdelnur. (vinodkv: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1521298)
[jira] [Commented] (YARN-292) ResourceManager throws ArrayIndexOutOfBoundsException while handling CONTAINER_ALLOCATED for application attempt
[ https://issues.apache.org/jira/browse/YARN-292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13763041#comment-13763041 ]

Hudson commented on YARN-292:
-----------------------------

SUCCESS: Integrated in Hadoop-Hdfs-trunk #1518 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk/1518/)

YARN-292. Fixed FifoScheduler and FairScheduler to make their applications data structures thread safe to avoid RM crashing with ArrayIndexOutOfBoundsException. Contributed by Zhijie Shen. (vinodkv: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1521328)
[jira] [Commented] (YARN-1152) Invalid key to HMAC computation error when getting application report for completed app attempt
[ https://issues.apache.org/jira/browse/YARN-1152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13763039#comment-13763039 ]

Hudson commented on YARN-1152:
------------------------------

SUCCESS: Integrated in Hadoop-Hdfs-trunk #1518 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk/1518/)

YARN-1152. Fixed a bug in ResourceManager that was causing clients to get invalid client token key errors when an application is about to finish. Contributed by Jason Lowe. (vinodkv: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1521292)
[jira] [Commented] (YARN-1152) Invalid key to HMAC computation error when getting application report for completed app attempt
[ https://issues.apache.org/jira/browse/YARN-1152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13763068#comment-13763068 ]

Hudson commented on YARN-1152:
------------------------------

FAILURE: Integrated in Hadoop-Mapreduce-trunk #1544 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1544/)

YARN-1152. Fixed a bug in ResourceManager that was causing clients to get invalid client token key errors when an application is about to finish. Contributed by Jason Lowe. (vinodkv: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1521292)
[jira] [Commented] (YARN-292) ResourceManager throws ArrayIndexOutOfBoundsException while handling CONTAINER_ALLOCATED for application attempt
[ https://issues.apache.org/jira/browse/YARN-292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13763070#comment-13763070 ] Hudson commented on YARN-292: - FAILURE: Integrated in Hadoop-Mapreduce-trunk #1544 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1544/]) YARN-292. Fixed FifoScheduler and FairScheduler to make their applications data structures thread safe to avoid RM crashing with ArrayIndexOutOfBoundsException. Contributed by Zhijie Shen. (vinodkv: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1521328) * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/RMAppAttemptImpl.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fifo/FifoScheduler.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestCapacityScheduler.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestFairScheduler.java * 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fifo/TestFifoScheduler.java ResourceManager throws ArrayIndexOutOfBoundsException while handling CONTAINER_ALLOCATED for application attempt Key: YARN-292 URL: https://issues.apache.org/jira/browse/YARN-292 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Affects Versions: 2.0.1-alpha Reporter: Devaraj K Assignee: Zhijie Shen Fix For: 2.1.1-beta Attachments: ArrayIndexOutOfBoundsException.log, YARN-292.1.patch, YARN-292.2.patch, YARN-292.3.patch, YARN-292.4.patch {code:xml} 2012-12-26 08:41:15,030 ERROR org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler: Calling allocate on removed or non existant application appattempt_1356385141279_49525_01 2012-12-26 08:41:15,031 ERROR org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error in handling event type CONTAINER_ALLOCATED for applicationAttempt application_1356385141279_49525 java.lang.ArrayIndexOutOfBoundsException: 0 at java.util.Arrays$ArrayList.get(Arrays.java:3381) at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl$AMContainerAllocatedTransition.transition(RMAppAttemptImpl.java:655) at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl$AMContainerAllocatedTransition.transition(RMAppAttemptImpl.java:644) at org.apache.hadoop.yarn.state.StateMachineFactory$SingleInternalArc.doTransition(StateMachineFactory.java:357) at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:298) at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:43) at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:443) at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:490) at 
org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:80) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:433) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:414) at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:126) at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:75) at java.lang.Thread.run(Thread.java:662) {code}
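The race behind this stack trace can be illustrated with a small, self-contained sketch (hypothetical class and method names, not the actual FifoScheduler code): the RM dispatcher and scheduler threads share an applications structure, and with a plain HashMap/ArrayList a removal racing with allocate() can surface as the ArrayIndexOutOfBoundsException above. Backing the lookup with a concurrent map makes it safe:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Minimal sketch of the thread-safety fix (assumed names, not YARN code).
class SchedulerMapSketch {
  private final Map<String, String> applications = new ConcurrentHashMap<>();

  void addApplicationAttempt(String attemptId) {
    applications.put(attemptId, "RUNNING");
  }

  void removeApplicationAttempt(String attemptId) {
    applications.remove(attemptId);
  }

  String allocate(String attemptId) {
    String state = applications.get(attemptId); // safe under concurrent mutation
    if (state == null) {
      // mirrors the log: "Calling allocate on removed or non existant application"
      return "ERROR: unknown attempt " + attemptId;
    }
    return "allocated: " + attemptId;
  }

  public static void main(String[] args) {
    SchedulerMapSketch s = new SchedulerMapSketch();
    s.addApplicationAttempt("appattempt_1");
    System.out.println(s.allocate("appattempt_1")); // allocated: appattempt_1
    s.removeApplicationAttempt("appattempt_1");
    System.out.println(s.allocate("appattempt_1")); // ERROR: unknown attempt appattempt_1
  }
}
```

With a concurrent map, a stale attempt id degrades to the logged error above instead of corrupting shared state and crashing the dispatcher.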
[jira] [Commented] (YARN-910) Allow auxiliary services to listen for container starts and completions
[ https://issues.apache.org/jira/browse/YARN-910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13763069#comment-13763069 ] Hudson commented on YARN-910: - FAILURE: Integrated in Hadoop-Mapreduce-trunk #1544 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1544/]) YARN-910. Augmented auxiliary services to listen for container starts and completions in addition to application events. Contributed by Alejandro Abdelnur. (vinodkv: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1521298) * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/server/api/AuxiliaryService.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/server/api/ContainerContext.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/server/api/ContainerInitializationContext.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/server/api/ContainerTerminationContext.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/AuxServices.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/AuxServicesEvent.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/AuxServicesEventType.java * 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/container/ContainerImpl.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/TestAuxServices.java Allow auxiliary services to listen for container starts and completions --- Key: YARN-910 URL: https://issues.apache.org/jira/browse/YARN-910 Project: Hadoop YARN Issue Type: Improvement Components: nodemanager Affects Versions: 2.1.0-beta Reporter: Sandy Ryza Assignee: Alejandro Abdelnur Fix For: 2.1.1-beta Attachments: YARN-910.patch, YARN-910.patch, YARN-910.patch, YARN-910.patch Making container start and completion events available to auxiliary services would allow them to be resource-aware. The auxiliary service would be able to notify a co-located service that is opportunistically using free capacity of allocation changes.
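The resource-awareness this enables can be approximated with a tiny stand-in (the interface and class names below are hypothetical; the real hooks are AuxiliaryService's initializeContainer/stopContainer methods taking the ContainerInitializationContext/ContainerTerminationContext types listed above):

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for the container start/stop callbacks this patch
// adds to auxiliary services.
interface ContainerLifecycleListener {
  void containerStarted(String containerId, long memoryBytes);
  void containerStopped(String containerId, long memoryBytes);
}

// A resource-aware aux service: by watching starts and completions it can
// tell a co-located, opportunistic service how much capacity the NM's
// containers currently hold.
class ResourceAwareAuxService implements ContainerLifecycleListener {
  private final AtomicLong usedMemory = new AtomicLong();

  public void containerStarted(String containerId, long memoryBytes) {
    usedMemory.addAndGet(memoryBytes);
  }

  public void containerStopped(String containerId, long memoryBytes) {
    usedMemory.addAndGet(-memoryBytes);
  }

  long usedMemoryBytes() {
    return usedMemory.get();
  }

  public static void main(String[] args) {
    ResourceAwareAuxService svc = new ResourceAwareAuxService();
    svc.containerStarted("container_1", 1024);
    svc.containerStarted("container_2", 2048);
    svc.containerStopped("container_1", 1024);
    System.out.println(svc.usedMemoryBytes()); // 2048
  }
}
```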
[jira] [Commented] (YARN-1042) add ability to specify affinity/anti-affinity in container requests
[ https://issues.apache.org/jira/browse/YARN-1042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13763080#comment-13763080 ] Junping Du commented on YARN-1042: -- Hi [~ste...@apache.org], as you are the creator of this jira and will probably consume this API in the HOYA project, it would be great if you could provide some input here. Thx! add ability to specify affinity/anti-affinity in container requests --- Key: YARN-1042 URL: https://issues.apache.org/jira/browse/YARN-1042 Project: Hadoop YARN Issue Type: New Feature Components: resourcemanager Affects Versions: 3.0.0 Reporter: Steve Loughran Assignee: Junping Du Attachments: YARN-1042-demo.patch container requests to the AM should be able to request anti-affinity to ensure that things like Region Servers don't come up on the same failure zones. Similarly, you may want to be able to specify affinity to the same host or rack without specifying which specific host/rack. Example: bringing up a small giraph cluster in a large YARN cluster would benefit from having the processes in the same rack purely for bandwidth reasons.
[jira] [Commented] (YARN-609) Fix synchronization issues in APIs which take in lists
[ https://issues.apache.org/jira/browse/YARN-609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13763263#comment-13763263 ] Zhijie Shen commented on YARN-609: -- Checked the three methods below. Though they're called addAll*, they seem to be used just as setters in this context. Would you please check their references as well? If they're supposed to be setters, I think it's good to modify the implementation as you did for the other setters. * NodeHeartbeatResponsePBImpl#addAllContainersToCleanup * NodeHeartbeatResponsePBImpl#addAllApplicationsToCleanup * LocalizerStatusPBImpl#addAllResources Fix synchronization issues in APIs which take in lists -- Key: YARN-609 URL: https://issues.apache.org/jira/browse/YARN-609 Project: Hadoop YARN Issue Type: Bug Reporter: Vinod Kumar Vavilapalli Assignee: Xuan Gong Attachments: YARN-609.1.patch, YARN-609.2.patch, YARN-609.3.patch, YARN-609.4.patch, YARN-609.5.patch, YARN-609.6.patch, YARN-609.7.patch, YARN-609.8.patch, YARN-609.9.patch Some of the APIs take in lists and the setter-APIs don't always do proper synchronization. We need to fix these.
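The setter pattern under discussion can be sketched as follows (a hypothetical stand-in class, not the actual PBImpl code): an addAll* method that really behaves as a setter should, under a lock, replace the internal list with a fresh copy of the caller's list rather than aliasing it, so the caller mutating its own list later cannot race with or corrupt the internal state.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for a PBImpl whose addAll* method acts as a setter:
// synchronize, overwrite rather than append, and copy defensively.
class NodeHeartbeatResponseSketch {
  private List<String> containersToCleanup = Collections.emptyList();

  synchronized void addAllContainersToCleanup(List<String> containers) {
    this.containersToCleanup = new ArrayList<>(containers); // copy, don't alias
  }

  synchronized List<String> getContainersToCleanup() {
    return Collections.unmodifiableList(containersToCleanup);
  }

  public static void main(String[] args) {
    NodeHeartbeatResponseSketch r = new NodeHeartbeatResponseSketch();
    List<String> ids = new ArrayList<>();
    ids.add("c1");
    ids.add("c2");
    r.addAllContainersToCleanup(ids);
    ids.add("c3"); // caller mutates its own list afterwards
    System.out.println(r.getContainersToCleanup().size()); // 2
  }
}
```

The defensive copy is what makes the synchronization meaningful: without it, the lock protects the field assignment but not the list contents.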
[jira] [Commented] (YARN-978) [YARN-321] Adding ApplicationAttemptReport and Protobuf implementation
[ https://issues.apache.org/jira/browse/YARN-978?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13763315#comment-13763315 ] Zhijie Shen commented on YARN-978: -- The patch looks good, but it's better to add some javadoc for YarnApplicationAttemptState and ApplicationAttemptReport, because they're user-oriented. Another question is whether all RMAppAttemptState states are meaningful enough to users to warrant the 1-to-1 mapping. I've noticed that YarnApplicationState combined FINISHING and FINISHED. Thoughts? If we decide not to expose host, rpc port, and tracking url via the rpc protocol, we should be consistent via web (YARN-954 and YARN-1023). [YARN-321] Adding ApplicationAttemptReport and Protobuf implementation -- Key: YARN-978 URL: https://issues.apache.org/jira/browse/YARN-978 Project: Hadoop YARN Issue Type: Sub-task Reporter: Mayank Bansal Assignee: Xuan Gong Fix For: YARN-321 Attachments: YARN-978-1.patch, YARN-978.2.patch, YARN-978.3.patch, YARN-978.4.patch, YARN-978.5.patch, YARN-978.6.patch We don't have ApplicationAttemptReport and Protobuf implementation. Adding that. Thanks, Mayank
[jira] [Updated] (YARN-1119) Add ClusterMetrics checks to tho TestRMNodeTransitions tests
[ https://issues.apache.org/jira/browse/YARN-1119?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mit Desai updated YARN-1119: Attachment: YARN-1119.patch Patch posted for trunk Add ClusterMetrics checks to tho TestRMNodeTransitions tests Key: YARN-1119 URL: https://issues.apache.org/jira/browse/YARN-1119 Project: Hadoop YARN Issue Type: Test Components: resourcemanager Affects Versions: 3.0.0, 0.23.9, 2.0.6-alpha Reporter: Robert Parker Assignee: Mit Desai Attachments: YARN-1119.patch, YARN-1119-v1-b23.patch YARN-1101 identified an issue where UNHEALTHY nodes could double decrement the active nodes. We should add checks for RUNNING node transitions.
[jira] [Commented] (YARN-1098) Separate out RM services into Always On and Active
[ https://issues.apache.org/jira/browse/YARN-1098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13763297#comment-13763297 ] Hudson commented on YARN-1098: -- SUCCESS: Integrated in Hadoop-trunk-Commit #4394 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/4394/]) YARN-1098. Separate out RM services into Always On and Active (Karthik Kambatla via bikas) (bikas: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1521560) * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceManager.java Separate out RM services into Always On and Active -- Key: YARN-1098 URL: https://issues.apache.org/jira/browse/YARN-1098 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Affects Versions: 2.1.0-beta Reporter: Karthik Kambatla Assignee: Karthik Kambatla Labels: ha Fix For: 2.3.0 Attachments: yarn-1098-1.patch, yarn-1098-2.patch, yarn-1098-3.patch, yarn-1098-4.patch, yarn-1098-5.patch, yarn-1098-approach.patch, yarn-1098-approach.patch From discussion on YARN-1027, it makes sense to separate out services that are stateful and stateless. The stateless services can run perennially irrespective of whether the RM is in Active/Standby state, while the stateful services need to be started on transitionToActive() and completely shutdown on transitionToStandby(). The external-facing stateless services should respond to the client/AM/NM requests depending on whether the RM is Active/Standby.
[jira] [Commented] (YARN-1119) Add ClusterMetrics checks to tho TestRMNodeTransitions tests
[ https://issues.apache.org/jira/browse/YARN-1119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13763356#comment-13763356 ] Hadoop QA commented on YARN-1119: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12602372/YARN-1119.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/1887//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/1887//console This message is automatically generated. Add ClusterMetrics checks to tho TestRMNodeTransitions tests Key: YARN-1119 URL: https://issues.apache.org/jira/browse/YARN-1119 Project: Hadoop YARN Issue Type: Test Components: resourcemanager Affects Versions: 3.0.0, 0.23.9, 2.0.6-alpha Reporter: Robert Parker Assignee: Mit Desai Attachments: YARN-1119.patch, YARN-1119-v1-b23.patch YARN-1101 identified an issue where UNHEALTHY nodes could double decrement the active nodes. We should add checks for RUNNING node transitions. 
[jira] [Commented] (YARN-1119) Add ClusterMetrics checks to tho TestRMNodeTransitions tests
[ https://issues.apache.org/jira/browse/YARN-1119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13763404#comment-13763404 ] Jonathan Eagles commented on YARN-1119: --- +1 lgtm. Thanks for the patches, Mit. Add ClusterMetrics checks to tho TestRMNodeTransitions tests Key: YARN-1119 URL: https://issues.apache.org/jira/browse/YARN-1119 Project: Hadoop YARN Issue Type: Test Components: resourcemanager Affects Versions: 3.0.0, 0.23.9, 2.0.6-alpha Reporter: Robert Parker Assignee: Mit Desai Attachments: YARN-1119.patch, YARN-1119-v1-b23.patch YARN-1101 identified an issue where UNHEALTHY nodes could double decrement the active nodes. We should add checks for RUNNING node transitions.
[jira] [Commented] (YARN-713) ResourceManager can exit unexpectedly if DNS is unavailable
[ https://issues.apache.org/jira/browse/YARN-713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13763431#comment-13763431 ] Hadoop QA commented on YARN-713: {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12602394/YARN-713.20130910.1.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 4 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/1888//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/1888//console This message is automatically generated. 
ResourceManager can exit unexpectedly if DNS is unavailable --- Key: YARN-713 URL: https://issues.apache.org/jira/browse/YARN-713 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 2.1.0-beta Reporter: Jason Lowe Assignee: Omkar Vinit Joshi Priority: Critical Fix For: 2.3.0 Attachments: YARN-713.09052013.1.patch, YARN-713.09062013.1.patch, YARN-713.20130910.1.patch, YARN-713.patch, YARN-713.patch, YARN-713.patch, YARN-713.patch As discussed in MAPREDUCE-5261, there's a possibility that a DNS outage could lead to an unhandled exception in the ResourceManager's AsyncDispatcher, and that ultimately would cause the RM to exit. The RM should not exit during DNS hiccups.
[jira] [Commented] (YARN-1119) Add ClusterMetrics checks to tho TestRMNodeTransitions tests
[ https://issues.apache.org/jira/browse/YARN-1119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13763432#comment-13763432 ] Hudson commented on YARN-1119: -- SUCCESS: Integrated in Hadoop-trunk-Commit #4397 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/4397/]) YARN-1119. Add ClusterMetrics checks to tho TestRMNodeTransitions tests (Mit Desai via jeagles) (jeagles: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1521611) * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestRMNodeTransitions.java Add ClusterMetrics checks to tho TestRMNodeTransitions tests Key: YARN-1119 URL: https://issues.apache.org/jira/browse/YARN-1119 Project: Hadoop YARN Issue Type: Test Components: resourcemanager Affects Versions: 3.0.0, 0.23.9, 2.0.6-alpha Reporter: Robert Parker Assignee: Mit Desai Attachments: YARN-1119.patch, YARN-1119-v1-b23.patch YARN-1101 identified an issue where UNHEALTHY nodes could double decrement the active nodes. We should add checks for RUNNING node transitions.
[jira] [Commented] (YARN-609) Fix synchronization issues in APIs which take in lists
[ https://issues.apache.org/jira/browse/YARN-609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13763439#comment-13763439 ] Xuan Gong commented on YARN-609: Verified, they are used just as setters. Fix synchronization issues in APIs which take in lists -- Key: YARN-609 URL: https://issues.apache.org/jira/browse/YARN-609 Project: Hadoop YARN Issue Type: Bug Reporter: Vinod Kumar Vavilapalli Assignee: Xuan Gong Attachments: YARN-609.10.patch, YARN-609.1.patch, YARN-609.2.patch, YARN-609.3.patch, YARN-609.4.patch, YARN-609.5.patch, YARN-609.6.patch, YARN-609.7.patch, YARN-609.8.patch, YARN-609.9.patch Some of the APIs take in lists and the setter-APIs don't always do proper synchronization. We need to fix these.
[jira] [Updated] (YARN-609) Fix synchronization issues in APIs which take in lists
[ https://issues.apache.org/jira/browse/YARN-609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuan Gong updated YARN-609: --- Attachment: YARN-609.10.patch Fix synchronization issues in APIs which take in lists -- Key: YARN-609 URL: https://issues.apache.org/jira/browse/YARN-609 Project: Hadoop YARN Issue Type: Bug Reporter: Vinod Kumar Vavilapalli Assignee: Xuan Gong Attachments: YARN-609.10.patch, YARN-609.1.patch, YARN-609.2.patch, YARN-609.3.patch, YARN-609.4.patch, YARN-609.5.patch, YARN-609.6.patch, YARN-609.7.patch, YARN-609.8.patch, YARN-609.9.patch Some of the APIs take in lists and the setter-APIs don't always do proper synchronization. We need to fix these.
[jira] [Commented] (YARN-867) Isolation of failures in aux services
[ https://issues.apache.org/jira/browse/YARN-867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13763452#comment-13763452 ] Hadoop QA commented on YARN-867: {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12602396/YARN-867.3.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/1889//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/1889//console This message is automatically generated. Isolation of failures in aux services -- Key: YARN-867 URL: https://issues.apache.org/jira/browse/YARN-867 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Reporter: Hitesh Shah Assignee: Xuan Gong Priority: Critical Attachments: YARN-867.1.sampleCode.patch, YARN-867.3.patch, YARN-867.sampleCode.2.patch Today, a malicious application can bring down the NM by sending bad data to a service. 
For example, sending data to the ShuffleService such that it results in any non-IOException will cause the NM's async dispatcher to exit, as the service's INIT APP event is not handled properly.
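The isolation idea can be sketched with a self-contained example (assumed names, not the actual AuxServices code): dispatch each event to every aux service inside a try/catch so a RuntimeException from one misbehaving service is recorded rather than escaping into the NM's AsyncDispatcher and killing the process.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical sketch of fault isolation for aux-service event handling.
class AuxServicesSketch {
  private final List<Consumer<String>> services = new ArrayList<>();
  private final List<String> failures = new ArrayList<>();

  void register(Consumer<String> service) {
    services.add(service);
  }

  void handle(String event) {
    for (Consumer<String> service : services) {
      try {
        service.accept(event); // e.g. APPLICATION_INIT carrying bad data
      } catch (RuntimeException e) {
        // isolate: record the failure and keep dispatching to other services
        failures.add(event + " failed: " + e.getMessage());
      }
    }
  }

  List<String> failures() {
    return failures;
  }

  public static void main(String[] args) {
    AuxServicesSketch aux = new AuxServicesSketch();
    aux.register(event -> { throw new IllegalStateException("bad data"); });
    aux.register(event -> System.out.println("shuffle handled " + event));
    aux.handle("APPLICATION_INIT");
    System.out.println(aux.failures().size()); // 1
  }
}
```

Here the first service's exception is contained, and the second service still receives the event.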
[jira] [Updated] (YARN-867) Isolation of failures in aux services
[ https://issues.apache.org/jira/browse/YARN-867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuan Gong updated YARN-867: --- Attachment: YARN-867.3.patch Isolation of failures in aux services -- Key: YARN-867 URL: https://issues.apache.org/jira/browse/YARN-867 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Reporter: Hitesh Shah Assignee: Xuan Gong Priority: Critical Attachments: YARN-867.1.sampleCode.patch, YARN-867.3.patch, YARN-867.sampleCode.2.patch Today, a malicious application can bring down the NM by sending bad data to a service. For example, sending data to the ShuffleService such that it results in any non-IOException will cause the NM's async dispatcher to exit, as the service's INIT APP event is not handled properly.
[jira] [Commented] (YARN-1027) Implement RMHAServiceProtocol
[ https://issues.apache.org/jira/browse/YARN-1027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13763470#comment-13763470 ] Karthik Kambatla commented on YARN-1027: Did some testing with several transitions to Standby and Active back and forth, and ran MR jobs when in Active mode. # The Standby mode (389719 objects worth 46661952 bytes) indeed has fewer objects and uses less memory compared to the Active mode (399819 objects worth 50104584 bytes). # The applicationId has the same timestamp from when the RM started, and starts issuing ids from 1 again. This leads to issues ranging from client-side failures due to entries in .staging/ to jobs hanging. Once enough jobs are killed, subsequent jobs can be run as usual. To address this, I think it is safe to reset the timestamp to when the RM becomes Active. # The WebUI behaves as expected. Regarding more involved tests, I was thinking of writing a MiniYARNCluster-based one that checks if the RPC servers are shut down in Standby mode. We can check if a client can request applicationId etc. Is it okay for these tests to live in hadoop-yarn-client? Or, would it make sense to create a separate module for such end-to-end tests, including future HA tests, stress tests etc.? Implement RMHAServiceProtocol - Key: YARN-1027 URL: https://issues.apache.org/jira/browse/YARN-1027 Project: Hadoop YARN Issue Type: Sub-task Reporter: Bikas Saha Assignee: Karthik Kambatla Attachments: test-yarn-1027.patch, yarn-1027-1.patch, yarn-1027-2.patch, yarn-1027-3.patch, yarn-1027-4.patch, yarn-1027-5.patch, yarn-1027-including-yarn-1098-3.patch, yarn-1027-in-rm-poc.patch Implement existing HAServiceProtocol from Hadoop common. This protocol is the single point of interaction between the RM and HA clients/services.
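The applicationId collision in point 2 above can be sketched with a small stand-in (assumed structure, not the actual RM code): ids embed the RM start timestamp plus a sequence number starting from 1, so unless the timestamp is refreshed when the RM re-enters Active, newly issued ids repeat ids handed out before the transition.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of application-id generation across an HA transition.
class AppIdGeneratorSketch {
  private long clusterTimestamp;
  private final AtomicInteger sequence = new AtomicInteger();

  AppIdGeneratorSketch(long startTimeMillis) {
    this.clusterTimestamp = startTimeMillis;
  }

  // proposed fix: reset the timestamp on every transition to Active
  void transitionToActive(long nowMillis) {
    clusterTimestamp = nowMillis;
    sequence.set(0);
  }

  String newApplicationId() {
    return "application_" + clusterTimestamp + "_" + sequence.incrementAndGet();
  }

  public static void main(String[] args) {
    AppIdGeneratorSketch gen = new AppIdGeneratorSketch(100L);
    System.out.println(gen.newApplicationId()); // application_100_1
    gen.transitionToActive(200L);
    // without the timestamp reset this would collide with the id above
    System.out.println(gen.newApplicationId()); // application_200_1
  }
}
```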
[jira] [Commented] (YARN-609) Fix synchronization issues in APIs which take in lists
[ https://issues.apache.org/jira/browse/YARN-609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13763473#comment-13763473 ] Hadoop QA commented on YARN-609: {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12602401/YARN-609.10.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/1890//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/1890//console This message is automatically generated. 
Fix synchronization issues in APIs which take in lists -- Key: YARN-609 URL: https://issues.apache.org/jira/browse/YARN-609 Project: Hadoop YARN Issue Type: Bug Reporter: Vinod Kumar Vavilapalli Assignee: Xuan Gong Attachments: YARN-609.10.patch, YARN-609.1.patch, YARN-609.2.patch, YARN-609.3.patch, YARN-609.4.patch, YARN-609.5.patch, YARN-609.6.patch, YARN-609.7.patch, YARN-609.8.patch, YARN-609.9.patch Some of the APIs take in lists and the setter-APIs don't always do proper synchronization. We need to fix these.
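The class of bug YARN-609 describes — a setter that stores a caller-owned list without protection — is commonly fixed with synchronized accessors and defensive copies. A minimal sketch with a hypothetical record type (not the actual patched YARN classes):

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical record type illustrating the fix pattern: never store
// the caller's list directly, and guard reads and writes with one lock.
public class ContainerList {
    private List<String> containerIds = new ArrayList<>();

    // Defensive copy under the lock, so a caller that keeps mutating
    // its own list cannot corrupt our state mid-iteration.
    public synchronized void setContainerIds(List<String> ids) {
        this.containerIds = new ArrayList<>(ids);
    }

    // Return an unmodifiable snapshot rather than the live list.
    public synchronized List<String> getContainerIds() {
        return Collections.unmodifiableList(new ArrayList<>(containerIds));
    }
}
```

The snapshot-on-read choice trades a small allocation for the guarantee that iterating the returned list can never race with a concurrent setter.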
[jira] [Commented] (YARN-867) Isolation of failures in aux services
[ https://issues.apache.org/jira/browse/YARN-867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13763423#comment-13763423 ] Xuan Gong commented on YARN-867: Recreated the patch based on the latest trunk, and added a new test case to test the logic. Removed the API onAuxServiceFailure; we already have onContainersCompleted() to take care of it. Isolation of failures in aux services -- Key: YARN-867 URL: https://issues.apache.org/jira/browse/YARN-867 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Reporter: Hitesh Shah Assignee: Xuan Gong Priority: Critical Attachments: YARN-867.1.sampleCode.patch, YARN-867.3.patch, YARN-867.sampleCode.2.patch Today, a malicious application can bring down the NM by sending bad data to a service. For example, sending data to the ShuffleService such that it results in any non-IOException will cause the NM's async dispatcher to exit, as the service's INIT APP event is not handled properly.
[jira] [Updated] (YARN-713) ResourceManager can exit unexpectedly if DNS is unavailable
[ https://issues.apache.org/jira/browse/YARN-713?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Omkar Vinit Joshi updated YARN-713: --- Attachment: YARN-713.20130910.1.patch ResourceManager can exit unexpectedly if DNS is unavailable --- Key: YARN-713 URL: https://issues.apache.org/jira/browse/YARN-713 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 2.1.0-beta Reporter: Jason Lowe Assignee: Omkar Vinit Joshi Priority: Critical Fix For: 2.3.0 Attachments: YARN-713.09052013.1.patch, YARN-713.09062013.1.patch, YARN-713.20130910.1.patch, YARN-713.patch, YARN-713.patch, YARN-713.patch, YARN-713.patch As discussed in MAPREDUCE-5261, there's a possibility that a DNS outage could lead to an unhandled exception in the ResourceManager's AsyncDispatcher, and that ultimately would cause the RM to exit. The RM should not exit during DNS hiccups.
[jira] [Commented] (YARN-713) ResourceManager can exit unexpectedly if DNS is unavailable
[ https://issues.apache.org/jira/browse/YARN-713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13763387#comment-13763387 ] Omkar Vinit Joshi commented on YARN-713: Fixing test case and findbugs warning. ResourceManager can exit unexpectedly if DNS is unavailable --- Key: YARN-713 URL: https://issues.apache.org/jira/browse/YARN-713 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 2.1.0-beta Reporter: Jason Lowe Assignee: Omkar Vinit Joshi Priority: Critical Fix For: 2.3.0 Attachments: YARN-713.09052013.1.patch, YARN-713.09062013.1.patch, YARN-713.patch, YARN-713.patch, YARN-713.patch, YARN-713.patch As discussed in MAPREDUCE-5261, there's a possibility that a DNS outage could lead to an unhandled exception in the ResourceManager's AsyncDispatcher, and that ultimately would cause the RM to exit. The RM should not exit during DNS hiccups.
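The YARN-713 failure mode — one unhandled exception (e.g. from a DNS lookup) escaping an event handler and terminating the RM's dispatch loop — can be sketched with a simplified single-threaded dispatcher (illustrative, not the actual AsyncDispatcher code): the fix direction is to catch and record handler failures instead of letting them propagate out of the loop.

```java
import java.net.UnknownHostException;
import java.util.ArrayList;
import java.util.List;

// Simplified stand-in for an event dispatcher. The point: a throwing
// handler must not terminate the loop, i.e. must not kill the daemon.
public class SafeDispatcher {
    public interface Handler { void handle(String event) throws Exception; }

    private final List<String> log = new ArrayList<>();

    public void dispatchAll(List<String> events, Handler handler) {
        for (String e : events) {
            try {
                handler.handle(e);
                log.add("ok:" + e);
            } catch (Exception ex) {
                // Record and keep going: a DNS hiccup on one event must
                // not bring the whole ResourceManager down.
                log.add("error:" + e);
            }
        }
    }

    public List<String> getLog() { return log; }

    public static void main(String[] args) {
        SafeDispatcher d = new SafeDispatcher();
        d.dispatchAll(List.of("launch", "resolve-host", "finish"), ev -> {
            if (ev.equals("resolve-host")) {
                throw new UnknownHostException("dns outage");
            }
        });
        // All three events were processed despite the failure.
        System.out.println(d.getLog());
    }
}
```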
[jira] [Commented] (YARN-890) The roundup for memory values on resource manager UI is misleading
[ https://issues.apache.org/jira/browse/YARN-890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13763533#comment-13763533 ] Zhijie Shen commented on YARN-890: -- The patch can ensure the UI shows the configured resource. Just thinking out loud: the problem happens because totalMB = allocatedMB + availableMB, and availableMB is rounded up, which only happens with CapacityScheduler. [~tdhavle], would you please confirm the problem only happens with CapacityScheduler? While it makes sense to round up a resource request, why do we need to round up available memory? Let's say we have 100MB available; the number will be rounded up to 1024MB. Should we then allow allocating another 1024MB container? In addition, availableMB seems to be used only by the web UI now. The roundup for memory values on resource manager UI is misleading -- Key: YARN-890 URL: https://issues.apache.org/jira/browse/YARN-890 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Reporter: Trupti Dhavle Assignee: Xuan Gong Attachments: Screen Shot 2013-07-10 at 10.43.34 AM.png, YARN-890.1.patch From the yarn-site.xml, I see the following values:
{code}
<property>
  <name>yarn.nodemanager.resource.memory-mb</name>
  <value>4192</value>
</property>
<property>
  <name>yarn.scheduler.maximum-allocation-mb</name>
  <value>4192</value>
</property>
<property>
  <name>yarn.scheduler.minimum-allocation-mb</name>
  <value>1024</value>
</property>
{code}
However the resourcemanager UI shows total memory as 5MB
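The arithmetic behind the misleading number is plausibly the scheduler's round-up to the minimum allocation: rounding the configured 4192 MB up to a multiple of 1024 MB gives 5120 MB, i.e. 5 GB, which would account for the "5" the UI reports. A minimal sketch of that normalization (the helper name is illustrative):

```java
// Round-up normalization as schedulers typically apply it:
// x rounded up to the next multiple of 'step'.
public class RoundUp {
    public static int roundUp(int x, int step) {
        return ((x + step - 1) / step) * step;
    }

    public static void main(String[] args) {
        // With yarn.scheduler.minimum-allocation-mb = 1024, the
        // configured 4192 MB normalizes to 5120 MB (5 GB).
        System.out.println(roundUp(4192, 1024)); // 5120
    }
}
```

This also illustrates Zhijie's point: applying the same round-up to *available* memory inflates it — 100 MB of genuinely free memory reports as a full 1024 MB.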
[jira] [Updated] (YARN-1078) TestNodeManagerResync, TestNodeManagerShutdown, and TestNodeStatusUpdater fail on Windows
[ https://issues.apache.org/jira/browse/YARN-1078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chuan Liu updated YARN-1078: Attachment: YARN-1078.2.patch I looked into the failure. It turns out we use InetAddress.getCanonicalHostName() to construct nodeId in ContainerManagerImpl. In the test, we assume this will always be localhost for a local loopback address, i.e. 127.0.0.1. However, this is not the case on Windows, as the method could return 127.0.0.1 instead of localhost. In the old patch, I switched from localhost to 127.0.0.1, and regressed Linux. Attaching a new patch that uses getCanonicalHostName() to obtain the name for the nodeId constructed in the tests. TestNodeManagerResync, TestNodeManagerShutdown, and TestNodeStatusUpdater fail on Windows - Key: YARN-1078 URL: https://issues.apache.org/jira/browse/YARN-1078 Project: Hadoop YARN Issue Type: Bug Affects Versions: 3.0.0, 2.3.0 Reporter: Chuan Liu Assignee: Chuan Liu Priority: Minor Attachments: YARN-1078.2.patch, YARN-1078.patch The three unit tests fail on Windows due to host name resolution differences on Windows, i.e. 127.0.0.1 does not resolve to host name localhost.
{noformat}
org.apache.hadoop.security.token.SecretManager$InvalidToken: Given Container container_0__01_00 identifier is not valid for current Node manager. Expected : 127.0.0.1:12345 Found : localhost:12345
{noformat}
{noformat}
testNMConnectionToRM(org.apache.hadoop.yarn.server.nodemanager.TestNodeStatusUpdater) Time elapsed: 8343 sec FAILURE!
org.junit.ComparisonFailure: expected:[localhost]:12345 but was:[127.0.0.1]:12345
at org.junit.Assert.assertEquals(Assert.java:125)
at org.junit.Assert.assertEquals(Assert.java:147)
at org.apache.hadoop.yarn.server.nodemanager.TestNodeStatusUpdater$MyResourceTracker6.registerNodeManager(TestNodeStatusUpdater.java:712)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:101)
at $Proxy26.registerNodeManager(Unknown Source)
at org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.registerWithRM(NodeStatusUpdaterImpl.java:212)
at org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.serviceStart(NodeStatusUpdaterImpl.java:149)
at org.apache.hadoop.yarn.server.nodemanager.TestNodeStatusUpdater$MyNodeStatusUpdater4.serviceStart(TestNodeStatusUpdater.java:369)
at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
at org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:101)
at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceStart(NodeManager.java:213)
at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
at org.apache.hadoop.yarn.server.nodemanager.TestNodeStatusUpdater.testNMConnectionToRM(TestNodeStatusUpdater.java:985)
{noformat}
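The fix direction in the patch — have the test derive its expected host from the same canonical-name lookup the production code uses, instead of hardcoding either localhost or 127.0.0.1 — can be sketched like this (the helper class is hypothetical; the lookup result really is platform- and resolver-dependent):

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// A test can build the expected nodeId from the same lookup the code
// under test performs, so both sides agree on any platform: Linux
// typically resolves 127.0.0.1 to "localhost", Windows may not.
public class NodeIdHelper {
    public static String expectedNodeId(int port) throws UnknownHostException {
        String host = InetAddress.getByName("127.0.0.1").getCanonicalHostName();
        return host + ":" + port;
    }

    public static void main(String[] args) throws Exception {
        // "localhost:12345" on most Linux boxes, possibly
        // "127.0.0.1:12345" on Windows -- either way consistent with
        // what ContainerManagerImpl would compute on the same host.
        System.out.println(expectedNodeId(12345));
    }
}
```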
[jira] [Commented] (YARN-1025) ResourceManager and NodeManager do not load native libraries on Windows.
[ https://issues.apache.org/jira/browse/YARN-1025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13763614#comment-13763614 ] Arpit Agarwal commented on YARN-1025: - +1 for the change. ResourceManager and NodeManager do not load native libraries on Windows. Key: YARN-1025 URL: https://issues.apache.org/jira/browse/YARN-1025 Project: Hadoop YARN Issue Type: Bug Components: nodemanager, resourcemanager Affects Versions: 3.0.0, 2.1.1-beta Reporter: Chris Nauroth Attachments: YARN-1025.1.patch ResourceManager and NodeManager do not have the correct setting for java.library.path when launched on Windows. This prevents the processes from loading native code from hadoop.dll. The native code is required for correct functioning on Windows (not optional), so this ultimately can cause failures.
[jira] [Commented] (YARN-938) Hadoop 2 benchmarking
[ https://issues.apache.org/jira/browse/YARN-938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13763572#comment-13763572 ] Mayank Bansal commented on YARN-938: I ran these benchmarks in collaboration with Vinod [~vinodkv]. Thanks Vinod for all your help. Attaching the results. Thanks, Mayank Hadoop 2 benchmarking -- Key: YARN-938 URL: https://issues.apache.org/jira/browse/YARN-938 Project: Hadoop YARN Issue Type: Task Reporter: Mayank Bansal Assignee: Mayank Bansal Attachments: Hadoop-benchmarking-2.x-vs-1.x.xls I am running the benchmarks on Hadoop 2 and will update the results soon. Thanks, Mayank
[jira] [Updated] (YARN-1001) YARN should provide per application-type and state statistics
[ https://issues.apache.org/jira/browse/YARN-1001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Srimanth Gunturi updated YARN-1001: --- Priority: Critical (was: Major) Ambari needs at least a way to get MapReduce app state counts. This is necessary for the upcoming Ambari release. YARN should provide per application-type and state statistics - Key: YARN-1001 URL: https://issues.apache.org/jira/browse/YARN-1001 Project: Hadoop YARN Issue Type: Task Components: api Affects Versions: 2.1.0-beta Reporter: Srimanth Gunturi Assignee: Zhijie Shen Priority: Critical Attachments: YARN-1001.1.patch, YARN-1001.2.patch In Ambari we plan to show for MR2 the number of applications finished, running, waiting, etc. It would be efficient if YARN could provide per application-type and state aggregated counts.
[jira] [Assigned] (YARN-1171) Add defaultQueueSchedulingPolicy to Fair Scheduler documentation
[ https://issues.apache.org/jira/browse/YARN-1171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla reassigned YARN-1171: -- Assignee: Karthik Kambatla Add defaultQueueSchedulingPolicy to Fair Scheduler documentation - Key: YARN-1171 URL: https://issues.apache.org/jira/browse/YARN-1171 Project: Hadoop YARN Issue Type: Improvement Components: documentation, scheduler Affects Versions: 2.1.0-beta Reporter: Sandy Ryza Assignee: Karthik Kambatla The Fair Scheduler doc is missing the defaultQueueSchedulingPolicy property. I suspect there are a few other ones too that provide defaults for all queues.
[jira] [Updated] (YARN-938) Hadoop 2 benchmarking
[ https://issues.apache.org/jira/browse/YARN-938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mayank Bansal updated YARN-938: --- Attachment: Hadoop-benchmarking-2.x-vs-1.x.xls Hadoop 2 benchmarking -- Key: YARN-938 URL: https://issues.apache.org/jira/browse/YARN-938 Project: Hadoop YARN Issue Type: Task Reporter: Mayank Bansal Assignee: Mayank Bansal Attachments: Hadoop-benchmarking-2.x-vs-1.x.xls I am running the benchmarks on Hadoop 2 and will update the results soon. Thanks, Mayank
[jira] [Updated] (YARN-1166) YARN 'appsFailed' metric should be of type 'counter'
[ https://issues.apache.org/jira/browse/YARN-1166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Srimanth Gunturi updated YARN-1166: --- Priority: Critical (was: Major) This JIRA is necessary for the upcoming Ambari release. YARN 'appsFailed' metric should be of type 'counter' Key: YARN-1166 URL: https://issues.apache.org/jira/browse/YARN-1166 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 2.1.0-beta Reporter: Srimanth Gunturi Assignee: Akira AJISAKA Priority: Critical Attachments: YARN-1166.patch Currently in YARN's queue metrics, the cumulative metric 'appsFailed' is of type 'gauge' - which means the exact value will be reported. All other cumulative queue metrics (AppsSubmitted, AppsCompleted, AppsKilled) are all of type 'counter' - meaning Ganglia will use slope to provide deltas between time-points. To be consistent, AppsFailed metric should also be of type 'counter'.
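The counter-vs-gauge distinction driving YARN-1166 can be sketched with a simplified model (Hadoop's actual types in the metrics2 library are MutableCounterInt and MutableGaugeInt; the classes below are illustrative): a counter is monotonically non-decreasing, so a monitoring system like Ganglia can derive rates from deltas between samples, while a gauge reports an arbitrary point-in-time value.

```java
// Simplified counter vs gauge semantics. A cumulative metric like
// appsFailed should be a counter: it only ever goes up.
public class Metrics {
    static class Counter {
        private long value;
        void incr() { value++; }          // monotonically non-decreasing
        long snapshot() { return value; }
    }

    static class Gauge {
        private long value;
        void set(long v) { value = v; }   // arbitrary point-in-time value
        long snapshot() { return value; }
    }

    public static void main(String[] args) {
        Counter appsFailed = new Counter();
        appsFailed.incr();
        appsFailed.incr();
        // The delta between two samples gives "failures per interval".
        System.out.println(appsFailed.snapshot());
    }
}
```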
[jira] [Commented] (YARN-1078) TestNodeManagerResync, TestNodeManagerShutdown, and TestNodeStatusUpdater fail on Windows
[ https://issues.apache.org/jira/browse/YARN-1078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13763627#comment-13763627 ] Hadoop QA commented on YARN-1078: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12602435/YARN-1078.2.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/1891//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/1891//console This message is automatically generated. TestNodeManagerResync, TestNodeManagerShutdown, and TestNodeStatusUpdater fail on Windows - Key: YARN-1078 URL: https://issues.apache.org/jira/browse/YARN-1078 Project: Hadoop YARN Issue Type: Bug Affects Versions: 3.0.0, 2.3.0 Reporter: Chuan Liu Assignee: Chuan Liu Priority: Minor Attachments: YARN-1078.2.patch, YARN-1078.patch The three unit tests fail on Windows due to host name resolution differences on Windows, i.e. 127.0.0.1 does not resolve to host name localhost. 
[jira] [Commented] (YARN-1001) YARN should provide per application-type and state statistics
[ https://issues.apache.org/jira/browse/YARN-1001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13763634#comment-13763634 ] Xuan Gong commented on YARN-1001: - +1 Looks good YARN should provide per application-type and state statistics - Key: YARN-1001 URL: https://issues.apache.org/jira/browse/YARN-1001 Project: Hadoop YARN Issue Type: Task Components: api Affects Versions: 2.1.0-beta Reporter: Srimanth Gunturi Assignee: Zhijie Shen Priority: Critical Attachments: YARN-1001.1.patch, YARN-1001.2.patch In Ambari we plan to show for MR2 the number of applications finished, running, waiting, etc. It would be efficient if YARN could provide per application-type and state aggregated counts.
[jira] [Commented] (YARN-1025) ResourceManager and NodeManager do not load native libraries on Windows.
[ https://issues.apache.org/jira/browse/YARN-1025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13763673#comment-13763673 ] Hudson commented on YARN-1025: -- SUCCESS: Integrated in Hadoop-trunk-Commit #4398 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/4398/]) YARN-1025. ResourceManager and NodeManager do not load native libraries on Windows. Contributed by Chris Nauroth. (cnauroth: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1521670) * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/bin/yarn.cmd ResourceManager and NodeManager do not load native libraries on Windows. Key: YARN-1025 URL: https://issues.apache.org/jira/browse/YARN-1025 Project: Hadoop YARN Issue Type: Bug Components: nodemanager, resourcemanager Affects Versions: 3.0.0, 2.1.1-beta Reporter: Chris Nauroth Assignee: Chris Nauroth Fix For: 3.0.0, 2.1.1-beta Attachments: YARN-1025.1.patch ResourceManager and NodeManager do not have the correct setting for java.library.path when launched on Windows. This prevents the processes from loading native code from hadoop.dll. The native code is required for correct functioning on Windows (not optional), so this ultimately can cause failures.
[jira] [Updated] (YARN-1001) YARN should provide per application-type and state statistics
[ https://issues.apache.org/jira/browse/YARN-1001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Srimanth Gunturi updated YARN-1001: --- Priority: Blocker (was: Critical) YARN should provide per application-type and state statistics - Key: YARN-1001 URL: https://issues.apache.org/jira/browse/YARN-1001 Project: Hadoop YARN Issue Type: Task Components: api Affects Versions: 2.1.0-beta Reporter: Srimanth Gunturi Assignee: Zhijie Shen Priority: Blocker Attachments: YARN-1001.1.patch, YARN-1001.2.patch In Ambari we plan to show for MR2 the number of applications finished, running, waiting, etc. It would be efficient if YARN could provide per application-type and state aggregated counts.
[jira] [Commented] (YARN-1149) NM throws InvalidStateTransitonException: Invalid event: APPLICATION_LOG_HANDLING_FINISHED at RUNNING
[ https://issues.apache.org/jira/browse/YARN-1149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13763670#comment-13763670 ] Zhijie Shen commented on YARN-1149: --- Conducted some investigation on the problem: 1. The following transition seems to be unnecessary, because APPLICATION_LOG_HANDLING_FINISHED can be emitted as early as after APPLICATION_STARTED is handled, when Application is already at INITING.
{code}
+ .addTransition(ApplicationState.NEW, ApplicationState.FINISHED,
+     ApplicationEventType.APPLICATION_LOG_HANDLING_FINISHED,
+     new AppShutDownTransition())
{code}
2. The following message seems not to cover all the cases:
{code}
+ LOG.info("Application " + app.getAppId()
+     + " is shutted down since NodeManager has been killed.");
{code}
In the normal case, APPLICATION_LOG_HANDLING_FINISHED is emitted after APPLICATION_FINISHED is handled, when Application is already at FINISHED. The two exceptions are: 1. NM is stopping, and the running log aggregation job is signaled to stop early. In this case, the log info makes sense. 2. The running log aggregation job is interrupted. See the following code:
{code}
while (!this.appFinishing.get()) {
  synchronized (this) {
    try {
      wait(THREAD_SLEEP_TIME);
    } catch (InterruptedException e) {
      LOG.warn("PendingContainers queue is interrupted");
      this.appFinishing.set(true);
    }
  }
}
{code}
In this case, the message seems not to be correct. 3. Should we do the following in AppShutDownTransition as well? Because APPLICATION_LOG_HANDLING_FINISHED is consumed, there will be no FINISHED to FINISHED transition on APPLICATION_LOG_HANDLING_FINISHED, and then the app will always remain in the context.
{code}
app.context.getApplications().remove(appId);
app.aclsManager.removeApplication(appId);
{code}
NM throws InvalidStateTransitonException: Invalid event: APPLICATION_LOG_HANDLING_FINISHED at RUNNING - Key: YARN-1149 URL: https://issues.apache.org/jira/browse/YARN-1149 Project: Hadoop YARN Issue Type: Bug Reporter: Ramya Sunil Assignee: Xuan Gong Fix For: 2.1.1-beta Attachments: YARN-1149.1.patch When the nodemanager receives a kill signal after an application has finished execution but before log aggregation has kicked in, InvalidStateTransitonException: Invalid event: APPLICATION_LOG_HANDLING_FINISHED at RUNNING is thrown
{noformat}
2013-08-25 20:45:00,875 INFO logaggregation.AppLogAggregatorImpl (AppLogAggregatorImpl.java:finishLogAggregation(254)) - Application just finished : application_1377459190746_0118
2013-08-25 20:45:00,876 INFO logaggregation.AppLogAggregatorImpl (AppLogAggregatorImpl.java:uploadLogsForContainer(105)) - Starting aggregate log-file for app application_1377459190746_0118 at /app-logs/foo/logs/application_1377459190746_0118/host_45454.tmp
2013-08-25 20:45:00,876 INFO logaggregation.LogAggregationService (LogAggregationService.java:stopAggregators(151)) - Waiting for aggregation to complete for application_1377459190746_0118
2013-08-25 20:45:00,891 INFO logaggregation.AppLogAggregatorImpl (AppLogAggregatorImpl.java:uploadLogsForContainer(122)) - Uploading logs for container container_1377459190746_0118_01_04. Current good log dirs are /tmp/yarn/local
2013-08-25 20:45:00,915 INFO logaggregation.AppLogAggregatorImpl (AppLogAggregatorImpl.java:doAppLogAggregation(182)) - Finished aggregate log-file for app application_1377459190746_0118
2013-08-25 20:45:00,925 WARN application.Application (ApplicationImpl.java:handle(427)) - Can't handle this event at current state
org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: APPLICATION_LOG_HANDLING_FINISHED at RUNNING
at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:305)
at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.application.ApplicationImpl.handle(ApplicationImpl.java:425)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.application.ApplicationImpl.handle(ApplicationImpl.java:59)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl$ApplicationEventDispatcher.handle(ContainerManagerImpl.java:697)
at
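The invalid-transition failure discussed above can be modeled with a minimal table-driven state machine (illustrative; YARN's real one is built with org.apache.hadoop.yarn.state.StateMachineFactory): an event arriving in a state with no registered transition throws, which is exactly what happens when APPLICATION_LOG_HANDLING_FINISHED arrives while the Application is still at RUNNING.

```java
import java.util.HashMap;
import java.util.Map;

// Table-driven state machine model: transitions map (state, event) to
// a next state, and an unregistered pair throws, mirroring YARN's
// InvalidStateTransitonException.
public class AppStateMachine {
    private final Map<String, String> table = new HashMap<>();
    private String state;

    public AppStateMachine(String initial) { this.state = initial; }

    public void addTransition(String from, String event, String to) {
        table.put(from + "/" + event, to);
    }

    public void handle(String event) {
        String next = table.get(state + "/" + event);
        if (next == null) {
            throw new IllegalStateException(
                "Invalid event: " + event + " at " + state);
        }
        state = next;
    }

    public String getState() { return state; }

    public static void main(String[] args) {
        AppStateMachine app = new AppStateMachine("RUNNING");
        app.addTransition("RUNNING", "APPLICATION_FINISHED", "FINISHED");
        // Without an entry for APPLICATION_LOG_HANDLING_FINISHED at
        // RUNNING, the kill-during-log-aggregation ordering throws here:
        try {
            app.handle("APPLICATION_LOG_HANDLING_FINISHED");
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Registering the extra transition is the fix pattern; Zhijie's comments above are about which (state, event) pairs genuinely need it.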
[jira] [Updated] (YARN-1166) YARN 'appsFailed' metric should be of type 'counter'
[ https://issues.apache.org/jira/browse/YARN-1166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Srimanth Gunturi updated YARN-1166: --- Priority: Blocker (was: Critical) YARN 'appsFailed' metric should be of type 'counter' Key: YARN-1166 URL: https://issues.apache.org/jira/browse/YARN-1166 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 2.1.0-beta Reporter: Srimanth Gunturi Assignee: Akira AJISAKA Priority: Blocker Attachments: YARN-1166.2.patch, YARN-1166.patch Currently in YARN's queue metrics, the cumulative metric 'appsFailed' is of type 'gauge' - which means the exact value will be reported. All other cumulative queue metrics (AppsSubmitted, AppsCompleted, AppsKilled) are all of type 'counter' - meaning Ganglia will use slope to provide deltas between time-points. To be consistent, AppsFailed metric should also be of type 'counter'.
[jira] [Commented] (YARN-1098) Separate out RM services into Always On and Active
[ https://issues.apache.org/jira/browse/YARN-1098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13763683#comment-13763683 ] Karthik Kambatla commented on YARN-1098: [~bikassaha] and [~tucu00], thanks for the reviews. Separate out RM services into Always On and Active -- Key: YARN-1098 URL: https://issues.apache.org/jira/browse/YARN-1098 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Affects Versions: 2.1.0-beta Reporter: Karthik Kambatla Assignee: Karthik Kambatla Labels: ha Fix For: 2.3.0 Attachments: yarn-1098-1.patch, yarn-1098-2.patch, yarn-1098-3.patch, yarn-1098-4.patch, yarn-1098-5.patch, yarn-1098-approach.patch, yarn-1098-approach.patch From discussion on YARN-1027, it makes sense to separate out services that are stateful and stateless. The stateless services can run perennially irrespective of whether the RM is in Active/Standby state, while the stateful services need to be started on transitionToActive() and completely shutdown on transitionToStandby(). The external-facing stateless services should respond to the client/AM/NM requests depending on whether the RM is Active/Standby.
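The Always-On / Active split described in the YARN-1098 summary can be sketched with a simplified model (service names and the class below are illustrative, not the real RM composite-service code): always-on services run for the daemon's whole life, while active-only services start on transitionToActive() and stop on transitionToStandby().

```java
import java.util.ArrayList;
import java.util.List;

// Model of the stateless/stateful service split. Always-on services
// run regardless of HA state; active-only services track transitions.
public class HaResourceManager {
    private final List<String> running = new ArrayList<>();
    private final List<String> alwaysOn = List.of("webUI", "adminService");
    private final List<String> activeOnly = List.of("scheduler", "appMasterService");

    public void start() { running.addAll(alwaysOn); }

    public void transitionToActive() {
        for (String s : activeOnly) {
            if (!running.contains(s)) running.add(s);
        }
    }

    public void transitionToStandby() {
        // Active-only services shut down completely; always-on ones stay.
        running.removeAll(activeOnly);
    }

    public List<String> getRunning() { return running; }
}
```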
[jira] [Updated] (YARN-1166) YARN 'appsFailed' metric should be of type 'counter'
[ https://issues.apache.org/jira/browse/YARN-1166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akira AJISAKA updated YARN-1166: Attachment: YARN-1166.2.patch Attached a patch to pass TestLeafQueue. YARN 'appsFailed' metric should be of type 'counter' Key: YARN-1166 URL: https://issues.apache.org/jira/browse/YARN-1166 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 2.1.0-beta Reporter: Srimanth Gunturi Assignee: Akira AJISAKA Priority: Critical Attachments: YARN-1166.2.patch, YARN-1166.patch Currently in YARN's queue metrics, the cumulative metric 'appsFailed' is of type 'gauge' - which means the exact value will be reported. All other cumulative queue metrics (AppsSubmitted, AppsCompleted, AppsKilled) are all of type 'counter' - meaning Ganglia will use slope to provide deltas between time-points. To be consistent, AppsFailed metric should also be of type 'counter'.
[jira] [Commented] (YARN-910) Allow auxiliary services to listen for container starts and completions
[ https://issues.apache.org/jira/browse/YARN-910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13763653#comment-13763653 ] Vinod Kumar Vavilapalli commented on YARN-910: -- bq. Vinod Kumar Vavilapalli, thanks. Any reason not have this in for 2.1.1-beta? It's just that I didn't see a target version. And this is new functionality, so committed to 2.3 by default. I already see you merged into 2.1. Allow auxiliary services to listen for container starts and completions --- Key: YARN-910 URL: https://issues.apache.org/jira/browse/YARN-910 Project: Hadoop YARN Issue Type: Improvement Components: nodemanager Affects Versions: 2.1.0-beta Reporter: Sandy Ryza Assignee: Alejandro Abdelnur Fix For: 2.1.1-beta Attachments: YARN-910.patch, YARN-910.patch, YARN-910.patch, YARN-910.patch Making container start and completion events available to auxiliary services would allow them to be resource-aware. The auxiliary service would be able to notify a co-located service that is opportunistically using free capacity of allocation changes.
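The YARN-910 idea — auxiliary services observing container starts and completions so a co-located service can react to allocation changes — is a plain observer pattern. A sketch with invented names (the real hooks live on the NM's AuxiliaryService class; nothing below is the actual API):

```python
class AuxService:
    """Hypothetical auxiliary service notified of container lifecycle events."""
    def __init__(self):
        self.active_containers = set()

    def on_container_start(self, container_id):
        self.active_containers.add(container_id)

    def on_container_stop(self, container_id):
        self.active_containers.discard(container_id)
        # Here a co-located service could be told that capacity was freed.


class NodeManagerDispatcher:
    """Fans container events out to registered aux services."""
    def __init__(self):
        self.aux_services = []

    def register(self, svc):
        self.aux_services.append(svc)

    def container_started(self, cid):
        for svc in self.aux_services:
            svc.on_container_start(cid)

    def container_finished(self, cid):
        for svc in self.aux_services:
            svc.on_container_stop(cid)
```

A design point the discussion implies: the NM pushes events to the aux service rather than the aux service polling container state, so the co-located service learns of allocation changes promptly.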
[jira] [Commented] (YARN-1166) YARN 'appsFailed' metric should be of type 'counter'
[ https://issues.apache.org/jira/browse/YARN-1166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13763694#comment-13763694 ] Hadoop QA commented on YARN-1166: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12602448/YARN-1166.2.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/1892//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/1892//console This message is automatically generated. YARN 'appsFailed' metric should be of type 'counter' Key: YARN-1166 URL: https://issues.apache.org/jira/browse/YARN-1166 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 2.1.0-beta Reporter: Srimanth Gunturi Assignee: Akira AJISAKA Priority: Blocker Attachments: YARN-1166.2.patch, YARN-1166.patch Currently in YARN's queue metrics, the cumulative metric 'appsFailed' is of type 'guage' - which means the exact value will be reported. 
[jira] [Commented] (YARN-609) Fix synchronization issues in APIs which take in lists
[ https://issues.apache.org/jira/browse/YARN-609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13763557#comment-13763557 ] Zhijie Shen commented on YARN-609: -- +1 LGTM Fix synchronization issues in APIs which take in lists -- Key: YARN-609 URL: https://issues.apache.org/jira/browse/YARN-609 Project: Hadoop YARN Issue Type: Bug Reporter: Vinod Kumar Vavilapalli Assignee: Xuan Gong Attachments: YARN-609.10.patch, YARN-609.1.patch, YARN-609.2.patch, YARN-609.3.patch, YARN-609.4.patch, YARN-609.5.patch, YARN-609.6.patch, YARN-609.7.patch, YARN-609.8.patch, YARN-609.9.patch Some of the APIs take in lists and the setter-APIs don't always do proper synchronization. We need to fix these.
[jira] [Updated] (YARN-1027) Implement RMHAProtocolService
[ https://issues.apache.org/jira/browse/YARN-1027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla updated YARN-1027: --- Summary: Implement RMHAProtocolService (was: Implement RMHAServiceProtocol) Implement RMHAProtocolService - Key: YARN-1027 URL: https://issues.apache.org/jira/browse/YARN-1027 Project: Hadoop YARN Issue Type: Sub-task Reporter: Bikas Saha Assignee: Karthik Kambatla Attachments: test-yarn-1027.patch, yarn-1027-1.patch, yarn-1027-2.patch, yarn-1027-3.patch, yarn-1027-4.patch, yarn-1027-5.patch, yarn-1027-including-yarn-1098-3.patch, yarn-1027-in-rm-poc.patch Implement existing HAServiceProtocol from Hadoop common. This protocol is the single point of interaction between the RM and HA clients/services.
[jira] [Updated] (YARN-1027) Implement RMHAProtocolService
[ https://issues.apache.org/jira/browse/YARN-1027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla updated YARN-1027: --- Attachment: yarn-1027-6.patch Updated patch to add ha config to yarn-default.xml and have the RM#clusterTimeStamp reflect when the RM became Active. Submitting patch to check what Jenkins has to say.
[jira] [Commented] (YARN-867) Isolation of failures in aux services
[ https://issues.apache.org/jira/browse/YARN-867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13763749#comment-13763749 ] Zhijie Shen commented on YARN-867: -- How about issuing a KILL_CONTAINER event instead of CONTAINER_EXITED_WITH_FAILURE, which is already handled in all container states? Otherwise, we need to add the transition to EXITED_WITH_FAILURE from a number of states, and I'm not sure it is easy to ensure all those transitions are correct. Isolation of failures in aux services -- Key: YARN-867 URL: https://issues.apache.org/jira/browse/YARN-867 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Reporter: Hitesh Shah Assignee: Xuan Gong Priority: Critical Attachments: YARN-867.1.sampleCode.patch, YARN-867.3.patch, YARN-867.sampleCode.2.patch Today, a malicious application can bring down the NM by sending bad data to a service. For example, sending data to the ShuffleService such that it results in any non-IOException will cause the NM's async dispatcher to exit, as the service's INIT APP event is not handled properly.
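The argument here — prefer reusing an event that every state already handles (KILL_CONTAINER) over adding a new transition out of many states — is a standard state-machine trade-off. A toy transition table makes it concrete (this is not the NM's actual state machine; states and events are a simplified subset):

```python
# Transition table: (state, event) -> next state.
TRANSITIONS = {
    ("NEW", "START"): "RUNNING",
    ("RUNNING", "EXIT_SUCCESS"): "DONE",
    # KILL_CONTAINER already has an edge out of every non-terminal state,
    # so a failing aux service can reuse it instead of wiring new
    # EXITED_WITH_FAILURE edges from each state.
    ("NEW", "KILL_CONTAINER"): "KILLING",
    ("RUNNING", "KILL_CONTAINER"): "KILLING",
    ("KILLING", "EXIT_SUCCESS"): "DONE",
}

def step(state, event):
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        # An unhandled (state, event) pair is exactly the kind of hole
        # that crashes an async dispatcher.
        raise RuntimeError(f"invalid transition: {event} in {state}")
```

With a new event, every missing `(state, CONTAINER_EXITED_WITH_FAILURE)` entry would be a latent crash; reusing KILL_CONTAINER keeps the table closed by construction.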
[jira] [Commented] (YARN-938) Hadoop 2 benchmarking
[ https://issues.apache.org/jira/browse/YARN-938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13763798#comment-13763798 ] Sandy Ryza commented on YARN-938: - Thanks for working on these, [~mayank_bansal]. The results are pretty consistent with some internal benchmarking we've done at Cloudera. A few questions: * In MR1 was io.sort.record.percent tuned to spill the same number of times as MR2 does? * What was slowstart completed maps set to? * How many slots and MB were the TTs and NMs configured with? * Any idea what caused the improvement between RC1 and the final release? I'm guessing MAPREDUCE-5399 helped. Hadoop 2 benchmarking -- Key: YARN-938 URL: https://issues.apache.org/jira/browse/YARN-938 Project: Hadoop YARN Issue Type: Task Reporter: Mayank Bansal Assignee: Mayank Bansal Attachments: Hadoop-benchmarking-2.x-vs-1.x.xls I am running the benchmarks on Hadoop 2 and will update the results soon. Thanks, Mayank
[jira] [Commented] (YARN-938) Hadoop 2 benchmarking
[ https://issues.apache.org/jira/browse/YARN-938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13763817#comment-13763817 ] Vinod Kumar Vavilapalli commented on YARN-938: -- bq. The results are pretty consistent with some internal benchmarking we've done at Cloudera. Interesting, do you mind sharing those results?
[jira] [Commented] (YARN-938) Hadoop 2 benchmarking
[ https://issues.apache.org/jira/browse/YARN-938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13763829#comment-13763829 ] Sandy Ryza commented on YARN-938: - On vacation now, but I'll try to assemble them into a presentable form when I get back.
[jira] [Commented] (YARN-1027) Implement RMHAProtocolService
[ https://issues.apache.org/jira/browse/YARN-1027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13763832#comment-13763832 ] Hadoop QA commented on YARN-1027: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12602463/yarn-1027-6.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 5 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-common-project/hadoop-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/1893//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/1893//console This message is automatically generated. 
[jira] [Commented] (YARN-938) Hadoop 2 benchmarking
[ https://issues.apache.org/jira/browse/YARN-938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13763838#comment-13763838 ] Nemon Lou commented on YARN-938: Thanks, Mayank Bansal, for your work. Do you mind sharing how much input data you ran for TeraSort?
[jira] [Commented] (YARN-1042) add ability to specify affinity/anti-affinity in container requests
[ https://issues.apache.org/jira/browse/YARN-1042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13763848#comment-13763848 ] Junping Du commented on YARN-1042: -- Thanks for the comments, Luke! bq. Although you can do a lot at the app side with container filtering, protocol and scheduler support will make it more efficient. I guess the intention of the jira is more for the latter, that affinity support should be app independent. Oh, that reminds me that the intention of this JIRA may be on the RM side (so "container requests" in the title may be better phrased as "resource requests"), as long-lived services may use a different AppMaster from the default one I changed here. Also, I agree that doing it on the RM side may be more efficient, as there is no need to hand back containers on the app side when they violate affinity rules. However, my concern is that it may add extra complexity to the RM, since it makes the RM aware of the affinity/anti-affinity grouping of tasks (or resource requests). IMO, one simplicity and beauty of YARN is that the RM only takes care of abstracted resource requests and does container allocation accordingly. I am not sure whether putting resource requests into affinity/anti-affinity groups and tracking relationships between resource requests hurts this beauty. Thoughts? add ability to specify affinity/anti-affinity in container requests --- Key: YARN-1042 URL: https://issues.apache.org/jira/browse/YARN-1042 Project: Hadoop YARN Issue Type: New Feature Components: resourcemanager Affects Versions: 3.0.0 Reporter: Steve Loughran Assignee: Junping Du Attachments: YARN-1042-demo.patch container requests to the AM should be able to request anti-affinity to ensure that things like Region Servers don't come up on the same failure zones. Similarly, you may want to specify affinity to the same host or rack without specifying which specific host/rack. Example: bringing up a small giraph cluster in a large YARN cluster would benefit from having the processes in the same rack purely for bandwidth reasons.
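A node-level affinity/anti-affinity check of the kind debated in YARN-1042 can be sketched as a filter the scheduler would apply before placing a request. This is an illustrative sketch only, not the demo patch's code; function and parameter names are invented, and (touching open question 2) this version lets anti-affinity override affinity on conflict, whereas the demo patch gives affinity the higher priority:

```python
def candidate_nodes(nodes, placements, affinity=(), anti_affinity=()):
    """Return the nodes where a new task may be placed under hard
    ("must conform") node-level affinity/anti-affinity rules.

    nodes         -- iterable of node names
    placements    -- dict of task_id -> node the task currently runs on
    affinity      -- task_ids the new task must be co-located with
    anti_affinity -- task_ids the new task must not share a node with
    """
    required = {placements[t] for t in affinity if t in placements}
    forbidden = {placements[t] for t in anti_affinity if t in placements}
    ok = []
    for n in nodes:
        if required and n not in required:
            continue  # must land on a node hosting an affine task
        if n in forbidden:
            continue  # must avoid nodes hosting anti-affine tasks
        ok.append(n)
    return ok
```

Note the hard semantics: when `required` and `forbidden` leave no candidates, the request simply waits — exactly the starvation risk raised in open question 5, which softer ("preferred") rules would avoid.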
[jira] [Commented] (YARN-978) [YARN-321] Adding ApplicationAttemptReport and Protobuf implementation
[ https://issues.apache.org/jira/browse/YARN-978?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13763898#comment-13763898 ] Xuan Gong commented on YARN-978: bq. FINISHING to FINISHED is a bad merge, and indeed is a bug, will file a ticket. But agree with the general sentiment. LAUNCHED_UNMANAGED_SAVING can be mapped to ALLOCATED_SAVING, I actually think we don't need a separate LAUNCHED_UNMANAGED_SAVING, Unmanaged AM should directly go to ALLOCATED state on app-submission, will file a bug. The rest seem fine enough for me. Fixed. Now YarnApplicationAttemptState has FINISHED and FINISHING, and RMAppAttemptState.LAUNCHED_UNMANAGED_SAVING is mapped to YarnApplicationAttemptState.ALLOCATED_SAVING. bq. I think we should add the host and port information for information purposes so that users can reason where their previous AMs ran and on what ports. The tracking url can be removed, instead a logs-url can be added like we have on the UI. Added host, rpc_port and logsUrl to ApplicationAttemptReport. bq. No need to add more stuff to BuilderUtils. It was supposed to be dismantled. Removed. bq. Do we really need the prefix APP_ATTEMPT_ in YarnApplicationAttemptStateProto? Yes, we have to, just like we added the APP prefix in FinalApplicationStatusProto. The reason is that in Protocol Buffers, enum values use C++ scoping rules, meaning that enum values are siblings of their type, not children of it. All enum values must be unique within the enclosing scope, not just within YarnApplicationAttemptStateProto. [YARN-321] Adding ApplicationAttemptReport and Protobuf implementation -- Key: YARN-978 URL: https://issues.apache.org/jira/browse/YARN-978 Project: Hadoop YARN Issue Type: Sub-task Reporter: Mayank Bansal Assignee: Xuan Gong Fix For: YARN-321 Attachments: YARN-978-1.patch, YARN-978.2.patch, YARN-978.3.patch, YARN-978.4.patch, YARN-978.5.patch, YARN-978.6.patch We don't have ApplicationAttemptReport and Protobuf implementation. Adding that.
Thanks, Mayank
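The C++ scoping rule Xuan cites is easy to reproduce: protobuf enum values live in the scope enclosing the enum type, not inside it, so two enums in the same scope cannot both define a bare FAILED, and a per-enum prefix is required. An illustrative fragment only (value numbers and layout here are invented, not copied from yarn_protos.proto):

```proto
enum FinalApplicationStatusProto {
  APP_FAILED = 1;          // prefixed; a bare FAILED would be injected
                           // into the enclosing scope
}

enum YarnApplicationAttemptStateProto {
  APP_ATTEMPT_FAILED = 1;  // a bare FAILED here would collide with the
                           // other enum's FAILED in the same scope
}
```

Dropping either prefix makes protoc reject the file with a duplicate-symbol error, which is why both enums carry one.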
[jira] [Updated] (YARN-978) [YARN-321] Adding ApplicationAttemptReport and Protobuf implementation
[ https://issues.apache.org/jira/browse/YARN-978?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuan Gong updated YARN-978: --- Attachment: YARN-978.7.patch
[jira] [Commented] (YARN-978) [YARN-321] Adding ApplicationAttemptReport and Protobuf implementation
[ https://issues.apache.org/jira/browse/YARN-978?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13763946#comment-13763946 ] Hadoop QA commented on YARN-978: {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12602505/YARN-978.7.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/1894//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/1894//console This message is automatically generated. 
[jira] [Commented] (YARN-867) Isolation of failures in aux services
[ https://issues.apache.org/jira/browse/YARN-867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13764007#comment-13764007 ] Xuan Gong commented on YARN-867: bq. I think we should handle AuxServicesEventType.APPLICATION_INIT and the stop event in Application and not container. That should simplify THIS patch a lot. I do not see the benefit. When any aux service fails in a container, we need to fail that container. If we handle the AuxServicesEventType in Application, then from Application we would eventually need to tell that particular container (not all the containers) to exit with failure. That goes through the same process as handling it from the container directly. If there is no difference, why increase the traffic (more events) for the application?