[jira] [Updated] (YARN-1163) Cleanup code for AssignMapsWithLocality() in RMContainerAllocator
[ https://issues.apache.org/jira/browse/YARN-1163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Junping Du updated YARN-1163:
-----------------------------
Issue Type: Sub-task (was: Improvement)
Parent: YARN-18

Cleanup code for AssignMapsWithLocality() in RMContainerAllocator
-----------------------------------------------------------------
Key: YARN-1163
URL: https://issues.apache.org/jira/browse/YARN-1163
Project: Hadoop YARN
Issue Type: Sub-task
Components: applications
Reporter: Junping Du
Assignee: Junping Du
Priority: Minor

In RMContainerAllocator, AssignMapsWithLocality() is an important method that assigns map tasks to allocated containers while honoring different levels of locality (data-local, rack-local, etc.). However, it currently mixes separate code paths for the different locality types even though they share much of the same behaviour. That makes the method hard to maintain and hard to extend with new locality types, so the code needs to be cleaned up.
[jira] [Commented] (YARN-1160) allow admins to force app deployment on a specific host
[ https://issues.apache.org/jira/browse/YARN-1160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13761723#comment-13761723 ]

Steve Loughran commented on YARN-1160:
--------------------------------------
Seems reasonable. The key thing is to ignore resource allocations, both in the demand of the app and in the containers deployed, which lets people deploy static applications across the cluster, using YARN to push out the binaries and report failures to an AM.

allow admins to force app deployment on a specific host
--------------------------------------------------------
Key: YARN-1160
URL: https://issues.apache.org/jira/browse/YARN-1160
Project: Hadoop YARN
Issue Type: Improvement
Components: resourcemanager
Affects Versions: 3.0.0
Reporter: Steve Loughran
Priority: Minor

Currently you ask YARN for slots on a host and it finds a slot on that machine, or, if the machine is unavailable or has no room, on a nearby host as far as the topology is concerned. People with admin rights should have the option to deploy a process on a specific host and have it run there even if there are no free slots, and to have the request fail if the machine is not available. This would let you deploy admin-specific processes across a cluster.
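For context, the closest thing the API offers today is a node-specific ResourceRequest with locality relaxation disabled (YARN-392), which still goes through normal resource accounting, i.e. exactly what this proposal would bypass. A minimal sketch of that baseline; the host name and container size here are illustrative:

{code:java}
import org.apache.hadoop.yarn.api.records.Priority;
import org.apache.hadoop.yarn.api.records.Resource;
import org.apache.hadoop.yarn.api.records.ResourceRequest;

public class StrictNodeRequest {
  // One container on a specific host; relaxLocality=false forbids the
  // scheduler from falling back to the rack or to ANY.
  public static ResourceRequest onHost(String host) {
    Resource capability = Resource.newInstance(1024, 1); // 1 GB, 1 vcore
    return ResourceRequest.newInstance(
        Priority.newInstance(0), host, capability,
        1 /* numContainers */, false /* relaxLocality */);
  }
}
{code}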
[jira] [Commented] (YARN-1144) Unmanaged AMs registering a tracking URI should not be proxy-fied
[ https://issues.apache.org/jira/browse/YARN-1144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13761732#comment-13761732 ]

Tom White commented on YARN-1144:
---------------------------------
+1

Unmanaged AMs registering a tracking URI should not be proxy-fied
------------------------------------------------------------------
Key: YARN-1144
URL: https://issues.apache.org/jira/browse/YARN-1144
Project: Hadoop YARN
Issue Type: Bug
Components: resourcemanager
Affects Versions: 2.1.0-beta
Reporter: Alejandro Abdelnur
Assignee: Alejandro Abdelnur
Priority: Critical
Fix For: 2.1.1-beta
Attachments: YARN-1144.patch, YARN-1144.patch, YARN-1144.patch

Unmanaged AMs do not run in the cluster, so their tracking URL should not be proxy-fied.
[jira] [Created] (YARN-1171) Add defaultQueueSchedulingPolicy to Fair Scheduler documentation
Sandy Ryza created YARN-1171:
-----------------------------
Summary: Add defaultQueueSchedulingPolicy to Fair Scheduler documentation
Key: YARN-1171
URL: https://issues.apache.org/jira/browse/YARN-1171
Project: Hadoop YARN
Issue Type: Improvement
Components: documentation, scheduler
Affects Versions: 2.1.0-beta
Reporter: Sandy Ryza

The Fair Scheduler doc is missing the defaultQueueSchedulingPolicy property. I suspect a few of the other properties that provide defaults for all queues are missing as well.
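For reference, defaultQueueSchedulingPolicy is a top-level element of the fair-scheduler allocations file; a minimal sketch of what a documented example might look like (the queue name and the "fair" value are illustrative):

{code:xml}
<?xml version="1.0"?>
<allocations>
  <!-- Policy used by any queue that does not set its own schedulingPolicy:
       one of "fifo", "fair", or "drf". -->
  <defaultQueueSchedulingPolicy>fair</defaultQueueSchedulingPolicy>

  <queue name="research">
    <!-- Inherits the default scheduling policy declared above. -->
  </queue>
</allocations>
{code}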
[jira] [Commented] (YARN-1049) ContainerExitStatus should define a status for preempted containers
[ https://issues.apache.org/jira/browse/YARN-1049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13761775#comment-13761775 ]

Hudson commented on YARN-1049:
------------------------------
SUCCESS: Integrated in Hadoop-trunk-Commit #4388 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/4388/])
YARN-1049. ContainerExitStatus should define a status for preempted containers. (tucu) (tucu: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1521036)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/records/ContainerExitStatus.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/SchedulerUtils.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/TestSchedulerUtils.java

ContainerExitStatus should define a status for preempted containers
--------------------------------------------------------------------
Key: YARN-1049
URL: https://issues.apache.org/jira/browse/YARN-1049
Project: Hadoop YARN
Issue Type: Bug
Components: api
Affects Versions: 2.1.0-beta
Reporter: Alejandro Abdelnur
Assignee: Alejandro Abdelnur
Priority: Blocker
Fix For: 2.1.1-beta
Attachments: YARN-1049.patch

With the current behavior it is impossible to determine whether a container has been preempted or lost due to an NM crash. Adding a PREEMPTED exit status (-102) will help an AM determine that a container has been preempted. Note the change of scope from the original summary/description: the original scope proposed API/behavior changes. Because we are past 2.1.0-beta, I'm reducing the scope of this JIRA.
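The new constant gives an AM a way to tell preemption apart from genuine failure when processing completed containers; a minimal sketch of such a handler (the handler class itself is hypothetical, the constants come from this change):

{code:java}
import org.apache.hadoop.yarn.api.records.ContainerExitStatus;
import org.apache.hadoop.yarn.api.records.ContainerStatus;

public class CompletedContainerHandler {
  // Called for each ContainerStatus returned by an allocate() response.
  public static void onCompleted(ContainerStatus status) {
    if (status.getExitStatus() == ContainerExitStatus.PREEMPTED) {
      // Preempted by the scheduler (-102): re-request the work without
      // counting it as a task failure.
      System.out.println("Preempted: " + status.getContainerId());
    } else if (status.getExitStatus() != ContainerExitStatus.SUCCESS) {
      // Genuine failure, e.g. an NM crash: may count against retry limits.
      System.out.println("Failed: " + status.getContainerId()
          + " diagnostics: " + status.getDiagnostics());
    }
  }
}
{code}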
[jira] [Commented] (YARN-1144) Unmanaged AMs registering a tracking URI should not be proxy-fied
[ https://issues.apache.org/jira/browse/YARN-1144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13761780#comment-13761780 ]

Hudson commented on YARN-1144:
------------------------------
SUCCESS: Integrated in Hadoop-trunk-Commit #4389 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/4389/])
YARN-1144. Unmanaged AMs registering a tracking URI should not be proxy-fied. (tucu) (tucu: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1521039)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/RMAppAttemptImpl.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/TestRMAppAttemptImpl.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/TestRMAppAttemptTransitions.java

Unmanaged AMs registering a tracking URI should not be proxy-fied
------------------------------------------------------------------
Key: YARN-1144
URL: https://issues.apache.org/jira/browse/YARN-1144
Project: Hadoop YARN
Issue Type: Bug
Components: resourcemanager
Affects Versions: 2.1.0-beta
Reporter: Alejandro Abdelnur
Assignee: Alejandro Abdelnur
Priority: Critical
Fix For: 2.1.1-beta
Attachments: YARN-1144.patch, YARN-1144.patch, YARN-1144.patch

Unmanaged AMs do not run in the cluster, so their tracking URL should not be proxy-fied.
[jira] [Created] (YARN-1172) Convert *SecretManagers in the RM to services
Karthik Kambatla created YARN-1172:
-----------------------------------
Summary: Convert *SecretManagers in the RM to services
Key: YARN-1172
URL: https://issues.apache.org/jira/browse/YARN-1172
Project: Hadoop YARN
Issue Type: Sub-task
Components: resourcemanager
Affects Versions: 2.1.0-beta
Reporter: Karthik Kambatla
Assignee: Karthik Kambatla
[jira] [Updated] (YARN-1172) Convert *SecretManagers in the RM to services
[ https://issues.apache.org/jira/browse/YARN-1172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karthik Kambatla updated YARN-1172:
-----------------------------------
Parent Issue: YARN-1139 (was: YARN-149)

Convert *SecretManagers in the RM to services
----------------------------------------------
Key: YARN-1172
URL: https://issues.apache.org/jira/browse/YARN-1172
Project: Hadoop YARN
Issue Type: Sub-task
Components: resourcemanager
Affects Versions: 2.1.0-beta
Reporter: Karthik Kambatla
Assignee: Karthik Kambatla
[jira] [Commented] (YARN-1144) Unmanaged AMs registering a tracking URI should not be proxy-fied
[ https://issues.apache.org/jira/browse/YARN-1144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13761835#comment-13761835 ]

Hudson commented on YARN-1144:
------------------------------
SUCCESS: Integrated in Hadoop-Hdfs-trunk #1517 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1517/])
YARN-1144. Unmanaged AMs registering a tracking URI should not be proxy-fied. (tucu) (tucu: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1521039)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/RMAppAttemptImpl.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/TestRMAppAttemptImpl.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/TestRMAppAttemptTransitions.java

Unmanaged AMs registering a tracking URI should not be proxy-fied
------------------------------------------------------------------
Key: YARN-1144
URL: https://issues.apache.org/jira/browse/YARN-1144
Project: Hadoop YARN
Issue Type: Bug
Components: resourcemanager
Affects Versions: 2.1.0-beta
Reporter: Alejandro Abdelnur
Assignee: Alejandro Abdelnur
Priority: Critical
Fix For: 2.1.1-beta
Attachments: YARN-1144.patch, YARN-1144.patch, YARN-1144.patch

Unmanaged AMs do not run in the cluster, so their tracking URL should not be proxy-fied.
[jira] [Commented] (YARN-1049) ContainerExitStatus should define a status for preempted containers
[ https://issues.apache.org/jira/browse/YARN-1049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13761832#comment-13761832 ]

Hudson commented on YARN-1049:
------------------------------
SUCCESS: Integrated in Hadoop-Hdfs-trunk #1517 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1517/])
YARN-1049. ContainerExitStatus should define a status for preempted containers. (tucu) (tucu: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1521036)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/records/ContainerExitStatus.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/SchedulerUtils.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/TestSchedulerUtils.java

ContainerExitStatus should define a status for preempted containers
--------------------------------------------------------------------
Key: YARN-1049
URL: https://issues.apache.org/jira/browse/YARN-1049
Project: Hadoop YARN
Issue Type: Bug
Components: api
Affects Versions: 2.1.0-beta
Reporter: Alejandro Abdelnur
Assignee: Alejandro Abdelnur
Priority: Blocker
Fix For: 2.1.1-beta
Attachments: YARN-1049.patch

With the current behavior it is impossible to determine whether a container has been preempted or lost due to an NM crash. Adding a PREEMPTED exit status (-102) will help an AM determine that a container has been preempted. Note the change of scope from the original summary/description: the original scope proposed API/behavior changes. Because we are past 2.1.0-beta, I'm reducing the scope of this JIRA.
[jira] [Commented] (YARN-1049) ContainerExitStatus should define a status for preempted containers
[ https://issues.apache.org/jira/browse/YARN-1049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13761866#comment-13761866 ]

Hudson commented on YARN-1049:
------------------------------
FAILURE: Integrated in Hadoop-Mapreduce-trunk #1543 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1543/])
YARN-1049. ContainerExitStatus should define a status for preempted containers. (tucu) (tucu: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1521036)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/records/ContainerExitStatus.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/SchedulerUtils.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/TestSchedulerUtils.java

ContainerExitStatus should define a status for preempted containers
--------------------------------------------------------------------
Key: YARN-1049
URL: https://issues.apache.org/jira/browse/YARN-1049
Project: Hadoop YARN
Issue Type: Bug
Components: api
Affects Versions: 2.1.0-beta
Reporter: Alejandro Abdelnur
Assignee: Alejandro Abdelnur
Priority: Blocker
Fix For: 2.1.1-beta
Attachments: YARN-1049.patch

With the current behavior it is impossible to determine whether a container has been preempted or lost due to an NM crash. Adding a PREEMPTED exit status (-102) will help an AM determine that a container has been preempted. Note the change of scope from the original summary/description: the original scope proposed API/behavior changes. Because we are past 2.1.0-beta, I'm reducing the scope of this JIRA.
[jira] [Commented] (YARN-1144) Unmanaged AMs registering a tracking URI should not be proxy-fied
[ https://issues.apache.org/jira/browse/YARN-1144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13761869#comment-13761869 ]

Hudson commented on YARN-1144:
------------------------------
FAILURE: Integrated in Hadoop-Mapreduce-trunk #1543 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1543/])
YARN-1144. Unmanaged AMs registering a tracking URI should not be proxy-fied. (tucu) (tucu: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1521039)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/RMAppAttemptImpl.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/TestRMAppAttemptImpl.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/TestRMAppAttemptTransitions.java

Unmanaged AMs registering a tracking URI should not be proxy-fied
------------------------------------------------------------------
Key: YARN-1144
URL: https://issues.apache.org/jira/browse/YARN-1144
Project: Hadoop YARN
Issue Type: Bug
Components: resourcemanager
Affects Versions: 2.1.0-beta
Reporter: Alejandro Abdelnur
Assignee: Alejandro Abdelnur
Priority: Critical
Fix For: 2.1.1-beta
Attachments: YARN-1144.patch, YARN-1144.patch, YARN-1144.patch

Unmanaged AMs do not run in the cluster, so their tracking URL should not be proxy-fied.
[jira] [Updated] (YARN-1163) Cleanup code for AssignMapsWithLocality() in RMContainerAllocator
[ https://issues.apache.org/jira/browse/YARN-1163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Junping Du updated YARN-1163:
-----------------------------
Attachment: YARN-1163-v1.patch

Attaching the first patch.

Cleanup code for AssignMapsWithLocality() in RMContainerAllocator
-----------------------------------------------------------------
Key: YARN-1163
URL: https://issues.apache.org/jira/browse/YARN-1163
Project: Hadoop YARN
Issue Type: Sub-task
Components: applications
Reporter: Junping Du
Assignee: Junping Du
Priority: Minor
Attachments: YARN-1163-v1.patch

In RMContainerAllocator, AssignMapsWithLocality() is an important method that assigns map tasks to allocated containers while honoring different levels of locality (data-local, rack-local, etc.). However, it currently mixes separate code paths for the different locality types even though they share much of the same behaviour. That makes the method hard to maintain and hard to extend with new locality types, so the code needs to be cleaned up.
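One plausible shape for such a cleanup, sketched independently of the attached patch: factor the per-locality branches into a single loop over locality levels, so adding a new level does not add another copy of the assignment logic. All identifiers below are illustrative, not the names used in YARN-1163-v1.patch:

{code:java}
import java.util.LinkedList;
import java.util.List;

enum Locality { NODE_LOCAL, RACK_LOCAL, OFF_SWITCH }

class LocalityAssigner {
  // Returns the tasks that could run at the given locality on this host.
  List<String> candidatesFor(String host, Locality level) {
    return new LinkedList<String>(); // lookup elided in this sketch
  }

  // One generic routine replaces the near-duplicate dataLocal/rackLocal
  // blocks: try each locality level in order of preference.
  String assign(String host) {
    for (Locality level : Locality.values()) {
      List<String> candidates = candidatesFor(host, level);
      if (!candidates.isEmpty()) {
        return candidates.remove(0);
      }
    }
    return null; // nothing assignable on this container
  }
}
{code}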
[jira] [Commented] (YARN-1163) Cleanup code for AssignMapsWithLocality() in RMContainerAllocator
[ https://issues.apache.org/jira/browse/YARN-1163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13761941#comment-13761941 ]

Hadoop QA commented on YARN-1163:
---------------------------------
{color:green}+1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12602151/YARN-1163-v1.patch
against trunk revision .

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages.
{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:green}+1 core tests{color}. The patch passed unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app.
{color:green}+1 contrib tests{color}. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-YARN-Build/1879//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/1879//console

This message is automatically generated.

Cleanup code for AssignMapsWithLocality() in RMContainerAllocator
-----------------------------------------------------------------
Key: YARN-1163
URL: https://issues.apache.org/jira/browse/YARN-1163
Project: Hadoop YARN
Issue Type: Sub-task
Components: applications
Reporter: Junping Du
Assignee: Junping Du
Priority: Minor
Attachments: YARN-1163-v1.patch

In RMContainerAllocator, AssignMapsWithLocality() is an important method that assigns map tasks to allocated containers while honoring different levels of locality (data-local, rack-local, etc.). However, it currently mixes separate code paths for the different locality types even though they share much of the same behaviour. That makes the method hard to maintain and hard to extend with new locality types, so the code needs to be cleaned up.
[jira] [Updated] (YARN-1152) Invalid key to HMAC computation error when getting application report for completed app attempt
[ https://issues.apache.org/jira/browse/YARN-1152?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jason Lowe updated YARN-1152:
-----------------------------
Attachment: YARN-1152-2.txt

Thanks for the review, Vinod.

bq. We had a UserGroupInformation.isSecurityEnabled() check before.

Yeah, I pulled it out because Daryn routinely scolds me to remove them when I can. ;-) I'll put it back in. Hopefully we can get YARN-1108 completed soon so we can remove those checks for these tokens across the board.

Attaching a patch that should address your review comments.

Invalid key to HMAC computation error when getting application report for completed app attempt
-------------------------------------------------------------------------------------------------
Key: YARN-1152
URL: https://issues.apache.org/jira/browse/YARN-1152
Project: Hadoop YARN
Issue Type: Bug
Components: resourcemanager
Affects Versions: 2.1.1-beta
Reporter: Jason Lowe
Assignee: Jason Lowe
Priority: Blocker
Attachments: YARN-1152-2.txt, YARN-1152.txt

On a secure cluster, an invalid key to HMAC error is thrown when trying to get an application report for an application with an attempt that has unregistered.
[jira] [Commented] (YARN-1098) Separate out RM services into Always On and Active
[ https://issues.apache.org/jira/browse/YARN-1098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13762163#comment-13762163 ]

Hadoop QA commented on YARN-1098:
---------------------------------
{color:red}-1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12602182/yarn-1098-5.patch
against trunk revision .

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages.
{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.
{color:green}+1 contrib tests{color}. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-YARN-Build/1881//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/1881//console

This message is automatically generated.

Separate out RM services into Always On and Active
---------------------------------------------------
Key: YARN-1098
URL: https://issues.apache.org/jira/browse/YARN-1098
Project: Hadoop YARN
Issue Type: Sub-task
Components: resourcemanager
Affects Versions: 2.1.0-beta
Reporter: Karthik Kambatla
Assignee: Karthik Kambatla
Labels: ha
Attachments: yarn-1098-1.patch, yarn-1098-2.patch, yarn-1098-3.patch, yarn-1098-4.patch, yarn-1098-5.patch, yarn-1098-approach.patch, yarn-1098-approach.patch

From the discussion on YARN-1027, it makes sense to separate out services that are stateful and stateless. The stateless services can run perennially irrespective of whether the RM is in Active/Standby state, while the stateful services need to be started on transitionToActive() and completely shut down on transitionToStandby(). The external-facing stateless services should respond to client/AM/NM requests depending on whether the RM is Active/Standby.
[jira] [Created] (YARN-1173) Run -shell_command echo Hello has empty stdout
Tassapol Athiapinya created YARN-1173:
--------------------------------------
Summary: Run -shell_command echo Hello has empty stdout
Key: YARN-1173
URL: https://issues.apache.org/jira/browse/YARN-1173
Project: Hadoop YARN
Issue Type: Bug
Components: applications/distributed-shell
Reporter: Tassapol Athiapinya
Fix For: 2.1.1-beta

Run:
$ /usr/bin/yarn org.apache.hadoop.yarn.applications.distributedshell.Client -jar /usr/lib/hadoop-yarn/hadoop-yarn-applications-distributedshell-*.jar -shell_command echo Hello

Get logs with YARN logs:
{panel}
-bash-4.1$ yarn logs -applicationId application_1378424977532_0071
Container: container_1378424977532_0071_01_02 on myhost
===
LogType: stderr
LogLength: 0
Log Contents:

LogType: stdout
LogLength: 1
Log Contents:
{panel}
[jira] [Commented] (YARN-221) NM should provide a way for AM to tell it not to aggregate logs.
[ https://issues.apache.org/jira/browse/YARN-221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13762183#comment-13762183 ]

Chris Trezzo commented on YARN-221:
-----------------------------------
I have started looking at this and will hopefully have a patch in the next few days. Would someone mind adding me as a contributor so I can assign the JIRA to myself? Thanks!

NM should provide a way for AM to tell it not to aggregate logs.
-----------------------------------------------------------------
Key: YARN-221
URL: https://issues.apache.org/jira/browse/YARN-221
Project: Hadoop YARN
Issue Type: Sub-task
Components: nodemanager
Affects Versions: 0.23.4
Reporter: Robert Joseph Evans

The NodeManager should provide a way for an AM to tell it that the logs should not be aggregated, that they should be aggregated with a high priority, or that they should be aggregated but with a lower priority. The AM should be able to do this in the ContainerLaunch context to provide a default value, but should also be able to update the value when the container is released. This would allow the NM to skip aggregating logs in some cases and avoid connecting to the NN at all.
[jira] [Commented] (YARN-609) Fix synchronization issues in APIs which take in lists
[ https://issues.apache.org/jira/browse/YARN-609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1376#comment-1376 ]

Hadoop QA commented on YARN-609:
--------------------------------
{color:red}-1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12602198/YARN-609.9.patch
against trunk revision .

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages.
{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager.
{color:green}+1 contrib tests{color}. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-YARN-Build/1882//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/1882//console

This message is automatically generated.

Fix synchronization issues in APIs which take in lists
-------------------------------------------------------
Key: YARN-609
URL: https://issues.apache.org/jira/browse/YARN-609
Project: Hadoop YARN
Issue Type: Bug
Reporter: Vinod Kumar Vavilapalli
Assignee: Xuan Gong
Attachments: YARN-609.1.patch, YARN-609.2.patch, YARN-609.3.patch, YARN-609.4.patch, YARN-609.5.patch, YARN-609.6.patch, YARN-609.7.patch, YARN-609.8.patch, YARN-609.9.patch

Some of the APIs take in lists and the setter-APIs don't always do proper synchronization. We need to fix these.
[jira] [Updated] (YARN-305) Too many 'Node offered to app:' messages in RM
[ https://issues.apache.org/jira/browse/YARN-305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lohit Vijayarenu updated YARN-305:
----------------------------------
Attachment: YARN-305.2.patch

Sorry, somehow missed the review comments in email. These are the only log messages that seem to fill up the RM output as of now.

Too many 'Node offered to app:' messages in RM
-----------------------------------------------
Key: YARN-305
URL: https://issues.apache.org/jira/browse/YARN-305
Project: Hadoop YARN
Issue Type: Bug
Components: resourcemanager
Reporter: Lohit Vijayarenu
Assignee: Lohit Vijayarenu
Priority: Minor
Attachments: YARN-305.1.patch, YARN-305.2.patch

Running the fair scheduler shows that the RM log fills with messages like the one below.
{noformat}
INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AppSchedulable: Node offered to app: application_1357147147433_0002 reserved: false
{noformat}
They don't seem to say much, and the same line is dumped many times in the RM log. It would be good to improve the message with node information or move it to another logging level with enough debug information.
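The usual shape of such a fix, shown here as a hedged sketch rather than the contents of YARN-305.2.patch: demote the line to DEBUG behind a guard and enrich it with the node, so it stays useful when enabled:

{code:java}
import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;

public class OfferLogging {
  private static final Log LOG = LogFactory.getLog(OfferLogging.class);

  // Guarded DEBUG logging: the string concatenation is skipped entirely
  // unless debug logging is turned on for this class.
  static void logOffer(String node, String appId, boolean reserved) {
    if (LOG.isDebugEnabled()) {
      LOG.debug("Node " + node + " offered to app: " + appId
          + " reserved: " + reserved);
    }
  }
}
{code}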
[jira] [Updated] (YARN-1098) Separate out RM services into Always On and Active
[ https://issues.apache.org/jira/browse/YARN-1098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karthik Kambatla updated YARN-1098:
-----------------------------------
Attachment: yarn-1098-5.patch

Rebased on trunk. Mimicked the changes from YARN-1107, and verified that TestRMRestart and TestDelegationTokenRenewer pass.

Separate out RM services into Always On and Active
---------------------------------------------------
Key: YARN-1098
URL: https://issues.apache.org/jira/browse/YARN-1098
Project: Hadoop YARN
Issue Type: Sub-task
Components: resourcemanager
Affects Versions: 2.1.0-beta
Reporter: Karthik Kambatla
Assignee: Karthik Kambatla
Labels: ha
Attachments: yarn-1098-1.patch, yarn-1098-2.patch, yarn-1098-3.patch, yarn-1098-4.patch, yarn-1098-5.patch, yarn-1098-approach.patch, yarn-1098-approach.patch

From the discussion on YARN-1027, it makes sense to separate out services that are stateful and stateless. The stateless services can run perennially irrespective of whether the RM is in Active/Standby state, while the stateful services need to be started on transitionToActive() and completely shut down on transitionToStandby(). The external-facing stateless services should respond to client/AM/NM requests depending on whether the RM is Active/Standby.
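A minimal sketch of the separation being discussed (class and method names are assumptions, not necessarily what the patch does): Always On services stay in the RM's own service list, while Active services are grouped into a child CompositeService that is created on transitionToActive() and torn down on transitionToStandby():

{code:java}
import org.apache.hadoop.service.CompositeService;

public class RMServiceSplitSketch extends CompositeService {
  private CompositeService activeServices;

  public RMServiceSplitSketch() {
    super("RMServiceSplitSketch");
    // Always On services would be registered here via addService(), so
    // they run regardless of HA state.
  }

  void becomeActive() throws Exception {
    activeServices = new CompositeService("RMActiveServices");
    // Stateful services (scheduler, ApplicationMasterService, ...) would
    // be added to activeServices before init.
    activeServices.init(getConfig());
    activeServices.start();
  }

  void becomeStandby() {
    if (activeServices != null) {
      activeServices.stop(); // completely shut down the stateful services
      activeServices = null;
    }
  }
}
{code}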
[jira] [Commented] (YARN-1098) Separate out RM services into Always On and Active
[ https://issues.apache.org/jira/browse/YARN-1098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13762136#comment-13762136 ]

Karthik Kambatla commented on YARN-1098:
----------------------------------------
bq. Unrelated. Why are the secret managers not services themselves? If they were, we wouldn't have to handle them separately in start and stop.

Created YARN-1172 to address this.

Separate out RM services into Always On and Active
---------------------------------------------------
Key: YARN-1098
URL: https://issues.apache.org/jira/browse/YARN-1098
Project: Hadoop YARN
Issue Type: Sub-task
Components: resourcemanager
Affects Versions: 2.1.0-beta
Reporter: Karthik Kambatla
Assignee: Karthik Kambatla
Labels: ha
Attachments: yarn-1098-1.patch, yarn-1098-2.patch, yarn-1098-3.patch, yarn-1098-4.patch, yarn-1098-5.patch, yarn-1098-approach.patch, yarn-1098-approach.patch

From the discussion on YARN-1027, it makes sense to separate out services that are stateful and stateless. The stateless services can run perennially irrespective of whether the RM is in Active/Standby state, while the stateful services need to be started on transitionToActive() and completely shut down on transitionToStandby(). The external-facing stateless services should respond to client/AM/NM requests depending on whether the RM is Active/Standby.
[jira] [Updated] (YARN-1166) YARN 'appsFailed' metric should be of type 'counter'
[ https://issues.apache.org/jira/browse/YARN-1166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Akira AJISAKA updated YARN-1166:
--------------------------------
Attachment: YARN-1166.patch

I agree with the original request. I have attached a patch that makes the metric a 'counter'. With the patch, 'AppsFailed' is no longer decremented when an app is resubmitted in subsequent attempts.

YARN 'appsFailed' metric should be of type 'counter'
-----------------------------------------------------
Key: YARN-1166
URL: https://issues.apache.org/jira/browse/YARN-1166
Project: Hadoop YARN
Issue Type: Bug
Components: resourcemanager
Affects Versions: 2.1.0-beta
Reporter: Srimanth Gunturi
Attachments: YARN-1166.patch

Currently in YARN's queue metrics, the cumulative metric 'appsFailed' is of type 'gauge', which means the exact value will be reported. All other cumulative queue metrics (AppsSubmitted, AppsCompleted, AppsKilled) are of type 'counter', meaning Ganglia will use slope to provide deltas between time-points. To be consistent, the AppsFailed metric should also be of type 'counter'.
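In Hadoop's metrics2 library the distinction comes down to the mutable metric type the field uses; a small sketch (field and class names are illustrative, and registration with the metrics system is elided):

{code:java}
import org.apache.hadoop.metrics2.annotation.Metric;
import org.apache.hadoop.metrics2.annotation.Metrics;
import org.apache.hadoop.metrics2.lib.MutableCounterInt;
import org.apache.hadoop.metrics2.lib.MutableGaugeInt;

@Metrics(context = "yarn")
class QueueMetricsSketch {
  // A counter only ever increments, matching AppsSubmitted/Completed/Killed.
  @Metric("# of apps failed") MutableCounterInt appsFailed;
  // A gauge can also decrement, which is how a resubmitted attempt could
  // previously lower the reported AppsFailed value.
  @Metric("# of apps running") MutableGaugeInt appsRunning;

  void onAppFailed()   { appsFailed.incr(); }  // MutableCounterInt has no decr()
  void onAppStarted()  { appsRunning.incr(); }
  void onAppFinished() { appsRunning.decr(); } // gauges can go down
}
{code}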
[jira] [Commented] (YARN-1174) Accessing task page for running job throws 500 error code
[ https://issues.apache.org/jira/browse/YARN-1174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13762259#comment-13762259 ]

Paul Han commented on YARN-1174:
--------------------------------
Patch is available. Will upload for review soon.

Accessing task page for running job throws 500 error code
-----------------------------------------------------------
Key: YARN-1174
URL: https://issues.apache.org/jira/browse/YARN-1174
Project: Hadoop YARN
Issue Type: Bug
Affects Versions: 2.0.5-alpha
Reporter: Paul Han

For running jobs on Hadoop 2.0, trying to access the Task counters page throws a server 500 error. Digging a bit, I see this exception in the MRAppMaster logs:
{noformat}
2013-08-09 21:54:35,083 ERROR [556661283@qtp-875702288-23] org.apache.hadoop.yarn.webapp.Dispatcher: error handling URI: /mapreduce/task/task_1376081364308_0002_m_01
java.lang.reflect.InvocationTargetException
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:606)
	at org.apache.hadoop.yarn.webapp.Dispatcher.service(Dispatcher.java:150)
	at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
	at com.google.inject.servlet.ServletDefinition.doService(ServletDefinition.java:263)
	at com.google.inject.servlet.ServletDefinition.service(ServletDefinition.java:178)
	at com.google.inject.servlet.ManagedServletPipeline.service(ManagedServletPipeline.java:91)
	at com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:62)
	at com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:900)
	at com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:834)
	at com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:795)
	at com.google.inject.servlet.FilterDefinition.doFilter(FilterDefinition.java:163)
	at com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:58)
	at com.google.inject.servlet.ManagedFilterPipeline.dispatch(ManagedFilterPipeline.java:118)
	at com.google.inject.servlet.GuiceFilter.doFilter(GuiceFilter.java:113)
	at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
	at org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter.doFilter(AmIpFilter.java:123)
	at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
	at org.apache.hadoop.http.HttpServer$QuotingInputFilter.doFilter(HttpServer.java:1069)
	at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
	at org.apache.hadoop.http.NoCacheFilter.doFilter(NoCacheFilter.java:45)
	at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
	at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
	at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
	at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
	at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
	at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450)
	at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
	at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
	at org.mortbay.jetty.Server.handle(Server.java:326)
	at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
	at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:928)
	at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:549)
	at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)
	at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
	at org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:410)
	at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)
Caused by: org.apache.hadoop.yarn.webapp.WebAppException: Error rendering block: nestLevel=6 expected 5
	at org.apache.hadoop.yarn.webapp.view.HtmlBlock.render(HtmlBlock.java:66)
	at org.apache.hadoop.yarn.webapp.view.HtmlBlock.renderPartial(HtmlBlock.java:74)
	at org.apache.hadoop.yarn.webapp.View.render(View.java:233)
	at org.apache.hadoop.yarn.webapp.view.HtmlPage$Page.subView(HtmlPage.java:47)
	at org.apache.hadoop.yarn.webapp.hamlet.HamletImpl$EImp._v(HamletImpl.java:117)
	at
{noformat}
[jira] [Commented] (YARN-1152) Invalid key to HMAC computation error when getting application report for completed app attempt
[ https://issues.apache.org/jira/browse/YARN-1152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13762310#comment-13762310 ]

Vinod Kumar Vavilapalli commented on YARN-1152:
-----------------------------------------------
+1, looks good. Checking this in.

Invalid key to HMAC computation error when getting application report for completed app attempt
-------------------------------------------------------------------------------------------------
Key: YARN-1152
URL: https://issues.apache.org/jira/browse/YARN-1152
Project: Hadoop YARN
Issue Type: Bug
Components: resourcemanager
Affects Versions: 2.1.1-beta
Reporter: Jason Lowe
Assignee: Jason Lowe
Priority: Blocker
Attachments: YARN-1152-2.txt, YARN-1152.txt

On a secure cluster, an invalid key to HMAC error is thrown when trying to get an application report for an application with an attempt that has unregistered.
[jira] [Commented] (YARN-540) Race condition causing RM to potentially relaunch already unregistered AMs on RM restart
[ https://issues.apache.org/jira/browse/YARN-540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13762268#comment-13762268 ]

Bikas Saha commented on YARN-540:
---------------------------------
getIsUnregistered() instead of getUnregistered()? Better grammar/more clear, e.g. "is expected to retry until this flag becomes true".
{code}
+ * Note: This flag only needed for RM recovery purpose. If RM recovery is
+ * enabled, user is expected to retry this flag until it becomes true.
{code}

Why is the default false?
{code}
+ optional bool unregistered = 1 [default = false];
{code}

Very large value of sleep. 100ms? The log should be before the sleep.
{code}
+ while (true) {
+   FinishApplicationMasterResponse response =
+       rmClient.finishApplicationMaster(request);
+   if (response.getUnregistered()) {
+     break;
+   }
+   Thread.sleep(1000);
+   LOG.info("Waiting for application to be successfully unregistered.");
+ }
{code}

Instead of checking for an exact state, I think we should check for all terminal states of an RMApp. This will make the code more resilient to future changes in the state machines. So we check for FINISHING, FINISHED, FAILED, KILLED. This will also allow us to not special-case the unmanaged AM in the latter half of the same function. Also, this is open to race conditions, e.g. someone kills the app before the app is removed from the store. We should probably make this an RMApp method like RMApp.isAppRemovedFromStore(). In this method we can either check the state or some boolean that we set when the App_Removed event comes.
{code}
+ // Application state has been removed from RMStateStore, if it's in
+ // FINISHING state
+ if (rmContext.getRMApps().get(applicationAttemptId.getApplicationId())
+     .getState().equals(RMAppState.FINISHING)) {
+   return FinishApplicationMasterResponse.newInstance(true);
+ }
{code}

Is there a version of delete that will not fail if the file does not exist? Or we can have a boolean in RMApp to show that the removal request has already been sent and not send it multiple times. Let's try to avoid 2 remote HDFS calls in the common case.
{code}
+ if (!fs.exists(deletePath))
+   return;
  if (!fs.delete(deletePath, true)) {
{code}

Race condition causing RM to potentially relaunch already unregistered AMs on RM restart
------------------------------------------------------------------------------------------
Key: YARN-540
URL: https://issues.apache.org/jira/browse/YARN-540
Project: Hadoop YARN
Issue Type: Sub-task
Components: resourcemanager
Reporter: Jian He
Assignee: Jian He
Attachments: YARN-540.1.patch, YARN-540.2.patch, YARN-540.3.patch, YARN-540.4.patch, YARN-540.5.patch, YARN-540.patch, YARN-540.patch

When a job succeeds and successfully calls finishApplicationMaster, and the RM is shut down and restarted, the dispatcher is stopped before it can process the REMOVE_APP event. The next time the RM comes back, it will reload the existing state files even though the job succeeded.
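For concreteness, here is the reviewed loop with Bikas's suggestions applied: the renamed getter, the log moved before the sleep, and a 100 ms interval. This is a sketch of the review feedback, not the committed patch:

{code:java}
import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;
import org.apache.hadoop.yarn.api.ApplicationMasterProtocol;
import org.apache.hadoop.yarn.api.protocolrecords.FinishApplicationMasterRequest;
import org.apache.hadoop.yarn.api.protocolrecords.FinishApplicationMasterResponse;

public class UnregisterLoop {
  private static final Log LOG = LogFactory.getLog(UnregisterLoop.class);

  // Retry finishApplicationMaster until the RM confirms the app has been
  // removed from the state store, so a restarted RM cannot relaunch it.
  static void unregister(ApplicationMasterProtocol rmClient,
      FinishApplicationMasterRequest request) throws Exception {
    while (true) {
      FinishApplicationMasterResponse response =
          rmClient.finishApplicationMaster(request);
      if (response.getIsUnregistered()) { // renamed per the review
        break;
      }
      LOG.info("Waiting for application to be successfully unregistered.");
      Thread.sleep(100); // much shorter than the original 1000 ms
    }
  }
}
{code}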
[jira] [Updated] (YARN-1025) ResourceManager and NodeManager do not load native libraries on Windows.
[ https://issues.apache.org/jira/browse/YARN-1025?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Nauroth updated YARN-1025:
--------------------------------
Attachment: YARN-1025.1.patch

I'm attaching a patch that updates yarn.cmd to set java.library.path, similar to how the shell script works.

ResourceManager and NodeManager do not load native libraries on Windows.
--------------------------------------------------------------------------
Key: YARN-1025
URL: https://issues.apache.org/jira/browse/YARN-1025
Project: Hadoop YARN
Issue Type: Bug
Components: nodemanager, resourcemanager
Affects Versions: 3.0.0, 2.1.1-beta
Reporter: Chris Nauroth
Attachments: YARN-1025.1.patch

ResourceManager and NodeManager do not have the correct setting for java.library.path when launched on Windows. This prevents the processes from loading native code from hadoop.dll. The native code is required for correct functioning on Windows (not optional), so this ultimately can cause failures.
[jira] [Created] (YARN-1176) RM web services ClusterMetricsInfo total nodes doesn't include unhealthy nodes
Thomas Graves created YARN-1176:
--------------------------------
Summary: RM web services ClusterMetricsInfo total nodes doesn't include unhealthy nodes
Key: YARN-1176
URL: https://issues.apache.org/jira/browse/YARN-1176
Project: Hadoop YARN
Issue Type: Bug
Components: resourcemanager
Affects Versions: 0.23.9, 3.0.0, 2.1.1-beta
Reporter: Thomas Graves
Priority: Critical

In the web services API for cluster/metrics, the totalNodes reported doesn't include the unhealthy nodes:
this.totalNodes = activeNodes + lostNodes + decommissionedNodes + rebootedNodes;
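The fix implied by the report is to fold unhealthy nodes into the sum; a one-method sketch using the variable names from the quoted snippet (unhealthyNodes being the value the report says is missing):

{code:java}
public class ClusterMetricsInfoSketch {
  // Total must count every node the RM knows about, including unhealthy.
  static int totalNodes(int activeNodes, int lostNodes,
      int decommissionedNodes, int rebootedNodes, int unhealthyNodes) {
    return activeNodes + lostNodes + decommissionedNodes
        + rebootedNodes + unhealthyNodes;
  }
}
{code}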
[jira] [Created] (YARN-1174) Accessing task page for running job throws 500 error code
Paul Han created YARN-1174:
---------------------------
Summary: Accessing task page for running job throws 500 error code
Key: YARN-1174
URL: https://issues.apache.org/jira/browse/YARN-1174
Project: Hadoop YARN
Issue Type: Bug
Affects Versions: 2.0.5-alpha
Reporter: Paul Han

For running jobs on Hadoop 2.0, trying to access the Task counters page throws a server 500 error. Digging a bit, I see this exception in the MRAppMaster logs:
{noformat}
2013-08-09 21:54:35,083 ERROR [556661283@qtp-875702288-23] org.apache.hadoop.yarn.webapp.Dispatcher: error handling URI: /mapreduce/task/task_1376081364308_0002_m_01
java.lang.reflect.InvocationTargetException
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:606)
	at org.apache.hadoop.yarn.webapp.Dispatcher.service(Dispatcher.java:150)
	at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
	at com.google.inject.servlet.ServletDefinition.doService(ServletDefinition.java:263)
	at com.google.inject.servlet.ServletDefinition.service(ServletDefinition.java:178)
	at com.google.inject.servlet.ManagedServletPipeline.service(ManagedServletPipeline.java:91)
	at com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:62)
	at com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:900)
	at com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:834)
	at com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:795)
	at com.google.inject.servlet.FilterDefinition.doFilter(FilterDefinition.java:163)
	at com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:58)
	at com.google.inject.servlet.ManagedFilterPipeline.dispatch(ManagedFilterPipeline.java:118)
	at com.google.inject.servlet.GuiceFilter.doFilter(GuiceFilter.java:113)
	at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
	at org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter.doFilter(AmIpFilter.java:123)
	at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
	at org.apache.hadoop.http.HttpServer$QuotingInputFilter.doFilter(HttpServer.java:1069)
	at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
	at org.apache.hadoop.http.NoCacheFilter.doFilter(NoCacheFilter.java:45)
	at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
	at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
	at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
	at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
	at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
	at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450)
	at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
	at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
	at org.mortbay.jetty.Server.handle(Server.java:326)
	at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
	at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:928)
	at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:549)
	at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)
	at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
	at org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:410)
	at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)
Caused by: org.apache.hadoop.yarn.webapp.WebAppException: Error rendering block: nestLevel=6 expected 5
	at org.apache.hadoop.yarn.webapp.view.HtmlBlock.render(HtmlBlock.java:66)
	at org.apache.hadoop.yarn.webapp.view.HtmlBlock.renderPartial(HtmlBlock.java:74)
	at org.apache.hadoop.yarn.webapp.View.render(View.java:233)
	at org.apache.hadoop.yarn.webapp.view.HtmlPage$Page.subView(HtmlPage.java:47)
	at org.apache.hadoop.yarn.webapp.hamlet.HamletImpl$EImp._v(HamletImpl.java:117)
	at org.apache.hadoop.yarn.webapp.hamlet.Hamlet$TD._(Hamlet.java:843)
	at org.apache.hadoop.yarn.webapp.view.TwoColumnLayout.render(TwoColumnLayout.java:54)
	at org.apache.hadoop.yarn.webapp.view.HtmlPage.render(HtmlPage.java:80)
	at org.apache.hadoop.yarn.webapp.Controller.render(Controller.java:210)
	at
{noformat}
[jira] [Commented] (YARN-910) Allow auxiliary services to listen for container starts and completions
[ https://issues.apache.org/jira/browse/YARN-910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13762337#comment-13762337 ]

Vinod Kumar Vavilapalli commented on YARN-910:
----------------------------------------------
Okay, looks good. Checking this in.

Allow auxiliary services to listen for container starts and completions
-------------------------------------------------------------------------
Key: YARN-910
URL: https://issues.apache.org/jira/browse/YARN-910
Project: Hadoop YARN
Issue Type: Improvement
Components: nodemanager
Affects Versions: 2.1.0-beta
Reporter: Sandy Ryza
Assignee: Alejandro Abdelnur
Attachments: YARN-910.patch, YARN-910.patch, YARN-910.patch, YARN-910.patch

Making container start and completion events available to auxiliary services would allow them to be resource-aware. The auxiliary service would be able to notify a co-located service that is opportunistically using free capacity of allocation changes.
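A sketch of an auxiliary service using the new per-container hooks (the hook and context class names follow the change as described; treat them as indicative rather than a reference implementation):

{code:java}
import java.nio.ByteBuffer;
import org.apache.hadoop.yarn.server.api.ApplicationInitializationContext;
import org.apache.hadoop.yarn.server.api.ApplicationTerminationContext;
import org.apache.hadoop.yarn.server.api.AuxiliaryService;
import org.apache.hadoop.yarn.server.api.ContainerInitializationContext;
import org.apache.hadoop.yarn.server.api.ContainerTerminationContext;

public class ResourceAwareAuxService extends AuxiliaryService {
  public ResourceAwareAuxService() {
    super("resource-aware-aux");
  }

  @Override
  public void initializeContainer(ContainerInitializationContext ctx) {
    // A container started on this node: a co-located service could shrink
    // its opportunistic resource usage here.
  }

  @Override
  public void stopContainer(ContainerTerminationContext ctx) {
    // The container finished: that capacity is free again.
  }

  // Required AuxiliaryService methods, stubbed for the sketch.
  @Override
  public void initializeApplication(ApplicationInitializationContext ctx) {}
  @Override
  public void stopApplication(ApplicationTerminationContext ctx) {}
  @Override
  public ByteBuffer getMetaData() {
    return ByteBuffer.allocate(0);
  }
}
{code}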
[jira] [Created] (YARN-1177) Support automatic failover using ZKFC
Karthik Kambatla created YARN-1177:
-----------------------------------
Summary: Support automatic failover using ZKFC
Key: YARN-1177
URL: https://issues.apache.org/jira/browse/YARN-1177
Project: Hadoop YARN
Issue Type: Sub-task
Components: resourcemanager
Reporter: Karthik Kambatla
Assignee: Karthik Kambatla

Prior to embedding leader election and failover controller in the RM (YARN-1029), it might be a good idea to use ZKFC for a first-cut automatic failover implementation.
[jira] [Commented] (YARN-1025) ResourceManager and NodeManager do not load native libraries on Windows.
[ https://issues.apache.org/jira/browse/YARN-1025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13762344#comment-13762344 ]

Hadoop QA commented on YARN-1025:
---------------------------------
{color:red}-1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12602212/YARN-1025.1.patch
against trunk revision .

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages.
{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:green}+1 core tests{color}. The patch passed unit tests in .
{color:green}+1 contrib tests{color}. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-YARN-Build/1885//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/1885//console

This message is automatically generated.

ResourceManager and NodeManager do not load native libraries on Windows.
--------------------------------------------------------------------------
Key: YARN-1025
URL: https://issues.apache.org/jira/browse/YARN-1025
Project: Hadoop YARN
Issue Type: Bug
Components: nodemanager, resourcemanager
Affects Versions: 3.0.0, 2.1.1-beta
Reporter: Chris Nauroth
Attachments: YARN-1025.1.patch

ResourceManager and NodeManager do not have the correct setting for java.library.path when launched on Windows. This prevents the processes from loading native code from hadoop.dll. The native code is required for correct functioning on Windows (not optional), so this ultimately can cause failures.
[jira] [Commented] (YARN-1175) LogLength shown in $ yarn logs is 1 character longer than actual stdout
[ https://issues.apache.org/jira/browse/YARN-1175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13762370#comment-13762370 ] Jason Lowe commented on YARN-1175: -- I believe there's a trailing newline character in the log which would bring the count to 87. LogLength shown in $ yarn logs is 1 character longer than actual stdout --- Key: YARN-1175 URL: https://issues.apache.org/jira/browse/YARN-1175 Project: Hadoop YARN Issue Type: Bug Components: applications/distributed-shell Reporter: Tassapol Athiapinya Fix For: 2.1.1-beta Run distributed shell with -shell_command pwd. Do $ yarn logs on that application. Count number of characters in Log Contents field. Number of characters will be smaller than LogLength field by one. {code:title=mock-up yarn logs output} $ /usr/bin/yarn logs -applicationId application_1378424977532_0088 ... LogType: stdout LogLength: 87 Log Contents: /mypath/appcache/application_1378424977532_0088/container_1378424977532_0088_01_02 {code} {panel} The length of /mypath/appcache/application_1378424977532_0088/container_1378424977532_0088_01_02 is 86. {panel} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-1025) ResourceManager and NodeManager do not load native libraries on Windows.
[ https://issues.apache.org/jira/browse/YARN-1025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13762369#comment-13762369 ] Chris Nauroth commented on YARN-1025: - {quote} -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {quote} There are no tests, because this is a change in a cmd script only. I tested manually by launching YARN daemons in a command prompt where the PATH did not include an explicit reference to the location of hadoop.dll. I verified that the daemon loaded the native code successfully. ResourceManager and NodeManager do not load native libraries on Windows. Key: YARN-1025 URL: https://issues.apache.org/jira/browse/YARN-1025 Project: Hadoop YARN Issue Type: Bug Components: nodemanager, resourcemanager Affects Versions: 3.0.0, 2.1.1-beta Reporter: Chris Nauroth Attachments: YARN-1025.1.patch ResourceManager and NodeManager do not have the correct setting for java.library.path when launched on Windows. This prevents the processes from loading native code from hadoop.dll. The native code is required for correct functioning on Windows (not optional), so this ultimately can cause failures. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
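A quick way to confirm a fix like this from the Java side is to check what the daemon actually sees at startup; the sketch below is illustrative only (the patch itself touches just the cmd script) and uses the existing org.apache.hadoop.util.NativeCodeLoader API:
{code}
import org.apache.hadoop.util.NativeCodeLoader;

public class NativeLibCheck {
  public static void main(String[] args) {
    // On Windows, hadoop.dll must be reachable via java.library.path
    // for NativeCodeLoader to report true.
    System.out.println("java.library.path = " + System.getProperty("java.library.path"));
    System.out.println("native code loaded = " + NativeCodeLoader.isNativeCodeLoaded());
  }
}
{code}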
[jira] [Commented] (YARN-305) Too many 'Node offerred to app:... messages in RM
[ https://issues.apache.org/jira/browse/YARN-305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13762241#comment-13762241 ] Hadoop QA commented on YARN-305: {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12602204/YARN-305.2.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:red}-1 javac{color}. The patch appears to cause the build to fail. Console output: https://builds.apache.org/job/PreCommit-YARN-Build/1883//console This message is automatically generated. Too many 'Node offerred to app:... messages in RM -- Key: YARN-305 URL: https://issues.apache.org/jira/browse/YARN-305 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Reporter: Lohit Vijayarenu Assignee: Lohit Vijayarenu Priority: Minor Attachments: YARN-305.1.patch, YARN-305.2.patch Running the fair scheduler shows that the RM log has lots of messages like the one below. {noformat} INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AppSchedulable: Node offered to app: application_1357147147433_0002 reserved: false {noformat} They don't seem to tell much, and the same line is dumped many times in the RM log. It would be good to improve it with node information or move it to another logging level with enough debug information. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
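One way the suggested cleanup could look, as a minimal sketch: demote the message to debug level and include the node. The field names (LOG, node, app, reserved) are assumptions, not the actual AppSchedulable members:
{code}
// Hedged sketch of the proposed logging change; names are illustrative.
if (LOG.isDebugEnabled()) {
  LOG.debug("Node " + node.getHostName() + " offered to app: "
      + app.getApplicationId() + " reserved: " + reserved);
}
{code}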
[jira] [Commented] (YARN-1175) LogLength shown in $ yarn logs is 1 character longer than actual stdout
[ https://issues.apache.org/jira/browse/YARN-1175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13762375#comment-13762375 ] Tassapol Athiapinya commented on YARN-1175: --- I understand that point. As an end user, it feels a little bit weird. Looking at a Unix filesystem, when a file is empty, the file has 0 bytes in size. In this case, even if we do -shell_command echo, LogLength would show up as 1. LogLength shown in $ yarn logs is 1 character longer than actual stdout --- Key: YARN-1175 URL: https://issues.apache.org/jira/browse/YARN-1175 Project: Hadoop YARN Issue Type: Bug Components: applications/distributed-shell Reporter: Tassapol Athiapinya Fix For: 2.1.1-beta Run distributed shell with -shell_command pwd. Do $ yarn logs on that application. Count number of characters in Log Contents field. Number of characters will be smaller than LogLength field by one. {code:title=mock-up yarn logs output} $ /usr/bin/yarn logs -applicationId application_1378424977532_0088 ... LogType: stdout LogLength: 87 Log Contents: /mypath/appcache/application_1378424977532_0088/container_1378424977532_0088_01_02 {code} {panel} The length of /mypath/appcache/application_1378424977532_0088/container_1378424977532_0088_01_02 is 86. {panel} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-1175) LogLength shown in $ yarn logs is 1 character longer than actual stdout
[ https://issues.apache.org/jira/browse/YARN-1175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13762379#comment-13762379 ] Tassapol Athiapinya commented on YARN-1175: --- Also, for a distributed shell run of echo, AppMaster.stdout, which does not print out anything, has a LogLength of 0. There is an inconsistency between the app master, which does not call any print command, and the echo stdout, which prints (with no end of line). LogLength shown in $ yarn logs is 1 character longer than actual stdout --- Key: YARN-1175 URL: https://issues.apache.org/jira/browse/YARN-1175 Project: Hadoop YARN Issue Type: Bug Components: applications/distributed-shell Reporter: Tassapol Athiapinya Fix For: 2.1.1-beta Run distributed shell with -shell_command pwd. Do $ yarn logs on that application. Count number of characters in Log Contents field. Number of characters will be smaller than LogLength field by one. {code:title=mock-up yarn logs output} $ /usr/bin/yarn logs -applicationId application_1378424977532_0088 ... LogType: stdout LogLength: 87 Log Contents: /mypath/appcache/application_1378424977532_0088/container_1378424977532_0088_01_02 {code} {panel} The length of /mypath/appcache/application_1378424977532_0088/container_1378424977532_0088_01_02 is 86. {panel} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-910) Allow auxiliary services to listen for container starts and completions
[ https://issues.apache.org/jira/browse/YARN-910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13762356#comment-13762356 ] Alejandro Abdelnur commented on YARN-910: - [~vinodkv], thanks. Any reason not to have this in for 2.1.1-beta? Allow auxiliary services to listen for container starts and completions --- Key: YARN-910 URL: https://issues.apache.org/jira/browse/YARN-910 Project: Hadoop YARN Issue Type: Improvement Components: nodemanager Affects Versions: 2.1.0-beta Reporter: Sandy Ryza Assignee: Alejandro Abdelnur Fix For: 2.3.0 Attachments: YARN-910.patch, YARN-910.patch, YARN-910.patch, YARN-910.patch Making container start and completion events available to auxiliary services would allow them to be resource-aware. The auxiliary service would be able to notify a co-located service that is opportunistically using free capacity of allocation changes. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-910) Allow auxiliary services to listen for container starts and completions
[ https://issues.apache.org/jira/browse/YARN-910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13762359#comment-13762359 ] Hudson commented on YARN-910: - SUCCESS: Integrated in Hadoop-trunk-Commit #4391 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/4391/]) YARN-910. Augmented auxiliary services to listen for container starts and completions in addition to application events. Contributed by Alejandro Abdelnur. (vinodkv: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1521298) * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/server/api/AuxiliaryService.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/server/api/ContainerContext.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/server/api/ContainerInitializationContext.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/server/api/ContainerTerminationContext.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/AuxServices.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/AuxServicesEvent.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/AuxServicesEventType.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/container/ContainerImpl.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/TestAuxServices.java Allow auxiliary services to listen for container starts and completions --- Key: YARN-910 URL: https://issues.apache.org/jira/browse/YARN-910 Project: Hadoop YARN Issue Type: Improvement Components: nodemanager Affects Versions: 2.1.0-beta Reporter: Sandy Ryza Assignee: Alejandro Abdelnur Fix For: 2.3.0 Attachments: YARN-910.patch, YARN-910.patch, YARN-910.patch, YARN-910.patch Making container start and completion events available to auxiliary services would allow them to be resource-aware. The auxiliary service would be able to notify a co-located service that is opportunistically using free capacity of allocation changes. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-1175) LogLength shown in $ yarn logs is 1 character longer than actual stdout
[ https://issues.apache.org/jira/browse/YARN-1175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13762388#comment-13762388 ] Jason Lowe commented on YARN-1175: -- It is consistent with the contents of the log. If nothing is ever printed to a log file then the log file has zero bytes in it and the log length is correctly reported as zero. If a single {{echo}} is used then the log length is correctly reported as 1 since {{echo}} will output a single newline character when given no additional arguments. This is analogous to this situation with a regular shell session: {noformat} $ > /tmp/x $ echo > /tmp/y $ ls -l /tmp/[xy] -rw-r--r-- 1 user group 0 Sep 9 22:22 /tmp/x -rw-r--r-- 1 user group 1 Sep 9 22:22 /tmp/y {noformat} Note that /tmp/x has a size of 0 and /tmp/y has a size of 1. LogLength is simply reporting the number of bytes in the logfile, and not all of those bytes are visible characters. LogLength shown in $ yarn logs is 1 character longer than actual stdout --- Key: YARN-1175 URL: https://issues.apache.org/jira/browse/YARN-1175 Project: Hadoop YARN Issue Type: Bug Components: applications/distributed-shell Reporter: Tassapol Athiapinya Fix For: 2.1.1-beta Run distributed shell with -shell_command pwd. Do $ yarn logs on that application. Count number of characters in Log Contents field. Number of characters will be smaller than LogLength field by one. {code:title=mock-up yarn logs output} $ /usr/bin/yarn logs -applicationId application_1378424977532_0088 ... LogType: stdout LogLength: 87 Log Contents: /mypath/appcache/application_1378424977532_0088/container_1378424977532_0088_01_02 {code} {panel} The length of /mypath/appcache/application_1378424977532_0088/container_1378424977532_0088_01_02 is 86. {panel} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
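Jason's point generalizes: LogLength is just the file size in bytes. The same check in Java, as a small self-contained sketch against the files from the shell example above:
{code}
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

public class LogLengthDemo {
  public static void main(String[] args) throws IOException {
    // An empty file reports 0; a file holding only echo's newline reports 1.
    System.out.println(Files.size(Paths.get("/tmp/x"))); // 0
    System.out.println(Files.size(Paths.get("/tmp/y"))); // 1
  }
}
{code}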
[jira] [Resolved] (YARN-1175) LogLength shown in $ yarn logs is 1 character longer than actual stdout
[ https://issues.apache.org/jira/browse/YARN-1175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tassapol Athiapinya resolved YARN-1175. --- Resolution: Not A Problem It works as intended. LogLength shown in $ yarn logs is 1 character longer than actual stdout --- Key: YARN-1175 URL: https://issues.apache.org/jira/browse/YARN-1175 Project: Hadoop YARN Issue Type: Bug Components: applications/distributed-shell Reporter: Tassapol Athiapinya Fix For: 2.1.1-beta Run distributed shell with -shell_command pwd. Do $ yarn logs on that application. Count number of characters in Log Contents field. Number of characters will be smaller than LogLength field by one. {code:title=mock-up yarn logs output} $ /usr/bin/yarn logs -applicationId application_1378424977532_0088 ... LogType: stdout LogLength: 87 Log Contents: /mypath/appcache/application_1378424977532_0088/container_1378424977532_0088_01_02 {code} {panel} The length of /mypath/appcache/application_1378424977532_0088/container_1378424977532_0088_01_02 is 86. {panel} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-1175) LogLength shown in $ yarn logs is 1 character longer than actual stdout
[ https://issues.apache.org/jira/browse/YARN-1175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13762391#comment-13762391 ] Tassapol Athiapinya commented on YARN-1175: --- Thank you [~jlowe] for the clarification. I will close this issue. LogLength shown in $ yarn logs is 1 character longer than actual stdout --- Key: YARN-1175 URL: https://issues.apache.org/jira/browse/YARN-1175 Project: Hadoop YARN Issue Type: Bug Components: applications/distributed-shell Reporter: Tassapol Athiapinya Fix For: 2.1.1-beta Run distributed shell with -shell_command pwd. Do $ yarn logs on that application. Count number of characters in Log Contents field. Number of characters will be smaller than LogLength field by one. {code:title=mock-up yarn logs output} $ /usr/bin/yarn logs -applicationId application_1378424977532_0088 ... LogType: stdout LogLength: 87 Log Contents: /mypath/appcache/application_1378424977532_0088/container_1378424977532_0088_01_02 {code} {panel} The length of /mypath/appcache/application_1378424977532_0088/container_1378424977532_0088_01_02 is 86. {panel} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-1152) Invalid key to HMAC computation error when getting application report for completed app attempt
[ https://issues.apache.org/jira/browse/YARN-1152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13762323#comment-13762323 ] Hudson commented on YARN-1152: -- SUCCESS: Integrated in Hadoop-trunk-Commit #4390 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/4390/]) YARN-1152. Fixed a bug in ResourceManager that was causing clients to get invalid client token key errors when an application is about to finish. Contributed by Jason Lowe. (vinodkv: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1521292) * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/RMAppImpl.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/RMAppAttempt.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/RMAppAttemptImpl.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/TestRMAppTransitions.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/TestRMAppAttemptTransitions.java Invalid key to HMAC computation error when getting application report for completed app attempt --- Key: YARN-1152 URL: https://issues.apache.org/jira/browse/YARN-1152 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 2.1.1-beta Reporter: Jason Lowe Assignee: Jason Lowe Priority: Blocker Attachments: YARN-1152-2.txt, YARN-1152.txt On a secure cluster, an invalid key to HMAC error is thrown when trying to get an application report for an application with an attempt that has unregistered. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (YARN-609) Fix synchronization issues in APIs which take in lists
[ https://issues.apache.org/jira/browse/YARN-609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuan Gong updated YARN-609: --- Attachment: YARN-609.9.patch Fix synchronization issues in APIs which take in lists -- Key: YARN-609 URL: https://issues.apache.org/jira/browse/YARN-609 Project: Hadoop YARN Issue Type: Bug Reporter: Vinod Kumar Vavilapalli Assignee: Xuan Gong Attachments: YARN-609.1.patch, YARN-609.2.patch, YARN-609.3.patch, YARN-609.4.patch, YARN-609.5.patch, YARN-609.6.patch, YARN-609.7.patch, YARN-609.8.patch, YARN-609.9.patch Some of the APIs take in lists and the setter-APIs don't always do proper synchronization. We need to fix these. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-1166) YARN 'appsFailed' metric should be of type 'counter'
[ https://issues.apache.org/jira/browse/YARN-1166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13762331#comment-13762331 ] Hadoop QA commented on YARN-1166: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12602207/YARN-1166.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestLeafQueue {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/1884//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/1884//console This message is automatically generated. YARN 'appsFailed' metric should be of type 'counter' Key: YARN-1166 URL: https://issues.apache.org/jira/browse/YARN-1166 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 2.1.0-beta Reporter: Srimanth Gunturi Assignee: Akira AJISAKA Attachments: YARN-1166.patch Currently in YARN's queue metrics, the cumulative metric 'appsFailed' is of type 'gauge' - which means the exact value will be reported. All other cumulative queue metrics (AppsSubmitted, AppsCompleted, AppsKilled) are all of type 'counter' - meaning Ganglia will use slope to provide deltas between time-points. To be consistent, AppsFailed metric should also be of type 'counter'. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
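For context, the gauge/counter distinction lives in Hadoop's metrics2 annotations. A hedged sketch of the kind of change being proposed (class and field names are illustrative, not the actual QueueMetrics diff):
{code}
import org.apache.hadoop.metrics2.annotation.Metric;
import org.apache.hadoop.metrics2.annotation.Metrics;
import org.apache.hadoop.metrics2.lib.MutableCounterInt;
import org.apache.hadoop.metrics2.lib.MutableGaugeInt;

@Metrics(context = "yarn")
public class ExampleQueueMetrics {
  // Gauge: the exact current value is reported each interval.
  @Metric("# of apps running (gauge example)") MutableGaugeInt appsRunning;

  // Counter: monotonically increasing, so Ganglia can derive deltas from the slope.
  @Metric("# of apps failed") MutableCounterInt appsFailed;

  public void appFailed() {
    appsFailed.incr(); // counters only move forward
  }
}
{code}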
[jira] [Commented] (YARN-649) Make container logs available over HTTP in plain text
[ https://issues.apache.org/jira/browse/YARN-649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13762411#comment-13762411 ] Jonathan Eagles commented on YARN-649: -- [~sandyr] TestContainerLogsPage#testContainerLogPageAccess (native-only test) is failing after this check-in. I have created YARN-1178 to address this test failure: mvn clean test -Pnative -Dtest=TestContainerLogsPage Make container logs available over HTTP in plain text - Key: YARN-649 URL: https://issues.apache.org/jira/browse/YARN-649 Project: Hadoop YARN Issue Type: Sub-task Components: nodemanager Affects Versions: 2.0.4-alpha Reporter: Sandy Ryza Assignee: Sandy Ryza Fix For: 2.3.0 Attachments: YARN-649-2.patch, YARN-649-3.patch, YARN-649-4.patch, YARN-649-5.patch, YARN-649-6.patch, YARN-649-7.patch, YARN-649.patch, YARN-752-1.patch It would be good to make container logs available over the REST API for MAPREDUCE-4362 and so that they can be accessed programmatically in general. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (YARN-1178) TestContainerLogsPage#testContainerLogPageAccess is failing
Jonathan Eagles created YARN-1178: - Summary: TestContainerLogsPage#testContainerLogPageAccess is failing Key: YARN-1178 URL: https://issues.apache.org/jira/browse/YARN-1178 Project: Hadoop YARN Issue Type: Bug Reporter: Jonathan Eagles The test is failing after YARN-649. This test is only run in native mode: mvn clean test -Pnative -Dtest=TestContainerLogsPage -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (YARN-1175) LogLength shown in $ yarn logs is 1 character longer than actual stdout
Tassapol Athiapinya created YARN-1175: - Summary: LogLength shown in $ yarn logs is 1 character longer than actual stdout Key: YARN-1175 URL: https://issues.apache.org/jira/browse/YARN-1175 Project: Hadoop YARN Issue Type: Bug Components: applications/distributed-shell Reporter: Tassapol Athiapinya Fix For: 2.1.1-beta Run distributed shell with -shell_command pwd. Do $ yarn logs on that application. Count number of characters in Log Contents field. Number of characters will be smaller than LogLength field by one. {code:title=mock-up yarn logs output} $ /usr/bin/yarn logs -applicationId application_1378424977532_0088 ... LogType: stdout LogLength: 87 Log Contents: /mypath/appcache/application_1378424977532_0088/container_1378424977532_0088_01_02 {code} {panel} The length of /mypath/appcache/application_1378424977532_0088/container_1378424977532_0088_01_02 is 86. {panel} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-1165) Move init() of activeServices to ResourceManager#serviceStart()
[ https://issues.apache.org/jira/browse/YARN-1165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13762550#comment-13762550 ] Karthik Kambatla commented on YARN-1165: I figured out a way around this. Details: # RMHAProtocolService is always added (addService()) to the RM # RMHAProtocolService#serviceInit() - creates and init()s RMActiveServices (activeServices) # RMHAProtocolService#serviceStart() - transitions to active/standby depending on whether HA is enabled or not # RMHAProtocolService#transitionToActive() - starts activeServices # RMHAProtocolService#transitionToStandby() - stops activeServices, sets it to null, creates a new instance of RMActiveServices and init()s it. So, when in the INITIALIZING and STANDBY HAServiceStates, activeServices is initialized. When in the ACTIVE HAServiceState, activeServices is started. I'll include this in the next patch on YARN-1027 shortly, and will close this as a duplicate in the meantime. Move init() of activeServices to ResourceManager#serviceStart() --- Key: YARN-1165 URL: https://issues.apache.org/jira/browse/YARN-1165 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Reporter: Karthik Kambatla Assignee: Karthik Kambatla Attachments: test-failures.pdf Background: # YARN-1098 separates out RM services into Always-On and Active services, but doesn't change the behavior in any way. # For YARN-1027, we would want to create, initialize, and start RMActiveServices in the scope of RM#serviceStart(). This requires updating test cases that check for certain behavior post RM#serviceInit() - otherwise, most of these tests NPE. Creating a JIRA different from YARN-1027 to address all these test cases. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
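A rough sketch of the flow described above, with simplified method bodies and a stub for RMActiveServices so it compiles standalone (this illustrates the comment, not the actual YARN-1027 patch):
{code}
import org.apache.hadoop.conf.Configuration;

// Illustrative only: a stripped-down shape of the approach in the comment.
public class SketchRMHAProtocolService {
  private final Configuration conf = new Configuration();
  private RMActiveServices activeServices; // stand-in for the real class

  public synchronized void transitionToActive() {
    activeServices.start(); // init()ed in serviceInit(), started only when active
  }

  public synchronized void transitionToStandby() {
    if (activeServices != null) {
      activeServices.stop(); // tear down the previously active services
    }
    activeServices = new RMActiveServices();
    activeServices.init(conf); // keep an init()ed instance in INITIALIZING/STANDBY
  }

  // Minimal stand-in so the sketch is self-contained.
  static class RMActiveServices {
    void init(Configuration c) {}
    void start() {}
    void stop() {}
  }
}
{code}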
[jira] [Reopened] (YARN-1165) Move init() of activeServices to ResourceManager#serviceStart()
[ https://issues.apache.org/jira/browse/YARN-1165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla reopened YARN-1165: Move init() of activeServices to ResourceManager#serviceStart() --- Key: YARN-1165 URL: https://issues.apache.org/jira/browse/YARN-1165 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Reporter: Karthik Kambatla Assignee: Karthik Kambatla Attachments: test-failures.pdf Background: # YARN-1098 separates out RM services into Always-On and Active services, but doesn't change the behavior in any way. # For YARN-1027, we would want to create, initialize, and start RMActiveServices in the scope of RM#serviceStart(). This requires updating test cases that check for certain behavior post RM#serviceInit() - otherwise, most of these tests NPE. Creating a JIRA different from YARN-1027 to address all these test cases. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (YARN-1165) Move init() of activeServices to ResourceManager#serviceStart()
[ https://issues.apache.org/jira/browse/YARN-1165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla resolved YARN-1165. Resolution: Not A Problem My bad. We should close this as Not A Problem as we are not init()ing activeServices in RM#serviceStart() any more. Move init() of activeServices to ResourceManager#serviceStart() --- Key: YARN-1165 URL: https://issues.apache.org/jira/browse/YARN-1165 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Reporter: Karthik Kambatla Assignee: Karthik Kambatla Attachments: test-failures.pdf Background: # YARN-1098 separates out RM services into Always-On and Active services, but doesn't change the behavior in any way. # For YARN-1027, we would want to create, initialize, and start RMActiveServices in the scope of RM#serviceStart(). This requires updating test cases that check for certain behavior post RM#serviceInit() - otherwise, most of these tests NPE. Creating a JIRA different from YARN-1027 to address all these test cases. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (YARN-1027) Implement RMHAServiceProtocol
[ https://issues.apache.org/jira/browse/YARN-1027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla updated YARN-1027: --- Attachment: yarn-1027-5.patch Thanks Bikas. I am uploading a patch that addresses all your comments. As commented on YARN-1165, I figured out a way to make sure the RMActiveServices are always inited so the tests pass. I have yet to add the configs to yarn-site and to test whether the previous instances are being GC'ed on a pseudo-dist cluster. Implement RMHAServiceProtocol - Key: YARN-1027 URL: https://issues.apache.org/jira/browse/YARN-1027 Project: Hadoop YARN Issue Type: Sub-task Reporter: Bikas Saha Assignee: Karthik Kambatla Attachments: test-yarn-1027.patch, yarn-1027-1.patch, yarn-1027-2.patch, yarn-1027-3.patch, yarn-1027-4.patch, yarn-1027-5.patch, yarn-1027-including-yarn-1098-3.patch, yarn-1027-in-rm-poc.patch Implement existing HAServiceProtocol from Hadoop common. This protocol is the single point of interaction between the RM and HA clients/services. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-1027) Implement RMHAServiceProtocol
[ https://issues.apache.org/jira/browse/YARN-1027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13762557#comment-13762557 ] Karthik Kambatla commented on YARN-1027: bq. What happens if we call this method when the RM is in standby mode? I am wondering if we may be able to call this during that time and verify that the RM is indeed not active. These particular MockRM methods work on any inited RM - even standby mode. The tests for the Standby mode should be on a MiniYARNCluster. Will try to work those in. Implement RMHAServiceProtocol - Key: YARN-1027 URL: https://issues.apache.org/jira/browse/YARN-1027 Project: Hadoop YARN Issue Type: Sub-task Reporter: Bikas Saha Assignee: Karthik Kambatla Attachments: test-yarn-1027.patch, yarn-1027-1.patch, yarn-1027-2.patch, yarn-1027-3.patch, yarn-1027-4.patch, yarn-1027-5.patch, yarn-1027-including-yarn-1098-3.patch, yarn-1027-in-rm-poc.patch Implement existing HAServiceProtocol from Hadoop common. This protocol is the single point of interaction between the RM and HA clients/services. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-292) ResourceManager throws ArrayIndexOutOfBoundsException while handling CONTAINER_ALLOCATED for application attempt
[ https://issues.apache.org/jira/browse/YARN-292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13762565#comment-13762565 ] Vinod Kumar Vavilapalli commented on YARN-292: -- Actually, I am able to reproduce failures with TestFifoScheduler consistently. +1, the patch looks good. Checking this in. ResourceManager throws ArrayIndexOutOfBoundsException while handling CONTAINER_ALLOCATED for application attempt Key: YARN-292 URL: https://issues.apache.org/jira/browse/YARN-292 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Affects Versions: 2.0.1-alpha Reporter: Devaraj K Assignee: Zhijie Shen Attachments: ArrayIndexOutOfBoundsException.log, YARN-292.1.patch, YARN-292.2.patch, YARN-292.3.patch, YARN-292.4.patch {code:xml} 2012-12-26 08:41:15,030 ERROR org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler: Calling allocate on removed or non existant application appattempt_1356385141279_49525_01 2012-12-26 08:41:15,031 ERROR org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error in handling event type CONTAINER_ALLOCATED for applicationAttempt application_1356385141279_49525 java.lang.ArrayIndexOutOfBoundsException: 0 at java.util.Arrays$ArrayList.get(Arrays.java:3381) at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl$AMContainerAllocatedTransition.transition(RMAppAttemptImpl.java:655) at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl$AMContainerAllocatedTransition.transition(RMAppAttemptImpl.java:644) at org.apache.hadoop.yarn.state.StateMachineFactory$SingleInternalArc.doTransition(StateMachineFactory.java:357) at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:298) at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:43) at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:443) at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:490) at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:80) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:433) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:414) at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:126) at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:75) at java.lang.Thread.run(Thread.java:662) {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-540) Race condition causing RM to potentially relaunch already unregistered AMs on RM restart
[ https://issues.apache.org/jira/browse/YARN-540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13762581#comment-13762581 ] Hadoop QA commented on YARN-540: {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12602243/YARN-540.6.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 3 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/1886//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/1886//console This message is automatically generated. Race condition causing RM to potentially relaunch already unregistered AMs on RM restart Key: YARN-540 URL: https://issues.apache.org/jira/browse/YARN-540 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Reporter: Jian He Assignee: Jian He Attachments: YARN-540.1.patch, YARN-540.2.patch, YARN-540.3.patch, YARN-540.4.patch, YARN-540.5.patch, YARN-540.6.patch, YARN-540.patch, YARN-540.patch When a job succeeds and successfully calls finishApplicationMaster, and the RM is shut down and restarted, the dispatcher may be stopped before it can process the REMOVE_APP event. The next time the RM comes back, it will reload the existing state files even though the job succeeded. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (YARN-1173) Run -shell_command echo Hello has empty stdout
[ https://issues.apache.org/jira/browse/YARN-1173?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tassapol Athiapinya resolved YARN-1173. --- Resolution: Invalid It is invalid usage. The distributed shell user has to separate the shell command from the shell arguments. To do echo Hello, the command has to be: $ /usr/bin/yarn org.apache.hadoop.yarn.applications.distributedshell.Client -jar /usr/lib/hadoop-yarn/hadoop-yarn-applications-distributedshell-*.jar -shell_command echo -shell_args hello Run -shell_command echo Hello has empty stdout Key: YARN-1173 URL: https://issues.apache.org/jira/browse/YARN-1173 Project: Hadoop YARN Issue Type: Bug Components: applications/distributed-shell Reporter: Tassapol Athiapinya Fix For: 2.1.1-beta Run: $ /usr/bin/yarn org.apache.hadoop.yarn.applications.distributedshell.Client -jar /usr/lib/hadoop-yarn/hadoop-yarn-applications-distributedshell-*.jar -shell_command echo Hello Get logs with YARN logs: {panel} -bash-4.1$ yarn logs -applicationId application_1378424977532_0071 Container: container_1378424977532_0071_01_02 on myhost === LogType: stderr LogLength: 0 Log Contents: LogType: stdout LogLength: 1 Log Contents: {panel} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (YARN-1168) Cannot run echo \Hello World\
[ https://issues.apache.org/jira/browse/YARN-1168?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tassapol Athiapinya updated YARN-1168: -- Description: Run $ ssh localhost echo \"Hello World\" with bash does succeed. Hello World is shown in stdout. Run distributed shell with similar echo command. That is either $ /usr/bin/yarn org.apache.hadoop.yarn.applications.distributedshell.Client -jar /usr/lib/hadoop-yarn/hadoop-yarn-applications-distributedshell-2.*.jar -shell_command echo -shell_args \"Hello World\" or $ /usr/bin/yarn org.apache.hadoop.yarn.applications.distributedshell.Client -jar /usr/lib/hadoop-yarn/hadoop-yarn-applications-distributedshell-2.*.jar -shell_command echo -shell_args Hello World {code:title=yarn logs -- only hello is shown} LogType: stdout LogLength: 6 Log Contents: hello {code} was: Run $ ssh localhost echo \"Hello World\" with bash does succeed. Hello World is shown in stdout. Run distributed shell with similar echo command. That is $ /usr/bin/yarn org.apache.hadoop.yarn.applications.distributedshell.Client -jar /usr/lib/hadoop-yarn/hadoop-yarn-applications-distributedshell-2.*.jar -shell_command echo \"Hello World\" {code:title=partial console logs} distributedshell.Client: Completed setting up app master command $JAVA_HOME/bin/java -Xmx10m org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster --container_memory 10 --num_containers 1 --priority 0 --shell_command echo Hello World 1><LOG_DIR>/AppMaster.stdout 2><LOG_DIR>/AppMaster.stderr ... line 28: syntax error: unexpected end of file at org.apache.hadoop.util.Shell.runCommand(Shell.java:458) at org.apache.hadoop.util.Shell.run(Shell.java:373) at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:578) at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195) at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:258) at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:74) ... distributedshell.Client: Application failed to complete successfully {code} Cannot run echo \"Hello World\" - Key: YARN-1168 URL: https://issues.apache.org/jira/browse/YARN-1168 Project: Hadoop YARN Issue Type: Bug Components: applications/distributed-shell Reporter: Tassapol Athiapinya Priority: Critical Fix For: 2.1.1-beta Run $ ssh localhost echo \"Hello World\" with bash does succeed. Hello World is shown in stdout. Run distributed shell with similar echo command. That is either $ /usr/bin/yarn org.apache.hadoop.yarn.applications.distributedshell.Client -jar /usr/lib/hadoop-yarn/hadoop-yarn-applications-distributedshell-2.*.jar -shell_command echo -shell_args \"Hello World\" or $ /usr/bin/yarn org.apache.hadoop.yarn.applications.distributedshell.Client -jar /usr/lib/hadoop-yarn/hadoop-yarn-applications-distributedshell-2.*.jar -shell_command echo -shell_args Hello World {code:title=yarn logs -- only hello is shown} LogType: stdout LogLength: 6 Log Contents: hello {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-292) ResourceManager throws ArrayIndexOutOfBoundsException while handling CONTAINER_ALLOCATED for application attempt
[ https://issues.apache.org/jira/browse/YARN-292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13762591#comment-13762591 ] Hudson commented on YARN-292: - SUCCESS: Integrated in Hadoop-trunk-Commit #4392 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/4392/]) YARN-292. Fixed FifoScheduler and FairScheduler to make their applications data structures thread safe to avoid RM crashing with ArrayIndexOutOfBoundsException. Contributed by Zhijie Shen. (vinodkv: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1521328) * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/RMAppAttemptImpl.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fifo/FifoScheduler.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestCapacityScheduler.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestFairScheduler.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fifo/TestFifoScheduler.java ResourceManager throws ArrayIndexOutOfBoundsException while handling CONTAINER_ALLOCATED for application attempt Key: YARN-292 URL: https://issues.apache.org/jira/browse/YARN-292 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Affects Versions: 2.0.1-alpha Reporter: Devaraj K Assignee: Zhijie Shen Fix For: 2.1.1-beta Attachments: ArrayIndexOutOfBoundsException.log, YARN-292.1.patch, YARN-292.2.patch, YARN-292.3.patch, YARN-292.4.patch {code:xml} 2012-12-26 08:41:15,030 ERROR org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler: Calling allocate on removed or non existant application appattempt_1356385141279_49525_01 2012-12-26 08:41:15,031 ERROR org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error in handling event type CONTAINER_ALLOCATED for applicationAttempt application_1356385141279_49525 java.lang.ArrayIndexOutOfBoundsException: 0 at java.util.Arrays$ArrayList.get(Arrays.java:3381) at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl$AMContainerAllocatedTransition.transition(RMAppAttemptImpl.java:655) at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl$AMContainerAllocatedTransition.transition(RMAppAttemptImpl.java:644) at org.apache.hadoop.yarn.state.StateMachineFactory$SingleInternalArc.doTransition(StateMachineFactory.java:357) at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:298) at 
org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:43) at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:443) at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:490) at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:80) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:433) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:414) at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:126) at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:75) at java.lang.Thread.run(Thread.java:662) {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
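The commit above makes the schedulers' applications data structures safe for the access pattern that triggered this crash: the event-dispatch thread reads an application attempt while the scheduler thread mutates the map. A minimal sketch of the idea, assuming a map keyed by attempt id (an illustration, not the actual patch):
{code}
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.ConcurrentSkipListMap;

// Illustrative sketch: a concurrent map lets the event dispatcher look up an
// application attempt safely while the scheduler thread adds or removes entries.
public class SchedulerAppStore<K extends Comparable<K>, V> {
  private final ConcurrentMap<K, V> applications =
      new ConcurrentSkipListMap<K, V>();

  public void add(K attemptId, V app) { applications.put(attemptId, app); }
  public V get(K attemptId) { return applications.get(attemptId); }
  public void remove(K attemptId) { applications.remove(attemptId); }
}
{code}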
[jira] [Commented] (YARN-540) Race condition causing RM to potentially relaunch already unregistered AMs on RM restart
[ https://issues.apache.org/jira/browse/YARN-540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13762599#comment-13762599 ] Bikas Saha commented on YARN-540: - bq. Exception because unManagedAM attempt will be immediately removed from the responseMap Haven't looked at the patch yet, but this sounds like a race condition waiting to happen in other cases. Let's say the first unregister returns false. Now someone kills the app and the app goes through the transition that removes it from the responseMap. Now if the AM comes back with the second unregister, should it fail or succeed? The key question here is whether an AM is done after it calls unregister. If the unregister fails, then is the AM expected to consider itself failed or to continue as if it has succeeded? Race condition causing RM to potentially relaunch already unregistered AMs on RM restart Key: YARN-540 URL: https://issues.apache.org/jira/browse/YARN-540 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Reporter: Jian He Assignee: Jian He Attachments: YARN-540.1.patch, YARN-540.2.patch, YARN-540.3.patch, YARN-540.4.patch, YARN-540.5.patch, YARN-540.6.patch, YARN-540.patch, YARN-540.patch When a job succeeds and successfully calls finishApplicationMaster, and the RM is shut down and restarted, the dispatcher may be stopped before it can process the REMOVE_APP event. The next time the RM comes back, it will reload the existing state files even though the job succeeded. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Assigned] (YARN-921) Consider adding clone methods to protocol objects
[ https://issues.apache.org/jira/browse/YARN-921?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du reassigned YARN-921: --- Assignee: Junping Du Consider adding clone methods to protocol objects - Key: YARN-921 URL: https://issues.apache.org/jira/browse/YARN-921 Project: Hadoop YARN Issue Type: Sub-task Affects Versions: 2.1.0-beta Reporter: Bikas Saha Assignee: Junping Du Whenever we create a new object from an existing object, a clone method could be used to create a copy efficiently vs. new object followed by setters. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-921) Consider adding clone methods to protocol objects
[ https://issues.apache.org/jira/browse/YARN-921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13762610#comment-13762610 ] Bikas Saha commented on YARN-921: - Before we proceed to invest effort in this jira, it may make sense to get a feel from other committers/contributors for whether they want this or not. An email to the dev list would be good to solicit negative opinions, if any. Consider adding clone methods to protocol objects - Key: YARN-921 URL: https://issues.apache.org/jira/browse/YARN-921 Project: Hadoop YARN Issue Type: Sub-task Affects Versions: 2.1.0-beta Reporter: Bikas Saha Assignee: Junping Du Whenever we create a new object from an existing object, a clone method could be used to create a copy efficiently vs. new object followed by setters. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-921) Consider adding clone methods to protocol objects
[ https://issues.apache.org/jira/browse/YARN-921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13762661#comment-13762661 ] Junping Du commented on YARN-921: - Hi Bikas, Thanks for your good suggestions here. I will follow up with an email thread to the dev list to get more people's feedback. Before that, I want to make sure I understand the requirement for clone in proto objects. First, I didn't find a clear example of cloning a proto into the same class type. What I found is code that gets info from one proto and sets it on another proto. Am I missing something? Shouldn't the semantics of clone be something like this? {code} x.clone() != x x.clone().getClass() == x.getClass() x.clone().equals(x) == true {code} Consider adding clone methods to protocol objects - Key: YARN-921 URL: https://issues.apache.org/jira/browse/YARN-921 Project: Hadoop YARN Issue Type: Sub-task Affects Versions: 2.1.0-beta Reporter: Bikas Saha Assignee: Junping Du Whenever we create a new object from an existing object, a clone method could be used to create a copy efficiently vs. new object followed by setters. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (YARN-291) [Umbrella] Dynamic resource configuration
[ https://issues.apache.org/jira/browse/YARN-291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated YARN-291: Summary: [Umbrella] Dynamic resource configuration (was: Dynamic resource configuration) [Umbrella] Dynamic resource configuration - Key: YARN-291 URL: https://issues.apache.org/jira/browse/YARN-291 Project: Hadoop YARN Issue Type: New Feature Components: nodemanager, scheduler Reporter: Junping Du Assignee: Junping Du Labels: features Attachments: Elastic Resources for YARN-v0.2.pdf, YARN-291-AddClientRMProtocolToSetNodeResource-03.patch, YARN-291-all-v1.patch, YARN-291-CoreAndAdmin.patch, YARN-291-core-HeartBeatAndScheduler-01.patch, YARN-291-JMXInterfaceOnNM-02.patch, YARN-291-OnlyUpdateWhenResourceChange-01-fix.patch, YARN-291-YARNClientCommandline-04.patch The current Hadoop YARN resource management logic assumes per node resource is static during the lifetime of the NM process. Allowing run-time configuration on per node resource will give us finer granularity of resource elasticity. This allows Hadoop workloads to coexist with other workloads on the same hardware efficiently, whether or not the environment is virtualized. More background and design details can be found in attached proposal. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-921) Consider adding clone methods to protocol objects
[ https://issues.apache.org/jira/browse/YARN-921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13762756#comment-13762756 ] Bikas Saha commented on YARN-921: - The requirement is to be able to clone a YARN protocol object (not a protobuf object). A YARN object is implemented as a protobuf object, but that is not visible to the user. So a YARN helper API exists, e.g. Container.newInstance() returns a Container object. However, given a Container object there is no helper API to create a copy (like a copy constructor in C++). The user has to manually get and set the members. The set methods are mostly @private and so the user is technically not supposed to set any member. So having a .clone method lets the user easily create a copy of an existing object. Consider adding clone methods to protocol objects - Key: YARN-921 URL: https://issues.apache.org/jira/browse/YARN-921 Project: Hadoop YARN Issue Type: Sub-task Affects Versions: 2.1.0-beta Reporter: Bikas Saha Assignee: Junping Du Whenever we create a new object from an existing object, a clone method could be used to create a copy efficiently vs. new object followed by setters. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
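As a concrete illustration, here is what such a helper might look like for Resource, whose factory method is public. The helper class name is hypothetical and this is a sketch of the proposal, not an agreed API:
{code}
import org.apache.hadoop.yarn.api.records.Resource;

public final class RecordCloner {
  private RecordCloner() {}

  // Hypothetical helper: copy a Resource through its public factory
  // instead of calling @Private setters one by one.
  public static Resource clone(Resource r) {
    return Resource.newInstance(r.getMemory(), r.getVirtualCores());
  }
}
{code}
With this, RecordCloner.clone(r) satisfies the contract quoted in the earlier comment: the copy is a distinct object of the same class that equals the original.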
[jira] [Updated] (YARN-1042) add ability to specify affinity/anti-affinity in container requests
[ https://issues.apache.org/jira/browse/YARN-1042?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated YARN-1042: - Attachment: YARN-1042-demo.patch add ability to specify affinity/anti-affinity in container requests --- Key: YARN-1042 URL: https://issues.apache.org/jira/browse/YARN-1042 Project: Hadoop YARN Issue Type: New Feature Components: resourcemanager Affects Versions: 3.0.0 Reporter: Steve Loughran Assignee: Junping Du Attachments: YARN-1042-demo.patch container requests to the AM should be able to request anti-affinity to ensure that things like Region Servers don't come up on the same failure zones. Similarly, you may be able to want to specify affinity to same host or rack without specifying which specific host/rack. Example: bringing up a small giraph cluster in a large YARN cluster would benefit from having the processes in the same rack purely for bandwidth reasons. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira