[jira] [Commented] (YARN-3000) YARN_PID_DIR should be visible in yarn-env.sh
[ https://issues.apache.org/jira/browse/YARN-3000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14261997#comment-14261997 ] Rohith commented on YARN-3000: -- Or shall I take over the issue? > YARN_PID_DIR should be visible in yarn-env.sh > - > > Key: YARN-3000 > URL: https://issues.apache.org/jira/browse/YARN-3000 > Project: Hadoop YARN > Issue Type: Bug > Components: scripts >Affects Versions: 2.6.0 >Reporter: Jeff Zhang >Priority: Minor > > Currently I see YARN_PID_DIR only in yarn-daemon.sh, which is not supposed > to be the place for users to set environment variables. IMO, yarn-env.sh is the > place for users to set environment variables, just like hadoop-env.sh, so > it's better to put YARN_PID_DIR into yarn-env.sh (it can go in as a comment, > just like YARN_RESOURCEMANAGER_HEAPSIZE). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3000) YARN_PID_DIR should be visible in yarn-env.sh
[ https://issues.apache.org/jira/browse/YARN-3000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14261996#comment-14261996 ] Rohith commented on YARN-3000: -- Thanks for reporting the issue. Would you like to work on this? > YARN_PID_DIR should be visible in yarn-env.sh > - > > Key: YARN-3000 > URL: https://issues.apache.org/jira/browse/YARN-3000 > Project: Hadoop YARN > Issue Type: Bug > Components: scripts >Affects Versions: 2.6.0 >Reporter: Jeff Zhang >Priority: Minor > > Currently I see YARN_PID_DIR only in yarn-daemon.sh, which is not supposed > to be the place for users to set environment variables. IMO, yarn-env.sh is the > place for users to set environment variables, just like hadoop-env.sh, so > it's better to put YARN_PID_DIR into yarn-env.sh (it can go in as a comment, > just like YARN_RESOURCEMANAGER_HEAPSIZE). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (YARN-3000) YARN_PID_DIR should be visible in yarn-env.sh
Jeff Zhang created YARN-3000: Summary: YARN_PID_DIR should be visible in yarn-env.sh Key: YARN-3000 URL: https://issues.apache.org/jira/browse/YARN-3000 Project: Hadoop YARN Issue Type: Bug Components: scripts Affects Versions: 2.6.0 Reporter: Jeff Zhang Priority: Minor Currently I see YARN_PID_DIR only in yarn-daemon.sh, which is not supposed to be the place for users to set environment variables. IMO, yarn-env.sh is the place for users to set environment variables, just like hadoop-env.sh, so it's better to put YARN_PID_DIR into yarn-env.sh (it can go in as a comment, just like YARN_RESOURCEMANAGER_HEAPSIZE). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2978) ResourceManager crashes with NPE while getting queue info
[ https://issues.apache.org/jira/browse/YARN-2978?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14261983#comment-14261983 ] Varun Saxena commented on YARN-2978: This issue is due to a lack of synchronization. The NPE happens in the following piece of code, at the line {noformat}getApplications(i).isInitialized(){noformat}. getApplications(int i) gets the object at index i from the underlying applications list, which means List#get is returning null for the specified index. {code:title=YarnProtos.java} public final boolean isInitialized() { for (int i = 0; i < getApplicationsCount(); i++) { if (!getApplications(i).isInitialized()) { memoizedIsInitialized = 0; return false; } } } {code} This is because multiple threads can access the same QueueInfo object in the case of the Capacity scheduler, so the underlying array list that stores the applications can be modified by different threads, which in turn can lead to the NPE. YARN-2979 is also due to a lack of synchronization. > ResourceManager crashes with NPE while getting queue info > - > > Key: YARN-2978 > URL: https://issues.apache.org/jira/browse/YARN-2978 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.5.1 >Reporter: Jason Tufo >Assignee: Varun Saxena > > java.lang.NullPointerException > at > org.apache.hadoop.yarn.proto.YarnProtos$QueueInfoProto.isInitialized(YarnProtos.java:29625) > at > org.apache.hadoop.yarn.proto.YarnProtos$QueueInfoProto$Builder.build(YarnProtos.java:29939) > at > org.apache.hadoop.yarn.api.records.impl.pb.QueueInfoPBImpl.mergeLocalToProto(QueueInfoPBImpl.java:290) > at > org.apache.hadoop.yarn.api.records.impl.pb.QueueInfoPBImpl.getProto(QueueInfoPBImpl.java:157) > at > org.apache.hadoop.yarn.api.protocolrecords.impl.pb.GetQueueInfoResponsePBImpl.convertToProtoFormat(GetQueueInfoResponsePBImpl.java:128) > at > 
org.apache.hadoop.yarn.api.protocolrecords.impl.pb.GetQueueInfoResponsePBImpl.mergeLocalToBuilder(GetQueueInfoResponsePBImpl.java:104) > at > org.apache.hadoop.yarn.api.protocolrecords.impl.pb.GetQueueInfoResponsePBImpl.mergeLocalToProto(GetQueueInfoResponsePBImpl.java:111) > at > org.apache.hadoop.yarn.api.protocolrecords.impl.pb.GetQueueInfoResponsePBImpl.getProto(GetQueueInfoResponsePBImpl.java:53) > at > org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getQueueInfo(ApplicationClientProtocolPBServiceImpl.java:235) > at > org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:333) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:415) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
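The race described in this comment can be illustrated with a minimal, self-contained sketch. It is not the YARN code: `QueueInfoSketch`, `setApplications`, and `snapshotApplications` are hypothetical names, and guarding the shared list with `synchronized` is only one possible way to serialize access; the actual fix for YARN-2978 may differ.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a record object shared between RPC handler threads.
final class QueueInfoSketch {
  private final List<String> applications = new ArrayList<>();

  // Writers replace the contents while holding the object's monitor.
  synchronized void setApplications(List<String> apps) {
    applications.clear();
    applications.addAll(apps);
  }

  // Readers copy under the same monitor, so they never see a half-updated list.
  synchronized List<String> snapshotApplications() {
    return new ArrayList<>(applications);
  }

  public static void main(String[] args) throws InterruptedException {
    QueueInfoSketch info = new QueueInfoSketch();
    Thread writer = new Thread(() -> {
      for (int i = 0; i < 10_000; i++) {
        info.setApplications(Arrays.asList("app-" + i, "app-" + (i + 1)));
      }
    });
    writer.start();
    for (int i = 0; i < 10_000; i++) {
      List<String> snapshot = info.snapshotApplications();
      // A snapshot is either the initial empty list or a complete two-element update.
      if ((snapshot.size() != 0 && snapshot.size() != 2) || snapshot.contains(null)) {
        throw new IllegalStateException("observed a partial update: " + snapshot);
      }
    }
    writer.join();
    System.out.println("no partial updates observed");
  }
}
```

Without the two `synchronized` keywords, a reader could index into the list while a writer is clearing or refilling it, which is the kind of interleaving the comment blames for the null element.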
[jira] [Commented] (YARN-2922) Concurrent Modification Exception in LeafQueue when collecting applications
[ https://issues.apache.org/jira/browse/YARN-2922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14261978#comment-14261978 ] Rohith commented on YARN-2922: -- The test failure is unrelated to the patch; it passes locally with the patch applied. > Concurrent Modification Exception in LeafQueue when collecting applications > --- > > Key: YARN-2922 > URL: https://issues.apache.org/jira/browse/YARN-2922 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager, scheduler >Affects Versions: 2.5.1 >Reporter: Jason Tufo >Assignee: Rohith > Attachments: 0001-YARN-2922.patch, 0001-YARN-2922.patch > > > java.util.ConcurrentModificationException > at > java.util.TreeMap$PrivateEntryIterator.nextEntry(TreeMap.java:1115) > at java.util.TreeMap$KeyIterator.next(TreeMap.java:1169) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.collectSchedulerApplications(LeafQueue.java:1618) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.getAppsInQueue(CapacityScheduler.java:1119) > at > org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getQueueInfo(ClientRMService.java:798) > at > org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getQueueInfo(ApplicationClientProtocolPBServiceImpl.java:234) > at > org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:333) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928) > at > org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013) > at > org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:415) > at > 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2881) Implement PlanFollower for FairScheduler
[ https://issues.apache.org/jira/browse/YARN-2881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14261968#comment-14261968 ] Hadoop QA commented on YARN-2881: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12689635/YARN-2881.005.patch against trunk revision e7257ac. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 5 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:red}-1 findbugs{color}. The patch appears to introduce 1 new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The following test timeouts occurred in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: org.apache.hadoop.yarn.server.resourcemanager.TestResourceTrackerService Test results: https://builds.apache.org/job/PreCommit-YARN-Build/6224//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-YARN-Build/6224//artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6224//console This message is automatically generated. 
> Implement PlanFollower for FairScheduler > > > Key: YARN-2881 > URL: https://issues.apache.org/jira/browse/YARN-2881 > Project: Hadoop YARN > Issue Type: Sub-task > Components: fairscheduler >Reporter: Anubhav Dhoot >Assignee: Anubhav Dhoot > Attachments: YARN-2881.001.patch, YARN-2881.002.patch, > YARN-2881.002.patch, YARN-2881.003.patch, YARN-2881.004.patch, > YARN-2881.005.patch, YARN-2881.prelim.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2922) Concurrent Modification Exception in LeafQueue when collecting applications
[ https://issues.apache.org/jira/browse/YARN-2922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14261935#comment-14261935 ] Hadoop QA commented on YARN-2922: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12689633/0001-YARN-2922.patch against trunk revision e2351c7. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: org.apache.hadoop.yarn.server.resourcemanager.security.TestRMDelegationTokens Test results: https://builds.apache.org/job/PreCommit-YARN-Build/6222//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6222//console This message is automatically generated. 
> Concurrent Modification Exception in LeafQueue when collecting applications > --- > > Key: YARN-2922 > URL: https://issues.apache.org/jira/browse/YARN-2922 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager, scheduler >Affects Versions: 2.5.1 >Reporter: Jason Tufo >Assignee: Rohith > Attachments: 0001-YARN-2922.patch, 0001-YARN-2922.patch > > > java.util.ConcurrentModificationException > at > java.util.TreeMap$PrivateEntryIterator.nextEntry(TreeMap.java:1115) > at java.util.TreeMap$KeyIterator.next(TreeMap.java:1169) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.collectSchedulerApplications(LeafQueue.java:1618) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.getAppsInQueue(CapacityScheduler.java:1119) > at > org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getQueueInfo(ClientRMService.java:798) > at > org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getQueueInfo(ApplicationClientProtocolPBServiceImpl.java:234) > at > org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:333) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928) > at > org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013) > at > org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:415) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-2881) Implement PlanFollower for FairScheduler
[ https://issues.apache.org/jira/browse/YARN-2881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anubhav Dhoot updated YARN-2881: Attachment: YARN-2881.005.patch Updated for the renamed methods in YARN-2998 > Implement PlanFollower for FairScheduler > > > Key: YARN-2881 > URL: https://issues.apache.org/jira/browse/YARN-2881 > Project: Hadoop YARN > Issue Type: Sub-task > Components: fairscheduler >Reporter: Anubhav Dhoot >Assignee: Anubhav Dhoot > Attachments: YARN-2881.001.patch, YARN-2881.002.patch, > YARN-2881.002.patch, YARN-2881.003.patch, YARN-2881.004.patch, > YARN-2881.005.patch, YARN-2881.prelim.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2881) Implement PlanFollower for FairScheduler
[ https://issues.apache.org/jira/browse/YARN-2881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14261894#comment-14261894 ] Hadoop QA commented on YARN-2881: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12689622/YARN-2881.004.patch against trunk revision e7257ac. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 5 new or modified test files. {color:red}-1 javac{color}. The patch appears to cause the build to fail. Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6223//console This message is automatically generated. > Implement PlanFollower for FairScheduler > > > Key: YARN-2881 > URL: https://issues.apache.org/jira/browse/YARN-2881 > Project: Hadoop YARN > Issue Type: Sub-task > Components: fairscheduler >Reporter: Anubhav Dhoot >Assignee: Anubhav Dhoot > Attachments: YARN-2881.001.patch, YARN-2881.002.patch, > YARN-2881.002.patch, YARN-2881.003.patch, YARN-2881.004.patch, > YARN-2881.prelim.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2814) RM: Clean-up the handling of "fatal" events
[ https://issues.apache.org/jira/browse/YARN-2814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14261893#comment-14261893 ] Rohith commented on YARN-2814: -- The patch looks good overall. Leaving the 2nd point aside, one comment: # Can RMContext be passed to RMStateStore, so that the RM instance can be taken from the context? I think point 3 was intended for this. > RM: Clean-up the handling of "fatal" events > --- > > Key: YARN-2814 > URL: https://issues.apache.org/jira/browse/YARN-2814 > Project: Hadoop YARN > Issue Type: Improvement > Components: resourcemanager >Affects Versions: 2.6.0 >Reporter: Karthik Kambatla >Assignee: Karthik Kambatla > Attachments: yarn-2814-0.patch > > > YARN-2579 fixes a critical issue around handling fatal events in the RM, but > does so minimally. This JIRA is to follow through that approach and do more > clean-up. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2998) Abstract out scheduler independent PlanFollower components
[ https://issues.apache.org/jira/browse/YARN-2998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14261878#comment-14261878 ] Hudson commented on YARN-2998: -- FAILURE: Integrated in Hadoop-trunk-Commit #6799 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/6799/]) YARN-2998. Abstract out scheduler independent PlanFollower components. (Anubhav Dhoot via kasha) (kasha: rev e7257acd8a7adb74d81cd1d009d4a99f023ed844) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fifo/FifoScheduler.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/CapacitySchedulerPlanFollower.java * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSLeafQueue.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/ReservationSystemTestUtil.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/TestCapacitySchedulerPlanFollower.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/YarnScheduler.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/TestSchedulerPlanFollowerBase.java * 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/AbstractSchedulerPlanFollower.java * hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/scheduler/ResourceSchedulerWrapper.java > Abstract out scheduler independent PlanFollower components > -- > > Key: YARN-2998 > URL: https://issues.apache.org/jira/browse/YARN-2998 > Project: Hadoop YARN > Issue Type: Sub-task > Components: fairscheduler >Reporter: Anubhav Dhoot >Assignee: Anubhav Dhoot > Fix For: 2.7.0 > > Attachments: YARN-2998.001.patch, YARN-2998.003.patch, > yarn-2998-2.patch > > > Abstract out scheduler independent PlanFollower components into > AbstractSchedulerPlanFollower. This is a pre-requisite for YARN-2881. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2922) Concurrent Modification Exception in LeafQueue when collecting applications
[ https://issues.apache.org/jira/browse/YARN-2922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14261870#comment-14261870 ] Rohith commented on YARN-2922: -- You are right, a ConcurrentModificationException can occur, since {{LeafQueue#getTotalResourcePending}} is a public interface. I have updated the patch accordingly; kindly review. > Concurrent Modification Exception in LeafQueue when collecting applications > --- > > Key: YARN-2922 > URL: https://issues.apache.org/jira/browse/YARN-2922 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager, scheduler >Affects Versions: 2.5.1 >Reporter: Jason Tufo >Assignee: Rohith > Attachments: 0001-YARN-2922.patch, 0001-YARN-2922.patch > > > java.util.ConcurrentModificationException > at > java.util.TreeMap$PrivateEntryIterator.nextEntry(TreeMap.java:1115) > at java.util.TreeMap$KeyIterator.next(TreeMap.java:1169) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.collectSchedulerApplications(LeafQueue.java:1618) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.getAppsInQueue(CapacityScheduler.java:1119) > at > org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getQueueInfo(ClientRMService.java:798) > at > org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getQueueInfo(ApplicationClientProtocolPBServiceImpl.java:234) > at > org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:333) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928) > at > org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013) > at > org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009) > at java.security.AccessController.doPrivileged(Native Method) > at 
javax.security.auth.Subject.doAs(Subject.java:415) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
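The failure mode in the stack trace, a sorted collection being structurally modified while it is iterated, can be reproduced deterministically in a small sketch. Everything here is illustrative: `LeafQueueSketch` and its methods are hypothetical names, not the real `LeafQueue` API, and making the collector `synchronized` on the same monitor as the mutators is one assumed remedy, not necessarily what 0001-YARN-2922.patch does.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.ConcurrentModificationException;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical queue that keeps its applications in a TreeSet (backed by a TreeMap).
final class LeafQueueSketch {
  private final Set<String> activeApplications = new TreeSet<>();

  synchronized void addApplication(String appId) {
    activeApplications.add(appId);
  }

  synchronized void removeApplication(String appId) {
    activeApplications.remove(appId);
  }

  // Iterates while holding the same monitor as the mutators above.
  synchronized void collectApplications(Collection<String> out) {
    out.addAll(activeApplications);
  }

  // Shows the failure mode without threads: the set changes behind the iterator's back.
  static boolean unsafeIterationFails() {
    Set<String> apps = new TreeSet<>(Arrays.asList("app-1", "app-2", "app-3"));
    try {
      for (String app : apps) {
        apps.remove("app-3"); // structural modification during iteration
      }
      return false;
    } catch (ConcurrentModificationException e) {
      return true;
    }
  }

  public static void main(String[] args) {
    System.out.println("unsafe iteration throws: " + unsafeIterationFails());
    LeafQueueSketch queue = new LeafQueueSketch();
    queue.addApplication("app-2");
    queue.addApplication("app-1");
    List<String> collected = new ArrayList<>();
    queue.collectApplications(collected);
    System.out.println(collected);
  }
}
```

The point of the sketch is that every public entry point that iterates the collection needs the same lock as the code that mutates it; one unsynchronized public method is enough to reintroduce the exception.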
[jira] [Commented] (YARN-2998) Abstract out scheduler independent PlanFollower components
[ https://issues.apache.org/jira/browse/YARN-2998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14261869#comment-14261869 ] Karthik Kambatla commented on YARN-2998: +1. Checking this in. > Abstract out scheduler independent PlanFollower components > -- > > Key: YARN-2998 > URL: https://issues.apache.org/jira/browse/YARN-2998 > Project: Hadoop YARN > Issue Type: Sub-task > Components: fairscheduler >Reporter: Anubhav Dhoot >Assignee: Anubhav Dhoot > Attachments: YARN-2998.001.patch, YARN-2998.003.patch, > yarn-2998-2.patch > > > Abstract out scheduler independent PlanFollower components into > AbstractSchedulerPlanFollower. This is a pre-requisite for YARN-2881. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-2922) Concurrent Modification Exception in LeafQueue when collecting applications
[ https://issues.apache.org/jira/browse/YARN-2922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohith updated YARN-2922: - Attachment: 0001-YARN-2922.patch > Concurrent Modification Exception in LeafQueue when collecting applications > --- > > Key: YARN-2922 > URL: https://issues.apache.org/jira/browse/YARN-2922 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager, scheduler >Affects Versions: 2.5.1 >Reporter: Jason Tufo >Assignee: Rohith > Attachments: 0001-YARN-2922.patch, 0001-YARN-2922.patch > > > java.util.ConcurrentModificationException > at > java.util.TreeMap$PrivateEntryIterator.nextEntry(TreeMap.java:1115) > at java.util.TreeMap$KeyIterator.next(TreeMap.java:1169) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.collectSchedulerApplications(LeafQueue.java:1618) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.getAppsInQueue(CapacityScheduler.java:1119) > at > org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getQueueInfo(ClientRMService.java:798) > at > org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getQueueInfo(ApplicationClientProtocolPBServiceImpl.java:234) > at > org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:333) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928) > at > org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013) > at > org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:415) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007) -- This message was 
sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2560) Diagnostics is delayed to passed to ApplicationReport
[ https://issues.apache.org/jira/browse/YARN-2560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14261855#comment-14261855 ] Jeff Zhang commented on YARN-2560: -- I checked RMAppImpl and RMAppAttemptImpl and found that FinalApplicationStatus is retrieved from the currentAttempt, while diagnostics is retrieved from the RMApp, which is not updated until it gets the AttemptFinishedEvent. This is the root cause that sometimes makes the FinalApplicationStatus and diagnostics inconsistent. > Diagnostics is delayed to passed to ApplicationReport > - > > Key: YARN-2560 > URL: https://issues.apache.org/jira/browse/YARN-2560 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Jeff Zhang > > The diagnostics of an Application may be delayed in being passed to the ApplicationReport. > Here's one example where ApplicationStatus has changed to FAILED but the > diagnostics is still empty. The next call of getApplicationReport then > gets the diagnostics. > {code} > while(true) { > appReport = yarnClient.getApplicationReport(appId); > Thread.sleep(1000); > LOG.info("AppStatus:" + appReport.getFinalApplicationStatus()); > LOG.info("Diagnostics:" + appReport.getDiagnostics()); > > } > {code} > *Output:* > {code} > AppStatus:FAILED > Diagnostics: // empty > // get diagnostics for the next getApplicationReport > AppStatus:FAILED > Diagnostics: // diagnostics info here > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
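The inconsistency described in this comment, where two fields of one report are read from two sources that are updated at different times, can be modelled in a few lines. This is a toy model under stated assumptions: `ApplicationReportSketch`, `attemptFailed`, and `handleAttemptFinished` are hypothetical names, and the real RMAppImpl and RMAppAttemptImpl state machines are far more involved.

```java
// Toy model: status lives on the attempt, diagnostics on the app.
final class ApplicationReportSketch {
  private String attemptFinalStatus = "UNDEFINED"; // owned by the current attempt
  private String appDiagnostics = "";              // owned by the app, updated later

  // The attempt records its final status as soon as it fails.
  synchronized void attemptFailed() {
    attemptFinalStatus = "FAILED";
  }

  // The app copies the diagnostics only when it handles the attempt-finished event.
  synchronized void handleAttemptFinished(String diagnostics) {
    appDiagnostics = diagnostics;
  }

  // A report mixes both sources, so it can be built between the two updates.
  synchronized String[] getReport() {
    return new String[] {attemptFinalStatus, appDiagnostics};
  }

  public static void main(String[] args) {
    ApplicationReportSketch app = new ApplicationReportSketch();
    app.attemptFailed();
    String[] first = app.getReport();
    System.out.println("AppStatus:" + first[0] + " Diagnostics:" + first[1]);

    app.handleAttemptFinished("container exited with a non-zero code");
    String[] second = app.getReport();
    System.out.println("AppStatus:" + second[0] + " Diagnostics:" + second[1]);
  }
}
```

In this model the first report already says FAILED while the diagnostics string is still empty, and the second report carries both, which mirrors the two outputs quoted in the issue description.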
[jira] [Commented] (YARN-2998) Abstract out scheduler independent PlanFollower components
[ https://issues.apache.org/jira/browse/YARN-2998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14261833#comment-14261833 ] Hadoop QA commented on YARN-2998: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12689607/YARN-2998.003.patch against trunk revision e2351c7. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 3 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:red}-1 findbugs{color}. The patch appears to introduce 13 new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-tools/hadoop-sls hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/6221//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-YARN-Build/6221//artifact/patchprocess/newPatchFindbugsWarningshadoop-sls.html Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6221//console This message is automatically generated. 
> Abstract out scheduler independent PlanFollower components > -- > > Key: YARN-2998 > URL: https://issues.apache.org/jira/browse/YARN-2998 > Project: Hadoop YARN > Issue Type: Sub-task > Components: fairscheduler >Reporter: Anubhav Dhoot >Assignee: Anubhav Dhoot > Attachments: YARN-2998.001.patch, YARN-2998.003.patch, > yarn-2998-2.patch > > > Abstract out scheduler independent PlanFollower components into > AbstractSchedulerPlanFollower. This is a pre-requisite for YARN-2881. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2997) NM keeps sending finished containers to RM until app is finished
[ https://issues.apache.org/jira/browse/YARN-2997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14261825#comment-14261825 ] Chengbing Liu commented on YARN-2997: - [~jianhe] The containers are not removed in this patch, they are just not reported to RM when the following conditions are met: * The application is not finished * The container was completed and was already in {{recentlyStoppedContainers}} * It is a normal heartbeat with RM, not after RM restart Note that the container is not removed from the NM context. In a resync with RM, these completed applications will still be reported for work-preserving recovery. > NM keeps sending finished containers to RM until app is finished > > > Key: YARN-2997 > URL: https://issues.apache.org/jira/browse/YARN-2997 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Affects Versions: 2.6.0 >Reporter: Chengbing Liu > Attachments: YARN-2997.patch > > > We have seen in RM log a lot of > {quote} > INFO > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: > Null container completed... > {quote} > It is caused by NM sending completed containers repeatedly until the app is > finished. On the RM side, the container is already released, hence > {{getRMContainer}} returns null. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
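The three conditions listed in this comment combine into a single predicate. The sketch below is only an illustration of that logic: `ContainerReportFilterSketch`, `shouldSkipReport`, and the parameter names are hypothetical, and the real NodeManager heartbeat code in YARN-2997.patch may be organized differently.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical filter deciding whether a completed container is reported again.
final class ContainerReportFilterSketch {
  private final Set<String> recentlyStoppedContainers = new HashSet<>();

  // Skip only when all three conditions from the comment hold.
  static boolean shouldSkipReport(boolean appFinished,
                                  boolean alreadyInRecentlyStopped,
                                  boolean isResyncAfterRmRestart) {
    return !appFinished && alreadyInRecentlyStopped && !isResyncAfterRmRestart;
  }

  // Returns true if the completed container should be included in this heartbeat.
  boolean reportCompleted(String containerId, boolean appFinished, boolean isResync) {
    // Set#add returns false when the container was already recorded as stopped.
    boolean alreadyStopped = !recentlyStoppedContainers.add(containerId);
    return !shouldSkipReport(appFinished, alreadyStopped, isResync);
  }

  public static void main(String[] args) {
    ContainerReportFilterSketch nm = new ContainerReportFilterSketch();
    System.out.println(nm.reportCompleted("container_1", false, false)); // first heartbeat
    System.out.println(nm.reportCompleted("container_1", false, false)); // repeat heartbeat
    System.out.println(nm.reportCompleted("container_1", false, true));  // resync with RM
  }
}
```

Note that nothing is removed from the tracked set in this sketch, matching the comment's point that the container stays in the NM context and is still reported on a resync.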
[jira] [Commented] (YARN-2999) Compilation error in AllocationConfiguration.java in java1.7 while running tests
[ https://issues.apache.org/jira/browse/YARN-2999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14261814#comment-14261814 ] Hadoop QA commented on YARN-2999: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12689606/0001-YARN-2999.patch against trunk revision b7442bf. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/6220//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6220//console This message is automatically generated. 
> Compilation error in AllocationConfiguration.java in java1.7 while running > tests > > > Key: YARN-2999 > URL: https://issues.apache.org/jira/browse/YARN-2999 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Reporter: Rohith >Assignee: Rohith > Attachments: 0001-YARN-2999.patch > > > In AllocationConfiguration, in the below object creation, generic type must > be specified as instance variable,otherwise java1.7 lead compilation error > while running tests for RM and NM > {{reservableQueues = new HashSet<>();}} > Report : > {code} > java.lang.Error: Unresolved compilation problem: > '<>' operator is not allowed for source level below 1.7 > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AllocationConfiguration.<init>(AllocationConfiguration.java:150) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.initScheduler(FairScheduler.java:1276) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.serviceInit(FairScheduler.java:1320) > at > org.apache.hadoop.service.AbstractService.init(AbstractService.java:163) > at > org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceInit(ResourceManager.java:559) > at > org.apache.hadoop.service.AbstractService.init(AbstractService.java:163) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.createAndInitActiveServices(ResourceManager.java:985) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceInit(ResourceManager.java:251) > at > org.apache.hadoop.service.AbstractService.init(AbstractService.java:163) > at > org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart$TestSecurityMockRM.init(TestRMRestart.java:2027) > at > org.apache.hadoop.yarn.server.resourcemanager.MockRM.<init>(MockRM.java:108) > at > 
org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart$TestSecurityMockRM.<init>(TestRMRestart.java:2020) > at > org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart.testAppAttemptTokensRestoredOnRMRestart(TestRMRestart.java:1199) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-2881) Implement PlanFollower for FairScheduler
[ https://issues.apache.org/jira/browse/YARN-2881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anubhav Dhoot updated YARN-2881: Attachment: YARN-2881.004.patch Moved code factored from YARN-2998 > Implement PlanFollower for FairScheduler > > > Key: YARN-2881 > URL: https://issues.apache.org/jira/browse/YARN-2881 > Project: Hadoop YARN > Issue Type: Sub-task > Components: fairscheduler >Reporter: Anubhav Dhoot >Assignee: Anubhav Dhoot > Attachments: YARN-2881.001.patch, YARN-2881.002.patch, > YARN-2881.002.patch, YARN-2881.003.patch, YARN-2881.004.patch, > YARN-2881.prelim.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2997) NM keeps sending finished containers to RM until app is finished
[ https://issues.apache.org/jira/browse/YARN-2997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14261797#comment-14261797 ] Jian He commented on YARN-2997: --- bq. The uploaded patch will let the normal heartbeat The intention was to let NM remove containers from its context only after RM acks it has received these containers. More context in YARN-1372 > NM keeps sending finished containers to RM until app is finished > > > Key: YARN-2997 > URL: https://issues.apache.org/jira/browse/YARN-2997 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Affects Versions: 2.6.0 >Reporter: Chengbing Liu > Attachments: YARN-2997.patch > > > We have seen in RM log a lot of > {quote} > INFO > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: > Null container completed... > {quote} > It is caused by NM sending completed containers repeatedly until the app is > finished. On the RM side, the container is already released, hence > {{getRMContainer}} returns null. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2997) NM keeps sending finished containers to RM until app is finished
[ https://issues.apache.org/jira/browse/YARN-2997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14261792#comment-14261792 ] Chengbing Liu commented on YARN-2997: - [~kasha] [~jianhe] Thanks for your review. I think NM will call {{getNMContainerStatuses()}} instead of {{getContainerStatuses()}} upon receiving RESYNC from restarted RM. {{getNMContainerStatuses()}} remains unchanged and still reports all the completed containers for non-completed apps. The uploaded patch will let the normal heartbeat (not after receiving RESYNC) send only useful container status information to RM. So I guess the work-preserving RM restart is not affected. > NM keeps sending finished containers to RM until app is finished > > > Key: YARN-2997 > URL: https://issues.apache.org/jira/browse/YARN-2997 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Affects Versions: 2.6.0 >Reporter: Chengbing Liu > Attachments: YARN-2997.patch > > > We have seen in RM log a lot of > {quote} > INFO > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: > Null container completed... > {quote} > It is caused by NM sending completed containers repeatedly until the app is > finished. On the RM side, the container is already released, hence > {{getRMContainer}} returns null. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2998) Abstract out scheduler independent PlanFollower components
[ https://issues.apache.org/jira/browse/YARN-2998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14261781#comment-14261781 ] Hadoop QA commented on YARN-2998: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12689600/yarn-2998-2.patch against trunk revision 6621c35. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 3 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:red}-1 findbugs{color}. The patch appears to introduce 13 new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-tools/hadoop-sls hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/6219//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-YARN-Build/6219//artifact/patchprocess/newPatchFindbugsWarningshadoop-sls.html Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6219//console This message is automatically generated. 
> Abstract out scheduler independent PlanFollower components > -- > > Key: YARN-2998 > URL: https://issues.apache.org/jira/browse/YARN-2998 > Project: Hadoop YARN > Issue Type: Sub-task > Components: fairscheduler >Reporter: Anubhav Dhoot >Assignee: Anubhav Dhoot > Attachments: YARN-2998.001.patch, YARN-2998.003.patch, > yarn-2998-2.patch > > > Abstract out scheduler independent PlanFollower components into > AbstractSchedulerPlanFollower. This is a pre-requisite for YARN-2881. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2999) Compilation error in AllocationConfiguration.java in java1.7 while running tests
[ https://issues.apache.org/jira/browse/YARN-2999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14261775#comment-14261775 ] Jian He commented on YARN-2999: --- +1 > Compilation error in AllocationConfiguration.java in java1.7 while running > tests > > > Key: YARN-2999 > URL: https://issues.apache.org/jira/browse/YARN-2999 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Reporter: Rohith >Assignee: Rohith > Attachments: 0001-YARN-2999.patch > > > In AllocationConfiguration, in the below object creation, generic type must > be specified as instance variable,otherwise java1.7 lead compilation error > while running tests for RM and NM > {{reservableQueues = new HashSet<>();}} > Report : > {code} > java.lang.Error: Unresolved compilation problem: > '<>' operator is not allowed for source level below 1.7 > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AllocationConfiguration.<init>(AllocationConfiguration.java:150) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.initScheduler(FairScheduler.java:1276) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.serviceInit(FairScheduler.java:1320) > at > org.apache.hadoop.service.AbstractService.init(AbstractService.java:163) > at > org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceInit(ResourceManager.java:559) > at > org.apache.hadoop.service.AbstractService.init(AbstractService.java:163) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.createAndInitActiveServices(ResourceManager.java:985) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceInit(ResourceManager.java:251) > at > org.apache.hadoop.service.AbstractService.init(AbstractService.java:163) > at > 
org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart$TestSecurityMockRM.init(TestRMRestart.java:2027) > at > org.apache.hadoop.yarn.server.resourcemanager.MockRM.<init>(MockRM.java:108) > at > org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart$TestSecurityMockRM.<init>(TestRMRestart.java:2020) > at > org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart.testAppAttemptTokensRestoredOnRMRestart(TestRMRestart.java:1199) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2987) ClientRMService#getQueueInfo doesn't check app ACLs
[ https://issues.apache.org/jira/browse/YARN-2987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14261769#comment-14261769 ] Hudson commented on YARN-2987: -- FAILURE: Integrated in Hadoop-trunk-Commit #6798 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/6798/]) YARN-2987. Fixed ClientRMService#getQueueInfo to check against queue and app ACLs. Contributed by Varun Saxena (jianhe: rev e2351c7ae24cea9b217af4174512d279c55e8efd) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestClientRMService.java * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ClientRMService.java > ClientRMService#getQueueInfo doesn't check app ACLs > --- > > Key: YARN-2987 > URL: https://issues.apache.org/jira/browse/YARN-2987 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Jian He >Assignee: Varun Saxena > Fix For: 2.7.0 > > Attachments: YARN-2987.001.patch, YARN-2987.002.patch > > > ClientRMService#getQueueInfo can return a list of applications belonging to > the queue, but doesn't actually check if the user has the permission to view > the applications. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
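The gap described in the issue above (returning a queue's applications without checking whether the caller may view them) could be closed with a filter along these lines. This is a sketch under assumptions, not the committed fix: {{QueueInfoAclSketch}}, {{visibleApps}}, and the map-based ACL representation are hypothetical, and the real change also checks queue ACLs, which this sketch leaves out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration of filtering a queue's applications by a view ACL.
// The ACL model (appId -> set of users allowed to view) is an assumption made
// for this sketch; it is not how ClientRMService represents ACLs.
class QueueInfoAclSketch {

    /** Returns only the applications that the caller is allowed to view. */
    static List<String> visibleApps(List<String> queueApps,
                                    Map<String, Set<String>> viewAcls,
                                    String caller) {
        List<String> visible = new ArrayList<>();
        for (String app : queueApps) {
            Set<String> allowed = viewAcls.get(app);
            // An application without an ACL entry is treated as not viewable.
            if (allowed != null && allowed.contains(caller)) {
                visible.add(app);
            }
        }
        return visible;
    }
}
```

Treating a missing ACL entry as "deny" is a design choice of the sketch; a real implementation would follow the cluster's configured defaults.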
[jira] [Updated] (YARN-2998) Abstract out scheduler independent PlanFollower components
[ https://issues.apache.org/jira/browse/YARN-2998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anubhav Dhoot updated YARN-2998: Attachment: YARN-2998.003.patch Addressed feedback > Abstract out scheduler independent PlanFollower components > -- > > Key: YARN-2998 > URL: https://issues.apache.org/jira/browse/YARN-2998 > Project: Hadoop YARN > Issue Type: Sub-task > Components: fairscheduler >Reporter: Anubhav Dhoot >Assignee: Anubhav Dhoot > Attachments: YARN-2998.001.patch, YARN-2998.003.patch, > yarn-2998-2.patch > > > Abstract out scheduler independent PlanFollower components into > AbstractSchedulerPlanFollower. This is a pre-requisite for YARN-2881. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2922) Concurrent Modification Exception in LeafQueue when collecting applications
[ https://issues.apache.org/jira/browse/YARN-2922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14261758#comment-14261758 ] Jian He commented on YARN-2922: --- looks like {{LeafQueue#getTotalResourcePending}} should be changed too. does it make sense to synchronize on the object itself ? > Concurrent Modification Exception in LeafQueue when collecting applications > --- > > Key: YARN-2922 > URL: https://issues.apache.org/jira/browse/YARN-2922 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager, scheduler >Affects Versions: 2.5.1 >Reporter: Jason Tufo >Assignee: Rohith > Attachments: 0001-YARN-2922.patch > > > java.util.ConcurrentModificationException > at > java.util.TreeMap$PrivateEntryIterator.nextEntry(TreeMap.java:1115) > at java.util.TreeMap$KeyIterator.next(TreeMap.java:1169) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.collectSchedulerApplications(LeafQueue.java:1618) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.getAppsInQueue(CapacityScheduler.java:1119) > at > org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getQueueInfo(ClientRMService.java:798) > at > org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getQueueInfo(ApplicationClientProtocolPBServiceImpl.java:234) > at > org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:333) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928) > at > org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013) > at > org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:415) > at > 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
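The suggestion above to synchronize on the object itself can be illustrated with a small sketch. It assumes that the collection being iterated is a {{TreeSet}} guarded by the queue's own monitor; {{LeafQueueSketch}} and its members are simplified, hypothetical stand-ins rather than the real {{LeafQueue}}.

```java
import java.util.Collection;
import java.util.Set;
import java.util.TreeSet;

// Simplified stand-in for a queue whose application set is read by RPC threads
// while the scheduler thread mutates it. All names are illustrative.
class LeafQueueSketch {

    private final Set<String> activeApplications = new TreeSet<>();

    // Writers take the queue's monitor.
    synchronized void submit(String appId) {
        activeApplications.add(appId);
    }

    synchronized void finish(String appId) {
        activeApplications.remove(appId);
    }

    // The reader takes the same monitor, so no writer can modify the TreeSet
    // while it is being iterated; this is what prevents the
    // ConcurrentModificationException seen in the stack trace.
    synchronized void collectSchedulerApplications(Collection<String> apps) {
        for (String app : activeApplications) {
            apps.add(app);
        }
    }
}
```

Any other method that iterates the same collection would need the same treatment, which is the point of the remark about {{LeafQueue#getTotalResourcePending}}.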
[jira] [Updated] (YARN-2999) Compilation error in AllocationConfiguration.java in java1.7 while running tests
[ https://issues.apache.org/jira/browse/YARN-2999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohith updated YARN-2999: - Attachment: 0001-YARN-2999.patch Straightforward patch to fix this issue. > Compilation error in AllocationConfiguration.java in java1.7 while running > tests > > > Key: YARN-2999 > URL: https://issues.apache.org/jira/browse/YARN-2999 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Reporter: Rohith >Assignee: Rohith > Attachments: 0001-YARN-2999.patch > > > In AllocationConfiguration, in the below object creation, generic type must > be specified as instance variable,otherwise java1.7 lead compilation error > while running tests for RM and NM > {{reservableQueues = new HashSet<>();}} > Report : > {code} > java.lang.Error: Unresolved compilation problem: > '<>' operator is not allowed for source level below 1.7 > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AllocationConfiguration.<init>(AllocationConfiguration.java:150) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.initScheduler(FairScheduler.java:1276) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.serviceInit(FairScheduler.java:1320) > at > org.apache.hadoop.service.AbstractService.init(AbstractService.java:163) > at > org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceInit(ResourceManager.java:559) > at > org.apache.hadoop.service.AbstractService.init(AbstractService.java:163) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.createAndInitActiveServices(ResourceManager.java:985) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceInit(ResourceManager.java:251) > at > org.apache.hadoop.service.AbstractService.init(AbstractService.java:163) > at > 
org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart$TestSecurityMockRM.init(TestRMRestart.java:2027) > at > org.apache.hadoop.yarn.server.resourcemanager.MockRM.<init>(MockRM.java:108) > at > org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart$TestSecurityMockRM.<init>(TestRMRestart.java:2020) > at > org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart.testAppAttemptTokensRestoredOnRMRestart(TestRMRestart.java:1199) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2991) TestRMRestart.testDecomissionedNMsMetricsOnRMRestart intermittently fails on trunk
[ https://issues.apache.org/jira/browse/YARN-2991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14261752#comment-14261752 ] Rohith commented on YARN-2991: -- I split the verification into 2 levels: # Those test cases which use DrainDispatcher: tests completed, all passed. # All test cases in yarn/mr: a few tests failed, which is not really because of this patch. NodeManager Tests Report {code} Results : Failed tests: TestDockerContainerExecutorWithMocks.testContainerLaunch:209 expected:<0> but was:<127> Tests run: 279, Failures: 1, Errors: 0, Skipped: 6 {code} ResourceManager Tests {code} Result : Failed Tests : // Many test related to fair scheduler are there. YARN-2999 Tests run: 986, Failures: 0, Errors: 178, Skipped: 1 {code} > TestRMRestart.testDecomissionedNMsMetricsOnRMRestart intermittently fails on > trunk > -- > > Key: YARN-2991 > URL: https://issues.apache.org/jira/browse/YARN-2991 > Project: Hadoop YARN > Issue Type: Test >Reporter: Zhijie Shen >Assignee: Rohith >Priority: Blocker > Attachments: 0001-YARN-2991.patch, 0002-YARN-2991.patch > > > {code} > Error Message > test timed out after 6 milliseconds > Stacktrace > java.lang.Exception: test timed out after 6 milliseconds > at java.lang.Object.wait(Native Method) > at java.lang.Thread.join(Thread.java:1281) > at java.lang.Thread.join(Thread.java:1355) > at > org.apache.hadoop.yarn.event.AsyncDispatcher.serviceStop(AsyncDispatcher.java:150) > at > org.apache.hadoop.service.AbstractService.stop(AbstractService.java:221) > at > org.apache.hadoop.service.ServiceOperations.stop(ServiceOperations.java:52) > at > org.apache.hadoop.service.ServiceOperations.stopQuietly(ServiceOperations.java:80) > at > org.apache.hadoop.service.CompositeService.stop(CompositeService.java:157) > at > org.apache.hadoop.service.CompositeService.serviceStop(CompositeService.java:131) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceStop(ResourceManager.java:1106) 
> at > org.apache.hadoop.service.AbstractService.stop(AbstractService.java:221) > at > org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart.testDecomissionedNMsMetricsOnRMRestart(TestRMRestart.java:1873) > {code} > It happened twice this month: > https://builds.apache.org/job/PreCommit-YARN-Build/6096/ > https://builds.apache.org/job/PreCommit-YARN-Build/6182/ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (YARN-2999) Compilation error in AllocationConfiguration.java in java1.7 while running tests
Rohith created YARN-2999: Summary: Compilation error in AllocationConfiguration.java in java1.7 while running tests Key: YARN-2999 URL: https://issues.apache.org/jira/browse/YARN-2999 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Reporter: Rohith Assignee: Rohith In AllocationConfiguration, in the below object creation, the generic type must be specified explicitly, as it is for the instance variable; otherwise java1.7 reports a compilation error while running tests for RM and NM {{reservableQueues = new HashSet<>();}} Report : {code} java.lang.Error: Unresolved compilation problem: '<>' operator is not allowed for source level below 1.7 at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AllocationConfiguration.<init>(AllocationConfiguration.java:150) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.initScheduler(FairScheduler.java:1276) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.serviceInit(FairScheduler.java:1320) at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163) at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceInit(ResourceManager.java:559) at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.createAndInitActiveServices(ResourceManager.java:985) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceInit(ResourceManager.java:251) at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163) at org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart$TestSecurityMockRM.init(TestRMRestart.java:2027) at org.apache.hadoop.yarn.server.resourcemanager.MockRM.<init>(MockRM.java:108) at org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart$TestSecurityMockRM.<init>(TestRMRestart.java:2020) at 
org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart.testAppAttemptTokensRestoredOnRMRestart(TestRMRestart.java:1199) {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
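The kind of change the report above asks for might look like the sketch below: spelling out the type argument instead of using the diamond operator, so that the statement also compiles at a source level below 1.7. The field name {{reservableQueues}} comes from the report; the element type {{String}}, the class {{ReservableQueuesSketch}}, and its helper methods are assumptions made only for illustration, not the contents of the attached patch.

```java
import java.util.HashSet;
import java.util.Set;

// Illustrative stand-in for the affected class; only the field name is taken
// from the report, everything else is hypothetical.
class ReservableQueuesSketch {

    // Element type String is an assumption for this sketch.
    private final Set<String> reservableQueues;

    ReservableQueuesSketch() {
        // Explicit type argument: accepted at source level 1.5 and later.
        // The diamond form `new HashSet<>()` needs source level 1.7 or later,
        // which is what the "Unresolved compilation problem" message is about.
        reservableQueues = new HashSet<String>();
    }

    boolean markReservable(String queue) {
        return reservableQueues.add(queue);
    }

    boolean isReservable(String queue) {
        return reservableQueues.contains(queue);
    }
}
```

The alternative would be to raise the compiler source level of the test build to 1.7, leaving the diamond operator in place; the report does not say which route the patch takes.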
[jira] [Commented] (YARN-2998) Abstract out scheduler independent PlanFollower components
[ https://issues.apache.org/jira/browse/YARN-2998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14261750#comment-14261750 ] Hadoop QA commented on YARN-2998: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12689587/YARN-2998.001.patch against trunk revision 6621c35. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 3 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:red}-1 findbugs{color}. The patch appears to introduce 13 new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The following test timeouts occurred in hadoop-tools/hadoop-sls hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: org.apache.hadoop.yarn.server.resourcemanager.TestResourceTrackerService Test results: https://builds.apache.org/job/PreCommit-YARN-Build/6218//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-YARN-Build/6218//artifact/patchprocess/newPatchFindbugsWarningshadoop-sls.html Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6218//console This message is automatically generated. 
> Abstract out scheduler independent PlanFollower components > -- > > Key: YARN-2998 > URL: https://issues.apache.org/jira/browse/YARN-2998 > Project: Hadoop YARN > Issue Type: Sub-task > Components: fairscheduler >Reporter: Anubhav Dhoot >Assignee: Anubhav Dhoot > Attachments: YARN-2998.001.patch, yarn-2998-2.patch > > > Abstract out scheduler independent PlanFollower components into > AbstractSchedulerPlanFollower. This is a pre-requisite for YARN-2881. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2493) API changes for users
[ https://issues.apache.org/jira/browse/YARN-2493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14261747#comment-14261747 ] Hudson commented on YARN-2493: -- FAILURE: Integrated in Hadoop-trunk-Commit #6797 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/6797/]) YARN-2493. Added node-labels page on RM web UI. Contributed by Wangda Tan (jianhe: rev b7442bf92eb6e1ae64a0f9644ffc2eee4597aad5) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/NodeLabelsPage.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/webapp/YarnWebParams.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/NavBlock.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/MockNodes.java * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/nodelabels/RMNodeLabelsManager.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/RMWebApp.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/nodelabels/TestRMNodeLabelsManager.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestWorkPreservingRMRestart.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/nodelabels/CommonNodeLabelsManager.java * 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/TestNodesPage.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/nodelabels/NodeLabel.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/NodesPage.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/RmController.java > API changes for users > - > > Key: YARN-2493 > URL: https://issues.apache.org/jira/browse/YARN-2493 > Project: Hadoop YARN > Issue Type: Sub-task > Components: api >Reporter: Wangda Tan >Assignee: Wangda Tan > Fix For: 2.6.0 > > Attachments: YARN-2493-20141008.1.patch, YARN-2493.patch, > YARN-2493.patch, YARN-2493.patch, YARN-2493.patch, YARN-2493.patch > > > This JIRA includes API changes for users of YARN-796, like changes in > {{ResourceRequest}}, {{ApplicationSubmissionContext}}, etc. This is a common > part of YARN-796. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2492) (Clone of YARN-796) Allow for (admin) labels on nodes and resource-requests
[ https://issues.apache.org/jira/browse/YARN-2492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14261746#comment-14261746 ] Hudson commented on YARN-2492: -- FAILURE: Integrated in Hadoop-trunk-Commit #6797 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/6797/]) Revert "YARN-2492(wrong jira number). Added node-labels page on RM web UI. Contributed by Wangda Tan" (jianhe: rev 746ad6e989683fe1dfc61a611702c9be7b5cd264) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/RMWebApp.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/NavBlock.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/TestNodesPage.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/MockNodes.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/RmController.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/nodelabels/RMNodeLabelsManager.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/NodeLabelsPage.java * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/nodelabels/NodeLabel.java * 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/nodelabels/TestRMNodeLabelsManager.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestWorkPreservingRMRestart.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/webapp/YarnWebParams.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/nodelabels/CommonNodeLabelsManager.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/NodesPage.java > (Clone of YARN-796) Allow for (admin) labels on nodes and resource-requests > > > Key: YARN-2492 > URL: https://issues.apache.org/jira/browse/YARN-2492 > Project: Hadoop YARN > Issue Type: Task > Components: api, client, resourcemanager >Reporter: Wangda Tan > > Since YARN-796 is a sub JIRA of YARN-397, this JIRA is used to create and > track sub tasks and attach split patches for YARN-796. > *Let's still keep over-all discussions on YARN-796.* -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-2943) Add a node-labels page in RM web UI
[ https://issues.apache.org/jira/browse/YARN-2943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jian He updated YARN-2943: -- Target Version/s: 2.7.0 (was: 2.7.0, 2.6.1) > Add a node-labels page in RM web UI > --- > > Key: YARN-2943 > URL: https://issues.apache.org/jira/browse/YARN-2943 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Reporter: Wangda Tan >Assignee: Wangda Tan > Fix For: 2.7.0 > > Attachments: Node-labels-page.png, Nodes-page-with-label-filter.png, > YARN-2943.1.patch, YARN-2943.2.patch, YARN-2943.3.patch, YARN-2943.4.patch, > YARN-2943.5.patch, YARN-2943.6.patch > > > Now we have node labels in the system, but there's no very convenient way to > get information like "how many active NM(s) are assigned to a given label?", "how > much total resource is there for a given label?", "For a given label, which queues can > access it?", etc. > It would be better to add a node-labels page in the RM web UI, so that users/admins can > have a centralized view of such information. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2922) Concurrent Modification Exception in LeafQueue when collecting applications
[ https://issues.apache.org/jira/browse/YARN-2922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14261745#comment-14261745 ] Wangda Tan commented on YARN-2922: -- Patch LGTM also, +1. Thanks, Wangda > Concurrent Modification Exception in LeafQueue when collecting applications > --- > > Key: YARN-2922 > URL: https://issues.apache.org/jira/browse/YARN-2922 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager, scheduler >Affects Versions: 2.5.1 >Reporter: Jason Tufo >Assignee: Rohith > Attachments: 0001-YARN-2922.patch > > > java.util.ConcurrentModificationException > at > java.util.TreeMap$PrivateEntryIterator.nextEntry(TreeMap.java:1115) > at java.util.TreeMap$KeyIterator.next(TreeMap.java:1169) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.collectSchedulerApplications(LeafQueue.java:1618) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.getAppsInQueue(CapacityScheduler.java:1119) > at > org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getQueueInfo(ClientRMService.java:798) > at > org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getQueueInfo(ApplicationClientProtocolPBServiceImpl.java:234) > at > org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:333) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928) > at > org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013) > at > org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:415) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614) > at 
org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
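The stack trace above shows a TreeMap key iterator failing inside LeafQueue.collectSchedulerApplications, which is the usual symptom of a map being modified while another caller iterates over it. The following is only a minimal sketch of that failure mode and of one common remedy (taking the same lock for reads and writes); it is not necessarily what 0001-YARN-2922.patch does. QueueSketch and all of its methods are hypothetical names invented for illustration, not LeafQueue code.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.ConcurrentModificationException;
import java.util.List;
import java.util.TreeMap;

/** Hypothetical stand-in for a queue that keeps its applications in a sorted map. */
class QueueSketch {
    // Assumed shape for illustration only: application id -> user name.
    private final TreeMap<String, String> apps = new TreeMap<>();

    synchronized void submit(String appId, String user) {
        apps.put(appId, user);
    }

    /**
     * Unsafe variant: iterates without holding the lock. The callback stands in
     * for another thread that mutates the map in the middle of the iteration.
     */
    void collectUnsafe(Collection<String> out, Runnable interleaved) {
        for (String appId : apps.keySet()) {
            out.add(appId);
            interleaved.run();
        }
    }

    /** Safe variant: holds the same lock as the writers while copying the keys. */
    synchronized void collectSafe(Collection<String> out) {
        out.addAll(apps.keySet());
    }

    public static void main(String[] args) {
        QueueSketch queue = new QueueSketch();
        queue.submit("app_1", "alice");
        queue.submit("app_2", "bob");
        try {
            queue.collectUnsafe(new ArrayList<>(), () -> queue.submit("app_3", "carol"));
        } catch (ConcurrentModificationException e) {
            System.out.println("unsafe collection failed: " + e);
        }
        List<String> collected = new ArrayList<>();
        queue.collectSafe(collected);
        System.out.println("safe collection: " + collected);
    }
}
```

Copying the keys under the lock keeps the critical section short; a concurrent sorted map would be another option, at the cost of weaker iteration guarantees.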
[jira] [Commented] (YARN-2492) (Clone of YARN-796) Allow for (admin) labels on nodes and resource-requests
[ https://issues.apache.org/jira/browse/YARN-2492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14261726#comment-14261726 ] Hudson commented on YARN-2492: -- FAILURE: Integrated in Hadoop-trunk-Commit #6796 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/6796/]) YARN-2492. Added node-labels page on RM web UI. Contributed by Wangda Tan (jianhe: rev 5f57b904f550515693d93a2959e663b0d0260696) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/nodelabels/TestRMNodeLabelsManager.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/nodelabels/NodeLabel.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/nodelabels/RMNodeLabelsManager.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/nodelabels/CommonNodeLabelsManager.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/webapp/YarnWebParams.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/NavBlock.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/TestNodesPage.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/NodeLabelsPage.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/RMWebApp.java * 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/RmController.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestWorkPreservingRMRestart.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/MockNodes.java * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/NodesPage.java > (Clone of YARN-796) Allow for (admin) labels on nodes and resource-requests > > > Key: YARN-2492 > URL: https://issues.apache.org/jira/browse/YARN-2492 > Project: Hadoop YARN > Issue Type: Task > Components: api, client, resourcemanager >Reporter: Wangda Tan > > Since YARN-796 is a sub JIRA of YARN-397, this JIRA is used to create and > track sub tasks and attach split patches for YARN-796. > *Let's still keep over-all discussions on YARN-796.* -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2998) Abstract out scheduler independent PlanFollower components
[ https://issues.apache.org/jira/browse/YARN-2998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14261720#comment-14261720 ] Karthik Kambatla commented on YARN-2998: The refactoring makes sense. A few review comments:
# AbstractSchedulerPlanFollower
## All newly added abstract methods could use a javadoc explaining what the method is supposed to do, so future followers don't have to look at other implementations.
## Rename #calculateTargetCapacity? The name is not very descriptive.
## Rename #isPlanResourcesLessThanReservations to #arePlanResourcesLessThanReservations or #isPlanResourceLessThanReservations.
# ReservationSystemTestUtil#setupFairScheduler is not used. yarn-2998-2.patch removes it.
> Abstract out scheduler independent PlanFollower components > -- > > Key: YARN-2998 > URL: https://issues.apache.org/jira/browse/YARN-2998 > Project: Hadoop YARN > Issue Type: Sub-task > Components: fairscheduler >Reporter: Anubhav Dhoot >Assignee: Anubhav Dhoot > Attachments: YARN-2998.001.patch, yarn-2998-2.patch > > > Abstract out scheduler independent PlanFollower components into > AbstractSchedulerPlanFollower. This is a pre-requisite for YARN-2881. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2932) Add entry for preemption setting to queue status screen and startup/refresh logging
[ https://issues.apache.org/jira/browse/YARN-2932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14261719#comment-14261719 ] Wangda Tan commented on YARN-2932: -- Hi [~eepayne], Thanks for working on this patch, and sorry for the late response; I was on vacation for the past few weeks. I just took a look at your patch. Some comments:
1) Since QUEUE_PREEMPTION_DISABLED is an option for CS, I suggest making it a member of CapacitySchedulerConfiguration, like {{getUserLimitFactor}}/{{setUserLimit}}, etc. This will avoid some String operations.
2) Rename {{context}} in {{AbstractCSQueue}} to a name like {{csContext}}, since we have {{rmContext}}.
3) I suggest adding a member variable like preemptable to AbstractCSQueue, instead of calling:
{code}
+  @Private
+  public boolean isPreemptable() {
+    return context.getConfiguration().isPreemptable(getQueuePath());
+  }
{code}
The implementation of CSConfiguration.isPreemptable(..) seems too complex to me. CSConfiguration should only care about the values in the configuration file; such logic should be put into AbstractCSQueue.setupQueueConfigs(...).
4) It's better to keep the web UI name (preemptable) and the configuration name (disable_preemption) consistent. I prefer "preemptable" personally.
5) {{testIsPreemptable}} should be a part of TestCapacityScheduler instead of being put into TestProportionalCapacityPreemptionPolicy.
6) In ProportionalCapacityPreemptionPolicy.cloneQueues, the preemptable field should be read from the Queue instead of from the configuration.
Please let me know your thoughts, Wangda
> Add entry for preemption setting to queue status screen and startup/refresh > logging > --- > > Key: YARN-2932 > URL: https://issues.apache.org/jira/browse/YARN-2932 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 3.0.0, 2.7.0 >Reporter: Eric Payne >Assignee: Eric Payne > Attachments: YARN-2932.v1.txt > > > YARN-2056 enables the ability to turn preemption on or off on a per-queue > level. 
This JIRA will provide the preemption status for each queue in the > {{HOST:8088/cluster/scheduler}} UI and in the RM log during startup/queue > refresh. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-2998) Abstract out scheduler independent PlanFollower components
[ https://issues.apache.org/jira/browse/YARN-2998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla updated YARN-2998: --- Attachment: yarn-2998-2.patch > Abstract out scheduler independent PlanFollower components > -- > > Key: YARN-2998 > URL: https://issues.apache.org/jira/browse/YARN-2998 > Project: Hadoop YARN > Issue Type: Sub-task > Components: fairscheduler >Reporter: Anubhav Dhoot >Assignee: Anubhav Dhoot > Attachments: YARN-2998.001.patch, yarn-2998-2.patch > > > Abstract out scheduler independent PlanFollower components into > AbstractSchedulerPlanFollower. This is a pre-requisite for YARN-2881. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2998) Abstract out scheduler independent PlanFollower components
[ https://issues.apache.org/jira/browse/YARN-2998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14261691#comment-14261691 ] Karthik Kambatla commented on YARN-2998: I got confused looking at the wrong method. The only part that doesn't belong here is ReservationSystem#setupFairScheduler. Will post a patch shortly removing that method. Could you include it in YARN-2881. Thanks. > Abstract out scheduler independent PlanFollower components > -- > > Key: YARN-2998 > URL: https://issues.apache.org/jira/browse/YARN-2998 > Project: Hadoop YARN > Issue Type: Sub-task > Components: fairscheduler >Reporter: Anubhav Dhoot >Assignee: Anubhav Dhoot > Attachments: YARN-2998.001.patch > > > Abstract out scheduler independent PlanFollower components into > AbstractSchedulerPlanFollower. This is a pre-requisite for YARN-2881. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2997) NM keeps sending finished containers to RM until app is finished
[ https://issues.apache.org/jira/browse/YARN-2997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14261682#comment-14261682 ] Jian He commented on YARN-2997: --- bq. add code to identify them as duplicates and then ignore them. I think it is now ignored. Perhaps we should clarify the logging and move it to debug level. > NM keeps sending finished containers to RM until app is finished > > > Key: YARN-2997 > URL: https://issues.apache.org/jira/browse/YARN-2997 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Affects Versions: 2.6.0 >Reporter: Chengbing Liu > Attachments: YARN-2997.patch > > > We have seen in RM log a lot of > {quote} > INFO > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: > Null container completed... > {quote} > It is caused by NM sending completed containers repeatedly until the app is > finished. On the RM side, the container is already released, hence > {{getRMContainer}} returns null. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2943) Add a node-labels page in RM web UI
[ https://issues.apache.org/jira/browse/YARN-2943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14261669#comment-14261669 ] Wangda Tan commented on YARN-2943: -- Verified in a local 3-node cluster; both the nodes page and the node-labels page work fine. > Add a node-labels page in RM web UI > --- > > Key: YARN-2943 > URL: https://issues.apache.org/jira/browse/YARN-2943 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Reporter: Wangda Tan >Assignee: Wangda Tan > Attachments: Node-labels-page.png, Nodes-page-with-label-filter.png, > YARN-2943.1.patch, YARN-2943.2.patch, YARN-2943.3.patch, YARN-2943.4.patch, > YARN-2943.5.patch, YARN-2943.6.patch > > > Now we have node labels in the system, but there's no very convenient way to > get information like "how many active NM(s) are assigned to a given label?", "how > much total resource is there for a given label?", "For a given label, which queues can > access it?", etc. > It would be better to add a node-labels page in the RM web UI, so users/admins can > have a centralized view of such information. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2998) Abstract out scheduler independent PlanFollower components
[ https://issues.apache.org/jira/browse/YARN-2998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14261647#comment-14261647 ] Karthik Kambatla commented on YARN-2998: We can leave TestFairReservationSystem and ReservationSystemTestUtil#setupFSAllocationFile for YARN-2881. > Abstract out scheduler independent PlanFollower components > -- > > Key: YARN-2998 > URL: https://issues.apache.org/jira/browse/YARN-2998 > Project: Hadoop YARN > Issue Type: Sub-task > Components: fairscheduler >Reporter: Anubhav Dhoot >Assignee: Anubhav Dhoot > Attachments: YARN-2998.001.patch > > > Abstract out scheduler independent PlanFollower components into > AbstractSchedulerPlanFollower. This is a pre-requisite for YARN-2881. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2991) TestRMRestart.testDecomissionedNMsMetricsOnRMRestart intermittently fails on trunk
[ https://issues.apache.org/jira/browse/YARN-2991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14261644#comment-14261644 ] Zhijie Shen commented on YARN-2991: --- The patch looks good. Did you see any problem when running all test cases? > TestRMRestart.testDecomissionedNMsMetricsOnRMRestart intermittently fails on > trunk > -- > > Key: YARN-2991 > URL: https://issues.apache.org/jira/browse/YARN-2991 > Project: Hadoop YARN > Issue Type: Test >Reporter: Zhijie Shen >Assignee: Rohith >Priority: Blocker > Attachments: 0001-YARN-2991.patch, 0002-YARN-2991.patch > > > {code} > Error Message > test timed out after 6 milliseconds > Stacktrace > java.lang.Exception: test timed out after 6 milliseconds > at java.lang.Object.wait(Native Method) > at java.lang.Thread.join(Thread.java:1281) > at java.lang.Thread.join(Thread.java:1355) > at > org.apache.hadoop.yarn.event.AsyncDispatcher.serviceStop(AsyncDispatcher.java:150) > at > org.apache.hadoop.service.AbstractService.stop(AbstractService.java:221) > at > org.apache.hadoop.service.ServiceOperations.stop(ServiceOperations.java:52) > at > org.apache.hadoop.service.ServiceOperations.stopQuietly(ServiceOperations.java:80) > at > org.apache.hadoop.service.CompositeService.stop(CompositeService.java:157) > at > org.apache.hadoop.service.CompositeService.serviceStop(CompositeService.java:131) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceStop(ResourceManager.java:1106) > at > org.apache.hadoop.service.AbstractService.stop(AbstractService.java:221) > at > org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart.testDecomissionedNMsMetricsOnRMRestart(TestRMRestart.java:1873) > {code} > It happened twice this months: > https://builds.apache.org/job/PreCommit-YARN-Build/6096/ > https://builds.apache.org/job/PreCommit-YARN-Build/6182/ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-2814) RM: Clean-up the handling of "fatal" events
[ https://issues.apache.org/jira/browse/YARN-2814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla updated YARN-2814: --- Attachment: yarn-2814-0.patch yarn-2814-0.patch is a preview of what I was thinking. It removes RMFatalEvent* and associated code and cleans up a few other methods in the RM. The dispatcher, I believe, can be moved to RMActiveServices. Will update the patch again post that investigation. Welcome any early feedback. > RM: Clean-up the handling of "fatal" events > --- > > Key: YARN-2814 > URL: https://issues.apache.org/jira/browse/YARN-2814 > Project: Hadoop YARN > Issue Type: Improvement > Components: resourcemanager >Affects Versions: 2.6.0 >Reporter: Karthik Kambatla >Assignee: Karthik Kambatla > Attachments: yarn-2814-0.patch > > > YARN-2579 fixes a critical issue around handling fatal events in the RM, but > does so minimally. This JIRA is to follow through that approach and do more > clean-up. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2881) Implement PlanFollower for FairScheduler
[ https://issues.apache.org/jira/browse/YARN-2881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14261630#comment-14261630 ] Anubhav Dhoot commented on YARN-2881: - Added patch that builds on top of YARN-2998 > Implement PlanFollower for FairScheduler > > > Key: YARN-2881 > URL: https://issues.apache.org/jira/browse/YARN-2881 > Project: Hadoop YARN > Issue Type: Sub-task > Components: fairscheduler >Reporter: Anubhav Dhoot >Assignee: Anubhav Dhoot > Attachments: YARN-2881.001.patch, YARN-2881.002.patch, > YARN-2881.002.patch, YARN-2881.003.patch, YARN-2881.prelim.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-2881) Implement PlanFollower for FairScheduler
[ https://issues.apache.org/jira/browse/YARN-2881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anubhav Dhoot updated YARN-2881: Attachment: YARN-2881.003.patch > Implement PlanFollower for FairScheduler > > > Key: YARN-2881 > URL: https://issues.apache.org/jira/browse/YARN-2881 > Project: Hadoop YARN > Issue Type: Sub-task > Components: fairscheduler >Reporter: Anubhav Dhoot >Assignee: Anubhav Dhoot > Attachments: YARN-2881.001.patch, YARN-2881.002.patch, > YARN-2881.002.patch, YARN-2881.003.patch, YARN-2881.prelim.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-2998) Abstract out scheduler independent PlanFollower components
[ https://issues.apache.org/jira/browse/YARN-2998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anubhav Dhoot updated YARN-2998: Attachment: YARN-2998.001.patch Creates AbstractSchedulerPlanFollower > Abstract out scheduler independent PlanFollower components > -- > > Key: YARN-2998 > URL: https://issues.apache.org/jira/browse/YARN-2998 > Project: Hadoop YARN > Issue Type: Sub-task > Components: fairscheduler >Reporter: Anubhav Dhoot >Assignee: Anubhav Dhoot > Attachments: YARN-2998.001.patch > > > Abstract out scheduler independent PlanFollower components into > AbstractSchedulerPlanFollower. This is a pre-requisite for YARN-2881. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-2998) Abstract out scheduler independent PlanFollower components
[ https://issues.apache.org/jira/browse/YARN-2998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla updated YARN-2998: --- Description: Abstract out scheduler independent PlanFollower components into AbstractSchedulerPlanFollower. This is a pre-requisite for YARN-2881. > Abstract out scheduler independent PlanFollower components > -- > > Key: YARN-2998 > URL: https://issues.apache.org/jira/browse/YARN-2998 > Project: Hadoop YARN > Issue Type: Sub-task > Components: fairscheduler >Reporter: Anubhav Dhoot >Assignee: Anubhav Dhoot > > Abstract out scheduler independent PlanFollower components into > AbstractSchedulerPlanFollower. This is a pre-requisite for YARN-2881. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-2998) Abstract out scheduler independent PlanFollower components
[ https://issues.apache.org/jira/browse/YARN-2998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla updated YARN-2998: --- Summary: Abstract out scheduler independent PlanFollower components (was: Abstract out scheduler independant PlanFollower components into AbstractSchedulerPLanFollower) > Abstract out scheduler independent PlanFollower components > -- > > Key: YARN-2998 > URL: https://issues.apache.org/jira/browse/YARN-2998 > Project: Hadoop YARN > Issue Type: Sub-task > Components: fairscheduler >Reporter: Anubhav Dhoot >Assignee: Anubhav Dhoot > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (YARN-2998) Abstract out scheduler independant PlanFollower components into AbstractSchedulerPLanFollower
Anubhav Dhoot created YARN-2998: --- Summary: Abstract out scheduler independant PlanFollower components into AbstractSchedulerPLanFollower Key: YARN-2998 URL: https://issues.apache.org/jira/browse/YARN-2998 Project: Hadoop YARN Issue Type: Sub-task Reporter: Anubhav Dhoot -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (YARN-2998) Abstract out scheduler independant PlanFollower components into AbstractSchedulerPLanFollower
[ https://issues.apache.org/jira/browse/YARN-2998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anubhav Dhoot reassigned YARN-2998: --- Assignee: Anubhav Dhoot > Abstract out scheduler independant PlanFollower components into > AbstractSchedulerPLanFollower > - > > Key: YARN-2998 > URL: https://issues.apache.org/jira/browse/YARN-2998 > Project: Hadoop YARN > Issue Type: Sub-task > Components: fairscheduler >Reporter: Anubhav Dhoot >Assignee: Anubhav Dhoot > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2217) Shared cache client side changes
[ https://issues.apache.org/jira/browse/YARN-2217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14261542#comment-14261542 ] Karthik Kambatla commented on YARN-2217: Instead of a convenience method to calculate the checksum, how about modifying {{use}} to take a Path instead of a String? Or have two {{use}} methods, one each for Path and String? We should probably add unit tests too. > Shared cache client side changes > > > Key: YARN-2217 > URL: https://issues.apache.org/jira/browse/YARN-2217 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Chris Trezzo >Assignee: Chris Trezzo > Attachments: YARN-2217-trunk-v1.patch, YARN-2217-trunk-v2.patch, > YARN-2217-trunk-v3.patch, YARN-2217-trunk-v4.patch, YARN-2217-trunk-v5.patch > > > Implement the client side changes for the shared cache. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2997) NM keeps sending finished containers to RM until app is finished
[ https://issues.apache.org/jira/browse/YARN-2997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14261526#comment-14261526 ] Karthik Kambatla commented on YARN-2997: With work-preserving restart, the NM is required to notify the RM repeatedly in case the RM goes down and loses this information. I propose we ignore the later updates, or add code to identify them as duplicates and then ignore them. > NM keeps sending finished containers to RM until app is finished > > > Key: YARN-2997 > URL: https://issues.apache.org/jira/browse/YARN-2997 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Affects Versions: 2.6.0 >Reporter: Chengbing Liu > Attachments: YARN-2997.patch > > > We have seen in RM log a lot of > {quote} > INFO > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: > Null container completed... > {quote} > It is caused by NM sending completed containers repeatedly until the app is > finished. On the RM side, the container is already released, hence > {{getRMContainer}} returns null. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
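The proposal above (treat repeated completion reports as duplicates and ignore them) can be sketched roughly as follows. This is an illustrative toy under stated assumptions, not RM or FairScheduler code: CompletedContainerTracker, its methods, and the string container ids are all hypothetical. The assumption is that the RM side forgets a container on its first completion report, so any later report for the same id finds nothing to release and can be counted quietly rather than logged at INFO.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical RM-side bookkeeping for containers that are still live. */
class CompletedContainerTracker {
    private final Set<String> liveContainers = new HashSet<>();
    private int duplicatesIgnored = 0;

    synchronized void allocated(String containerId) {
        liveContainers.add(containerId);
    }

    /**
     * Handles a completion report from a node.
     *
     * @return true if this was the first report and the container was released,
     *         false if the container was already gone and the report was ignored
     */
    synchronized boolean completed(String containerId) {
        if (liveContainers.remove(containerId)) {
            return true;
        }
        // Repeated report: nothing to release. A real implementation might log
        // this at debug level instead of INFO.
        duplicatesIgnored++;
        return false;
    }

    synchronized int duplicatesIgnored() {
        return duplicatesIgnored;
    }

    public static void main(String[] args) {
        CompletedContainerTracker tracker = new CompletedContainerTracker();
        tracker.allocated("container_1");
        System.out.println("first report handled: " + tracker.completed("container_1"));
        System.out.println("repeat report handled: " + tracker.completed("container_1"));
        System.out.println("duplicates ignored: " + tracker.duplicatesIgnored());
    }
}
```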
[jira] [Commented] (YARN-2991) TestRMRestart.testDecomissionedNMsMetricsOnRMRestart intermittently fails on trunk
[ https://issues.apache.org/jira/browse/YARN-2991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14261465#comment-14261465 ] Hadoop QA commented on YARN-2991: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12689552/0002-YARN-2991.patch against trunk revision 6621c35. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/6217//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6217//console This message is automatically generated. 
> TestRMRestart.testDecomissionedNMsMetricsOnRMRestart intermittently fails on > trunk > -- > > Key: YARN-2991 > URL: https://issues.apache.org/jira/browse/YARN-2991 > Project: Hadoop YARN > Issue Type: Test >Reporter: Zhijie Shen >Assignee: Rohith >Priority: Blocker > Attachments: 0001-YARN-2991.patch, 0002-YARN-2991.patch > > > {code} > Error Message > test timed out after 6 milliseconds > Stacktrace > java.lang.Exception: test timed out after 6 milliseconds > at java.lang.Object.wait(Native Method) > at java.lang.Thread.join(Thread.java:1281) > at java.lang.Thread.join(Thread.java:1355) > at > org.apache.hadoop.yarn.event.AsyncDispatcher.serviceStop(AsyncDispatcher.java:150) > at > org.apache.hadoop.service.AbstractService.stop(AbstractService.java:221) > at > org.apache.hadoop.service.ServiceOperations.stop(ServiceOperations.java:52) > at > org.apache.hadoop.service.ServiceOperations.stopQuietly(ServiceOperations.java:80) > at > org.apache.hadoop.service.CompositeService.stop(CompositeService.java:157) > at > org.apache.hadoop.service.CompositeService.serviceStop(CompositeService.java:131) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceStop(ResourceManager.java:1106) > at > org.apache.hadoop.service.AbstractService.stop(AbstractService.java:221) > at > org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart.testDecomissionedNMsMetricsOnRMRestart(TestRMRestart.java:1873) > {code} > It happened twice this months: > https://builds.apache.org/job/PreCommit-YARN-Build/6096/ > https://builds.apache.org/job/PreCommit-YARN-Build/6182/ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2991) TestRMRestart.testDecomissionedNMsMetricsOnRMRestart intermittently fails on trunk
[ https://issues.apache.org/jira/browse/YARN-2991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14261426#comment-14261426 ] Rohith commented on YARN-2991: -- Kindly review the updated patch. > TestRMRestart.testDecomissionedNMsMetricsOnRMRestart intermittently fails on > trunk > -- > > Key: YARN-2991 > URL: https://issues.apache.org/jira/browse/YARN-2991 > Project: Hadoop YARN > Issue Type: Test >Reporter: Zhijie Shen >Assignee: Rohith >Priority: Blocker > Attachments: 0001-YARN-2991.patch, 0002-YARN-2991.patch > > > {code} > Error Message > test timed out after 6 milliseconds > Stacktrace > java.lang.Exception: test timed out after 6 milliseconds > at java.lang.Object.wait(Native Method) > at java.lang.Thread.join(Thread.java:1281) > at java.lang.Thread.join(Thread.java:1355) > at > org.apache.hadoop.yarn.event.AsyncDispatcher.serviceStop(AsyncDispatcher.java:150) > at > org.apache.hadoop.service.AbstractService.stop(AbstractService.java:221) > at > org.apache.hadoop.service.ServiceOperations.stop(ServiceOperations.java:52) > at > org.apache.hadoop.service.ServiceOperations.stopQuietly(ServiceOperations.java:80) > at > org.apache.hadoop.service.CompositeService.stop(CompositeService.java:157) > at > org.apache.hadoop.service.CompositeService.serviceStop(CompositeService.java:131) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceStop(ResourceManager.java:1106) > at > org.apache.hadoop.service.AbstractService.stop(AbstractService.java:221) > at > org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart.testDecomissionedNMsMetricsOnRMRestart(TestRMRestart.java:1873) > {code} > It happened twice this months: > https://builds.apache.org/job/PreCommit-YARN-Build/6096/ > https://builds.apache.org/job/PreCommit-YARN-Build/6182/ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-2991) TestRMRestart.testDecomissionedNMsMetricsOnRMRestart intermittently fails on trunk
[ https://issues.apache.org/jira/browse/YARN-2991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohith updated YARN-2991: - Attachment: 0002-YARN-2991.patch > TestRMRestart.testDecomissionedNMsMetricsOnRMRestart intermittently fails on > trunk > -- > > Key: YARN-2991 > URL: https://issues.apache.org/jira/browse/YARN-2991 > Project: Hadoop YARN > Issue Type: Test >Reporter: Zhijie Shen >Assignee: Rohith >Priority: Blocker > Attachments: 0001-YARN-2991.patch, 0002-YARN-2991.patch > > > {code} > Error Message > test timed out after 6 milliseconds > Stacktrace > java.lang.Exception: test timed out after 6 milliseconds > at java.lang.Object.wait(Native Method) > at java.lang.Thread.join(Thread.java:1281) > at java.lang.Thread.join(Thread.java:1355) > at > org.apache.hadoop.yarn.event.AsyncDispatcher.serviceStop(AsyncDispatcher.java:150) > at > org.apache.hadoop.service.AbstractService.stop(AbstractService.java:221) > at > org.apache.hadoop.service.ServiceOperations.stop(ServiceOperations.java:52) > at > org.apache.hadoop.service.ServiceOperations.stopQuietly(ServiceOperations.java:80) > at > org.apache.hadoop.service.CompositeService.stop(CompositeService.java:157) > at > org.apache.hadoop.service.CompositeService.serviceStop(CompositeService.java:131) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceStop(ResourceManager.java:1106) > at > org.apache.hadoop.service.AbstractService.stop(AbstractService.java:221) > at > org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart.testDecomissionedNMsMetricsOnRMRestart(TestRMRestart.java:1873) > {code} > It happened twice this months: > https://builds.apache.org/job/PreCommit-YARN-Build/6096/ > https://builds.apache.org/job/PreCommit-YARN-Build/6182/ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2991) TestRMRestart.testDecomissionedNMsMetricsOnRMRestart intermittently fails on trunk
[ https://issues.apache.org/jira/browse/YARN-2991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14261410#comment-14261410 ] Zhijie Shen commented on YARN-2991: --- bq. I can think of having public boolean isDrained() in AsynDispatcher for getting drained status Sounds good to me. > TestRMRestart.testDecomissionedNMsMetricsOnRMRestart intermittently fails on > trunk > -- > > Key: YARN-2991 > URL: https://issues.apache.org/jira/browse/YARN-2991 > Project: Hadoop YARN > Issue Type: Test >Reporter: Zhijie Shen >Assignee: Rohith >Priority: Blocker > Attachments: 0001-YARN-2991.patch > > > {code} > Error Message > test timed out after 6 milliseconds > Stacktrace > java.lang.Exception: test timed out after 6 milliseconds > at java.lang.Object.wait(Native Method) > at java.lang.Thread.join(Thread.java:1281) > at java.lang.Thread.join(Thread.java:1355) > at > org.apache.hadoop.yarn.event.AsyncDispatcher.serviceStop(AsyncDispatcher.java:150) > at > org.apache.hadoop.service.AbstractService.stop(AbstractService.java:221) > at > org.apache.hadoop.service.ServiceOperations.stop(ServiceOperations.java:52) > at > org.apache.hadoop.service.ServiceOperations.stopQuietly(ServiceOperations.java:80) > at > org.apache.hadoop.service.CompositeService.stop(CompositeService.java:157) > at > org.apache.hadoop.service.CompositeService.serviceStop(CompositeService.java:131) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceStop(ResourceManager.java:1106) > at > org.apache.hadoop.service.AbstractService.stop(AbstractService.java:221) > at > org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart.testDecomissionedNMsMetricsOnRMRestart(TestRMRestart.java:1873) > {code} > It happened twice this months: > https://builds.apache.org/job/PreCommit-YARN-Build/6096/ > https://builds.apache.org/job/PreCommit-YARN-Build/6182/ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
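The isDrained() idea quoted above can be illustrated with a toy dispatcher. This is a rough sketch, assuming a single worker thread and a pending-event counter; DrainableDispatcher and awaitDrained are hypothetical names, and nothing here is taken from the real AsyncDispatcher or from the attached patches. The point is that a test can wait until every queued event has been handled before it stops the service, instead of racing the dispatcher thread.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical single-threaded event dispatcher that can report when it is idle. */
class DrainableDispatcher {
    private final BlockingQueue<Runnable> queue = new LinkedBlockingQueue<>();
    // Events that have been dispatched but not yet fully handled.
    private final AtomicInteger pending = new AtomicInteger();
    private final Thread worker = new Thread(this::loop, "dispatcher-sketch");

    void start() {
        worker.setDaemon(true);
        worker.start();
    }

    void dispatch(Runnable event) {
        pending.incrementAndGet(); // count first so isDrained() never reports early
        queue.add(event);
    }

    /** True only when no event is queued or currently being handled. */
    boolean isDrained() {
        return pending.get() == 0;
    }

    /** Polls isDrained() until it holds or the timeout expires. */
    void awaitDrained(long timeoutMillis) throws InterruptedException, TimeoutException {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        while (!isDrained()) {
            if (System.nanoTime() - deadline > 0) {
                throw new TimeoutException("dispatcher did not drain in time");
            }
            Thread.sleep(10);
        }
    }

    void stop() throws InterruptedException {
        worker.interrupt();
        worker.join();
    }

    private void loop() {
        try {
            while (true) {
                Runnable event = queue.take();
                try {
                    event.run();
                } finally {
                    pending.decrementAndGet();
                }
            }
        } catch (InterruptedException e) {
            // stop() was called; exit the worker thread.
        }
    }

    public static void main(String[] args) throws Exception {
        DrainableDispatcher dispatcher = new DrainableDispatcher();
        dispatcher.start();
        AtomicInteger handled = new AtomicInteger();
        for (int i = 0; i < 3; i++) {
            dispatcher.dispatch(handled::incrementAndGet);
        }
        dispatcher.awaitDrained(5000);
        System.out.println("handled " + handled.get() + " events, drained=" + dispatcher.isDrained());
        dispatcher.stop();
    }
}
```

Counting pending events, rather than checking whether the queue is empty, also covers the event that has been taken off the queue but is still being handled.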
[jira] [Commented] (YARN-2991) TestRMRestart.testDecomissionedNMsMetricsOnRMRestart intermittently fails on trunk
[ https://issues.apache.org/jira/browse/YARN-2991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14261388#comment-14261388 ] Rohith commented on YARN-2991: -- I can think of adding a public boolean isDrained() to AsyncDispatcher to expose the drained status, and applying the same logic. I will run all the tests with this change to check its impact and then upload a new patch. > TestRMRestart.testDecomissionedNMsMetricsOnRMRestart intermittently fails on > trunk > -- > > Key: YARN-2991 > URL: https://issues.apache.org/jira/browse/YARN-2991 > Project: Hadoop YARN > Issue Type: Test >Reporter: Zhijie Shen >Assignee: Rohith >Priority: Blocker > Attachments: 0001-YARN-2991.patch > > > {code} > Error Message > test timed out after 6 milliseconds > Stacktrace > java.lang.Exception: test timed out after 6 milliseconds > at java.lang.Object.wait(Native Method) > at java.lang.Thread.join(Thread.java:1281) > at java.lang.Thread.join(Thread.java:1355) > at > org.apache.hadoop.yarn.event.AsyncDispatcher.serviceStop(AsyncDispatcher.java:150) > at > org.apache.hadoop.service.AbstractService.stop(AbstractService.java:221) > at > org.apache.hadoop.service.ServiceOperations.stop(ServiceOperations.java:52) > at > org.apache.hadoop.service.ServiceOperations.stopQuietly(ServiceOperations.java:80) > at > org.apache.hadoop.service.CompositeService.stop(CompositeService.java:157) > at > org.apache.hadoop.service.CompositeService.serviceStop(CompositeService.java:131) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceStop(ResourceManager.java:1106) > at > org.apache.hadoop.service.AbstractService.stop(AbstractService.java:221) > at > org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart.testDecomissionedNMsMetricsOnRMRestart(TestRMRestart.java:1873) > {code} > It happened twice this months: > https://builds.apache.org/job/PreCommit-YARN-Build/6096/ > https://builds.apache.org/job/PreCommit-YARN-Build/6182/ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2958) RMStateStore seems to unnecessarily and wrongly store sequence number separately
[ https://issues.apache.org/jira/browse/YARN-2958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14261385#comment-14261385 ] Varun Saxena commented on YARN-2958: [~jianhe] / [~zjshen], kindly review > RMStateStore seems to unnecessarily and wrongly store sequence number > separately > > > Key: YARN-2958 > URL: https://issues.apache.org/jira/browse/YARN-2958 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Reporter: Zhijie Shen >Assignee: Varun Saxena >Priority: Blocker > Attachments: YARN-2958.001.patch, YARN-2958.002.patch, > YARN-2958.003.patch > > > It seems that RMStateStore updates last sequence number when storing or > updating each individual DT, to recover the latest sequence number when RM > restarting. > First, the current logic seems to be problematic: > {code} > public synchronized void updateRMDelegationTokenAndSequenceNumber( > RMDelegationTokenIdentifier rmDTIdentifier, Long renewDate, > int latestSequenceNumber) { > if(isFencedState()) { > LOG.info("State store is in Fenced state. Can't update RM Delegation > Token."); > return; > } > try { > updateRMDelegationTokenAndSequenceNumberInternal(rmDTIdentifier, > renewDate, > latestSequenceNumber); > } catch (Exception e) { > notifyStoreOperationFailed(e); > } > } > {code} > {code} > @Override > protected void updateStoredToken(RMDelegationTokenIdentifier id, > long renewDate) { > try { > LOG.info("updating RMDelegation token with sequence number: " > + id.getSequenceNumber()); > rmContext.getStateStore().updateRMDelegationTokenAndSequenceNumber(id, > renewDate, id.getSequenceNumber()); > } catch (Exception e) { > LOG.error("Error in updating persisted RMDelegationToken with sequence > number: " > + id.getSequenceNumber()); > ExitUtil.terminate(1, e); > } > } > {code} > According to code above, even when renewing a DT, the last sequence number is > updated in the store, which is wrong. For example, we have the following > sequence: > 1. 
Get DT 1 (seq = 1) > 2. Get DT 2( seq = 2) > 3. Renew DT 1 (seq = 1) > 4. Restart RM > The stored and then recovered last sequence number is 1. It makes the next > created DT after RM restarting will conflict with DT 2 on sequence num. > Second, the aforementioned bug doesn't happen actually, because the recovered > last sequence num has been overwritten at by the correctly one. > {code} > public void recover(RMState rmState) throws Exception { > LOG.info("recovering RMDelegationTokenSecretManager."); > // recover RMDTMasterKeys > for (DelegationKey dtKey : rmState.getRMDTSecretManagerState() > .getMasterKeyState()) { > addKey(dtKey); > } > // recover RMDelegationTokens > Map rmDelegationTokens = > rmState.getRMDTSecretManagerState().getTokenState(); > this.delegationTokenSequenceNumber = > rmState.getRMDTSecretManagerState().getDTSequenceNumber(); > for (Map.Entry entry : > rmDelegationTokens > .entrySet()) { > addPersistedDelegationToken(entry.getKey(), entry.getValue()); > } > } > {code} > The code above recovers delegationTokenSequenceNumber by reading the last > sequence number in the store. It could be wrong. Fortunately, > delegationTokenSequenceNumber updates it to the right number. > {code} > if (identifier.getSequenceNumber() > getDelegationTokenSeqNum()) { > setDelegationTokenSeqNum(identifier.getSequenceNumber()); > } > {code} > All the stored identifiers will be gone through, and > delegationTokenSequenceNumber will be set to the largest sequence number > among these identifiers. Therefore, new DT will be assigned a sequence number > which is always larger than that of all the recovered DT. > To sum up, two negatives make a positive, but it's good to fix the issue. > Please let me know if I've missed something here. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
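The four-step sequence in the description above can be replayed with a minimal sketch. SeqNumberSketch and all of its methods are hypothetical stand-ins invented for illustration; they are not the real RMStateStore or RMDelegationTokenSecretManager code. The sketch only assumes what the description states: the separately stored number is overwritten on every store or renew, while recovery can also take the largest sequence number among the stored tokens.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical, simplified stand-in for the state store; not the real RMStateStore. */
class SeqNumberSketch {
    // Stored tokens: sequence number -> renew date.
    private final Map<Integer, Long> storedTokens = new LinkedHashMap<>();
    // Separately stored "latest" sequence number, overwritten on every write.
    private int storedLatestSeq = 0;

    void storeToken(int seq, long renewDate) {
        storedTokens.put(seq, renewDate);
        storedLatestSeq = seq;
    }

    void renewToken(int seq, long renewDate) {
        storedTokens.put(seq, renewDate);
        // The questionable write: renewing an older token moves the value backwards.
        storedLatestSeq = seq;
    }

    /** Recovery that trusts the separately stored number. */
    int recoverFromSeparateField() {
        return storedLatestSeq;
    }

    /** Recovery that takes the largest sequence number among the stored tokens. */
    int recoverFromTokens() {
        int max = 0;
        for (int seq : storedTokens.keySet()) {
            if (seq > max) {
                max = seq;
            }
        }
        return max;
    }
}
```

Replaying "Get DT 1, Get DT 2, Renew DT 1, Restart RM" against this sketch leaves the separate field at 1 while the maximum over the stored tokens is 2, which is why the second recovery path masks the first.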
[jira] [Commented] (YARN-2922) Concurrent Modification Exception in LeafQueue when collecting applications
[ https://issues.apache.org/jira/browse/YARN-2922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14261367#comment-14261367 ] Rohith commented on YARN-2922: -- If the patch is fine, could it be committed, please? > Concurrent Modification Exception in LeafQueue when collecting applications > --- > > Key: YARN-2922 > URL: https://issues.apache.org/jira/browse/YARN-2922 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager, scheduler >Affects Versions: 2.5.1 >Reporter: Jason Tufo >Assignee: Rohith > Attachments: 0001-YARN-2922.patch > > > java.util.ConcurrentModificationException > at > java.util.TreeMap$PrivateEntryIterator.nextEntry(TreeMap.java:1115) > at java.util.TreeMap$KeyIterator.next(TreeMap.java:1169) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.collectSchedulerApplications(LeafQueue.java:1618) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.getAppsInQueue(CapacityScheduler.java:1119) > at > org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getQueueInfo(ClientRMService.java:798) > at > org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getQueueInfo(ApplicationClientProtocolPBServiceImpl.java:234) > at > org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:333) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928) > at > org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013) > at > org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:415) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614) > at
org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
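The stack trace above ends in the fail-fast iterator of a TreeMap-backed collection. A minimal sketch of that behaviour follows, together with one common way to avoid it. QueueSketch and its methods are hypothetical names made up for this example; it is not the LeafQueue code and it does not claim to be what the attached patch does. It only assumes that the collection is read while another caller modifies it.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.ConcurrentModificationException;
import java.util.TreeSet;

/** Hypothetical queue holding application ids in a TreeSet (backed by a TreeMap). */
class QueueSketch {
    private final TreeSet<String> activeApps = new TreeSet<>();

    synchronized void addApp(String appId) {
        activeApps.add(appId);
    }

    synchronized void removeApp(String appId) {
        activeApps.remove(appId);
    }

    /** Copies the ids while holding the same lock as the mutators above. */
    synchronized void collectApps(Collection<String> out) {
        out.addAll(activeApps);
    }

    /** Shows the fail-fast check: a structural change during iteration is detected. */
    static boolean failsFast() {
        TreeSet<String> apps = new TreeSet<>(Arrays.asList("app_1", "app_2", "app_3"));
        try {
            for (String app : apps) {
                if (app.equals("app_1")) {
                    apps.add("app_4"); // modifies the set while it is being iterated
                }
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }
}
```

In the sketch the reader and the writers share one monitor, so the copy in collectApps never observes a half-applied change. Whether that is the right trade-off for a scheduler hot path is a separate question.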
[jira] [Commented] (YARN-2991) TestRMRestart.testDecomissionedNMsMetricsOnRMRestart intermittently fails on trunk
[ https://issues.apache.org/jira/browse/YARN-2991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14261360#comment-14261360 ] Rohith commented on YARN-2991: -- bq. The bug you have found with DrainDispatcher seems to be handled properly in AsyncDispatcher Yes, it is handled. bq. How about removing the overriding methods in DrainDispatcher, and only adding await()? I thought of this initially, before uploading the current patch. However, it requires changing the visibility of the instance variable *drained* from private to protected only for the test framework. If that is acceptable, we can change the visibility. > TestRMRestart.testDecomissionedNMsMetricsOnRMRestart intermittently fails on > trunk > -- > > Key: YARN-2991 > URL: https://issues.apache.org/jira/browse/YARN-2991 > Project: Hadoop YARN > Issue Type: Test >Reporter: Zhijie Shen >Assignee: Rohith >Priority: Blocker > Attachments: 0001-YARN-2991.patch > > > {code} > Error Message > test timed out after 6 milliseconds > Stacktrace > java.lang.Exception: test timed out after 6 milliseconds > at java.lang.Object.wait(Native Method) > at java.lang.Thread.join(Thread.java:1281) > at java.lang.Thread.join(Thread.java:1355) > at > org.apache.hadoop.yarn.event.AsyncDispatcher.serviceStop(AsyncDispatcher.java:150) > at > org.apache.hadoop.service.AbstractService.stop(AbstractService.java:221) > at > org.apache.hadoop.service.ServiceOperations.stop(ServiceOperations.java:52) > at > org.apache.hadoop.service.ServiceOperations.stopQuietly(ServiceOperations.java:80) > at > org.apache.hadoop.service.CompositeService.stop(CompositeService.java:157) > at > org.apache.hadoop.service.CompositeService.serviceStop(CompositeService.java:131) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceStop(ResourceManager.java:1106) > at > org.apache.hadoop.service.AbstractService.stop(AbstractService.java:221) > at >
org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart.testDecomissionedNMsMetricsOnRMRestart(TestRMRestart.java:1873) > {code} > It happened twice this months: > https://builds.apache.org/job/PreCommit-YARN-Build/6096/ > https://builds.apache.org/job/PreCommit-YARN-Build/6182/ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2991) TestRMRestart.testDecomissionedNMsMetricsOnRMRestart intermittently fails on trunk
[ https://issues.apache.org/jira/browse/YARN-2991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14261350#comment-14261350 ] Zhijie Shen commented on YARN-2991: --- After [YARN-1121|https://issues.apache.org/jira/secure/EditComment!default.jspa?id=12666077&commentId=13808704], AsyncDispatcher already has the draining logic; it just doesn't provide an await() method to users. The bug you have found with DrainDispatcher seems to be handled properly in AsyncDispatcher. How about removing the overriding methods in DrainDispatcher, and only adding await()? > TestRMRestart.testDecomissionedNMsMetricsOnRMRestart intermittently fails on > trunk > -- > > Key: YARN-2991 > URL: https://issues.apache.org/jira/browse/YARN-2991 > Project: Hadoop YARN > Issue Type: Test >Reporter: Zhijie Shen >Assignee: Rohith >Priority: Blocker > Attachments: 0001-YARN-2991.patch > > > {code} > Error Message > test timed out after 6 milliseconds > Stacktrace > java.lang.Exception: test timed out after 6 milliseconds > at java.lang.Object.wait(Native Method) > at java.lang.Thread.join(Thread.java:1281) > at java.lang.Thread.join(Thread.java:1355) > at > org.apache.hadoop.yarn.event.AsyncDispatcher.serviceStop(AsyncDispatcher.java:150) > at > org.apache.hadoop.service.AbstractService.stop(AbstractService.java:221) > at > org.apache.hadoop.service.ServiceOperations.stop(ServiceOperations.java:52) > at > org.apache.hadoop.service.ServiceOperations.stopQuietly(ServiceOperations.java:80) > at > org.apache.hadoop.service.CompositeService.stop(CompositeService.java:157) > at > org.apache.hadoop.service.CompositeService.serviceStop(CompositeService.java:131) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceStop(ResourceManager.java:1106) > at > org.apache.hadoop.service.AbstractService.stop(AbstractService.java:221) > at >
org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart.testDecomissionedNMsMetricsOnRMRestart(TestRMRestart.java:1873) > {code} > It happened twice this months: > https://builds.apache.org/job/PreCommit-YARN-Build/6096/ > https://builds.apache.org/job/PreCommit-YARN-Build/6182/ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
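The await() idea discussed in this thread can be illustrated with a small, self-contained sketch. DispatcherSketch is a hypothetical class written only for this example; it is not AsyncDispatcher or DrainDispatcher, and it deliberately swaps the boolean drained flag mentioned in the comments for a counter of pending events, because a counter is easier to keep consistent in a short example. All method names are assumptions.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

/** Hypothetical single-threaded event dispatcher with isDrained() and await(). */
class DispatcherSketch {
    private final BlockingQueue<Runnable> queue = new LinkedBlockingQueue<>();
    private final Object lock = new Object();
    private int pending = 0; // events dispatched but not yet handled; guarded by lock
    private final Thread worker;

    DispatcherSketch() {
        worker = new Thread(() -> {
            try {
                while (true) {
                    Runnable event = queue.take();
                    try {
                        event.run();
                    } finally {
                        synchronized (lock) {
                            pending--;
                            lock.notifyAll(); // wake up callers blocked in await()
                        }
                    }
                }
            } catch (InterruptedException e) {
                // stop() was called; let the thread exit
            }
        });
        worker.setDaemon(true);
        worker.start();
    }

    void dispatch(Runnable event) {
        synchronized (lock) {
            pending++; // counted before the event becomes visible to the worker
        }
        queue.add(event);
    }

    boolean isDrained() {
        synchronized (lock) {
            return pending == 0;
        }
    }

    /** Blocks until every dispatched event is handled; returns false if interrupted. */
    boolean await() {
        synchronized (lock) {
            try {
                while (pending != 0) {
                    lock.wait();
                }
                return true;
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }

    /** Interrupts the worker and waits for it to exit; returns false if interrupted. */
    boolean stop() {
        worker.interrupt();
        try {
            worker.join();
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

A test would call dispatch(...) for each event, then await(), and only then stop(), so that the join in stop() is not racing with events that are still queued.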
[jira] [Commented] (YARN-2881) Implement PlanFollower for FairScheduler
[ https://issues.apache.org/jira/browse/YARN-2881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14261266#comment-14261266 ] Hadoop QA commented on YARN-2881: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12689514/YARN-2881.002.patch against trunk revision 249cc90. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 7 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:red}-1 findbugs{color}. The patch appears to introduce 14 new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. 
The patch failed these unit tests in hadoop-tools/hadoop-sls hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: org.apache.hadoop.yarn.server.resourcemanager.TestRM org.apache.hadoop.yarn.server.resourcemanager.TestWorkPreservingRMRestart org.apache.hadoop.yarn.server.resourcemanager.TestRMHA The following test timeouts occurred in hadoop-tools/hadoop-sls hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: org.apache.hadoop.yarn.server.resourcemanager.TestResourceTrackerService Test results: https://builds.apache.org/job/PreCommit-YARN-Build/6216//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-YARN-Build/6216//artifact/patchprocess/newPatchFindbugsWarningshadoop-sls.html Findbugs warnings: https://builds.apache.org/job/PreCommit-YARN-Build/6216//artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6216//console This message is automatically generated. > Implement PlanFollower for FairScheduler > > > Key: YARN-2881 > URL: https://issues.apache.org/jira/browse/YARN-2881 > Project: Hadoop YARN > Issue Type: Sub-task > Components: fairscheduler >Reporter: Anubhav Dhoot >Assignee: Anubhav Dhoot > Attachments: YARN-2881.001.patch, YARN-2881.002.patch, > YARN-2881.002.patch, YARN-2881.prelim.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2987) ClientRMService#getQueueInfo doesn't check app ACLs
[ https://issues.apache.org/jira/browse/YARN-2987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14261259#comment-14261259 ] Varun Saxena commented on YARN-2987: [~jianhe], the test failure is unrelated. It passes in my local environment. > ClientRMService#getQueueInfo doesn't check app ACLs > --- > > Key: YARN-2987 > URL: https://issues.apache.org/jira/browse/YARN-2987 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Jian He >Assignee: Varun Saxena > Attachments: YARN-2987.001.patch, YARN-2987.002.patch > > > ClientRMService#getQueueInfo can return a list of applications belonging to > the queue, but doesn't actually check if the user has the permission to view > the applications. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
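The missing check described above amounts to filtering the returned application list by a per-application view permission. A minimal sketch of that shape follows; QueueInfoAclSketch, visibleApps and the canView predicate are all hypothetical names introduced here, and the sketch makes no claim about how ClientRMService or the attached patches perform the ACL check.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiPredicate;

/** Hypothetical helper: keeps only the applications the caller may view. */
class QueueInfoAclSketch {
    static List<String> visibleApps(String caller,
                                    List<String> appsInQueue,
                                    BiPredicate<String, String> canView) {
        List<String> visible = new ArrayList<>();
        for (String appId : appsInQueue) {
            // canView stands in for whatever ACL lookup the service uses.
            if (canView.test(caller, appId)) {
                visible.add(appId);
            }
        }
        return visible;
    }
}
```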
[jira] [Updated] (YARN-2978) ResourceManager crashes with NPE while getting queue info
[ https://issues.apache.org/jira/browse/YARN-2978?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Varun Saxena updated YARN-2978: --- Summary: ResourceManager crashes with NPE while getting queue info (was: RM crashes with NPE while getting queue info) > ResourceManager crashes with NPE while getting queue info > - > > Key: YARN-2978 > URL: https://issues.apache.org/jira/browse/YARN-2978 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.5.1 >Reporter: Jason Tufo >Assignee: Varun Saxena > > java.lang.NullPointerException > at > org.apache.hadoop.yarn.proto.YarnProtos$QueueInfoProto.isInitialized(YarnProtos.java:29625) > at > org.apache.hadoop.yarn.proto.YarnProtos$QueueInfoProto$Builder.build(YarnProtos.java:29939) > at > org.apache.hadoop.yarn.api.records.impl.pb.QueueInfoPBImpl.mergeLocalToProto(QueueInfoPBImpl.java:290) > at > org.apache.hadoop.yarn.api.records.impl.pb.QueueInfoPBImpl.getProto(QueueInfoPBImpl.java:157) > at > org.apache.hadoop.yarn.api.protocolrecords.impl.pb.GetQueueInfoResponsePBImpl.convertToProtoFormat(GetQueueInfoResponsePBImpl.java:128) > at > org.apache.hadoop.yarn.api.protocolrecords.impl.pb.GetQueueInfoResponsePBImpl.mergeLocalToBuilder(GetQueueInfoResponsePBImpl.java:104) > at > org.apache.hadoop.yarn.api.protocolrecords.impl.pb.GetQueueInfoResponsePBImpl.mergeLocalToProto(GetQueueInfoResponsePBImpl.java:111) > at > org.apache.hadoop.yarn.api.protocolrecords.impl.pb.GetQueueInfoResponsePBImpl.getProto(GetQueueInfoResponsePBImpl.java:53) > at > org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getQueueInfo(ApplicationClientProtocolPBServiceImpl.java:235) > at > org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:333) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928) > at 
org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:415) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-2978) RM crashes with NPE while getting queue info
[ https://issues.apache.org/jira/browse/YARN-2978?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Varun Saxena updated YARN-2978: --- Summary: RM crashes with NPE while getting queue info (was: Null pointer in YarnProtos) > RM crashes with NPE while getting queue info > > > Key: YARN-2978 > URL: https://issues.apache.org/jira/browse/YARN-2978 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.5.1 >Reporter: Jason Tufo >Assignee: Varun Saxena > > java.lang.NullPointerException > at > org.apache.hadoop.yarn.proto.YarnProtos$QueueInfoProto.isInitialized(YarnProtos.java:29625) > at > org.apache.hadoop.yarn.proto.YarnProtos$QueueInfoProto$Builder.build(YarnProtos.java:29939) > at > org.apache.hadoop.yarn.api.records.impl.pb.QueueInfoPBImpl.mergeLocalToProto(QueueInfoPBImpl.java:290) > at > org.apache.hadoop.yarn.api.records.impl.pb.QueueInfoPBImpl.getProto(QueueInfoPBImpl.java:157) > at > org.apache.hadoop.yarn.api.protocolrecords.impl.pb.GetQueueInfoResponsePBImpl.convertToProtoFormat(GetQueueInfoResponsePBImpl.java:128) > at > org.apache.hadoop.yarn.api.protocolrecords.impl.pb.GetQueueInfoResponsePBImpl.mergeLocalToBuilder(GetQueueInfoResponsePBImpl.java:104) > at > org.apache.hadoop.yarn.api.protocolrecords.impl.pb.GetQueueInfoResponsePBImpl.mergeLocalToProto(GetQueueInfoResponsePBImpl.java:111) > at > org.apache.hadoop.yarn.api.protocolrecords.impl.pb.GetQueueInfoResponsePBImpl.getProto(GetQueueInfoResponsePBImpl.java:53) > at > org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getQueueInfo(ApplicationClientProtocolPBServiceImpl.java:235) > at > org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:333) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928) > at 
org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:415) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2938) Fix new findbugs warnings in hadoop-yarn-resourcemanager and hadoop-yarn-applicationhistoryservice
[ https://issues.apache.org/jira/browse/YARN-2938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14261172#comment-14261172 ] Hudson commented on YARN-2938: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #2008 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2008/]) YARN-2938. Fixed new findbugs warnings in hadoop-yarn-resourcemanager and hadoop-yarn-applicationhistoryservice. Contributed by Varun Saxena. (zjshen: rev 241d3b3a50c6af92f023d8b2c24598f4813f4674) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/timeline/security/TimelineAuthenticationFilterInitializer.java * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/SchedulingPolicy.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/NodesPage.java * hadoop-yarn-project/hadoop-yarn/dev-support/findbugs-exclude.xml * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/timeline/MemoryTimelineStore.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmnode/RMNodeImpl.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/timeline/LeveldbTimelineStore.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/ZKRMStateStore.java > Fix new findbugs warnings in hadoop-yarn-resourcemanager and > hadoop-yarn-applicationhistoryservice > -- > > Key: 
YARN-2938 > URL: https://issues.apache.org/jira/browse/YARN-2938 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Varun Saxena >Assignee: Varun Saxena > Fix For: 2.7.0 > > Attachments: FindBugs Report.html, YARN-2938.001.patch, > YARN-2938.002.patch, YARN-2938.003.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-2881) Implement PlanFollower for FairScheduler
[ https://issues.apache.org/jira/browse/YARN-2881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anubhav Dhoot updated YARN-2881: Attachment: YARN-2881.002.patch Kicking Jenkins again, as the failures seem unrelated. > Implement PlanFollower for FairScheduler > > > Key: YARN-2881 > URL: https://issues.apache.org/jira/browse/YARN-2881 > Project: Hadoop YARN > Issue Type: Sub-task > Components: fairscheduler >Reporter: Anubhav Dhoot >Assignee: Anubhav Dhoot > Attachments: YARN-2881.001.patch, YARN-2881.002.patch, > YARN-2881.002.patch, YARN-2881.prelim.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2938) Fix new findbugs warnings in hadoop-yarn-resourcemanager and hadoop-yarn-applicationhistoryservice
[ https://issues.apache.org/jira/browse/YARN-2938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14261150#comment-14261150 ] Hudson commented on YARN-2938: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #58 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/58/]) YARN-2938. Fixed new findbugs warnings in hadoop-yarn-resourcemanager and hadoop-yarn-applicationhistoryservice. Contributed by Varun Saxena. (zjshen: rev 241d3b3a50c6af92f023d8b2c24598f4813f4674) * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/timeline/security/TimelineAuthenticationFilterInitializer.java * hadoop-yarn-project/hadoop-yarn/dev-support/findbugs-exclude.xml * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/NodesPage.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/ZKRMStateStore.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/timeline/MemoryTimelineStore.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/timeline/LeveldbTimelineStore.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmnode/RMNodeImpl.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/SchedulingPolicy.java > Fix new findbugs warnings in hadoop-yarn-resourcemanager and > hadoop-yarn-applicationhistoryservice > -- > > 
Key: YARN-2938 > URL: https://issues.apache.org/jira/browse/YARN-2938 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Varun Saxena >Assignee: Varun Saxena > Fix For: 2.7.0 > > Attachments: FindBugs Report.html, YARN-2938.001.patch, > YARN-2938.002.patch, YARN-2938.003.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2938) Fix new findbugs warnings in hadoop-yarn-resourcemanager and hadoop-yarn-applicationhistoryservice
[ https://issues.apache.org/jira/browse/YARN-2938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14261130#comment-14261130 ] Hudson commented on YARN-2938: -- FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #54 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/54/]) YARN-2938. Fixed new findbugs warnings in hadoop-yarn-resourcemanager and hadoop-yarn-applicationhistoryservice. Contributed by Varun Saxena. (zjshen: rev 241d3b3a50c6af92f023d8b2c24598f4813f4674) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/ZKRMStateStore.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/SchedulingPolicy.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/timeline/LeveldbTimelineStore.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/timeline/security/TimelineAuthenticationFilterInitializer.java * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/dev-support/findbugs-exclude.xml * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/NodesPage.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmnode/RMNodeImpl.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/timeline/MemoryTimelineStore.java > Fix new findbugs warnings in hadoop-yarn-resourcemanager and > hadoop-yarn-applicationhistoryservice > -- > > Key: 
YARN-2938 > URL: https://issues.apache.org/jira/browse/YARN-2938 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Varun Saxena >Assignee: Varun Saxena > Fix For: 2.7.0 > > Attachments: FindBugs Report.html, YARN-2938.001.patch, > YARN-2938.002.patch, YARN-2938.003.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2938) Fix new findbugs warnings in hadoop-yarn-resourcemanager and hadoop-yarn-applicationhistoryservice
[ https://issues.apache.org/jira/browse/YARN-2938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14261122#comment-14261122 ] Hudson commented on YARN-2938: -- FAILURE: Integrated in Hadoop-Hdfs-trunk #1989 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1989/]) YARN-2938. Fixed new findbugs warnings in hadoop-yarn-resourcemanager and hadoop-yarn-applicationhistoryservice. Contributed by Varun Saxena. (zjshen: rev 241d3b3a50c6af92f023d8b2c24598f4813f4674) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/ZKRMStateStore.java * hadoop-yarn-project/hadoop-yarn/dev-support/findbugs-exclude.xml * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/SchedulingPolicy.java * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/timeline/security/TimelineAuthenticationFilterInitializer.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmnode/RMNodeImpl.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/timeline/MemoryTimelineStore.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/timeline/LeveldbTimelineStore.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/NodesPage.java > Fix new findbugs warnings in hadoop-yarn-resourcemanager and > hadoop-yarn-applicationhistoryservice > -- > > Key: YARN-2938 > 
URL: https://issues.apache.org/jira/browse/YARN-2938 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Varun Saxena >Assignee: Varun Saxena > Fix For: 2.7.0 > > Attachments: FindBugs Report.html, YARN-2938.001.patch, > YARN-2938.002.patch, YARN-2938.003.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2797) TestWorkPreservingRMRestart should use ParametrizedSchedulerTestBase
[ https://issues.apache.org/jira/browse/YARN-2797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14261092#comment-14261092 ] Rohith commented on YARN-2797: -- YARN-2991 is the corresponding JIRA that fixes one of the random test-case failures. > TestWorkPreservingRMRestart should use ParametrizedSchedulerTestBase > > > Key: YARN-2797 > URL: https://issues.apache.org/jira/browse/YARN-2797 > Project: Hadoop YARN > Issue Type: Improvement >Affects Versions: 2.5.1 >Reporter: Karthik Kambatla >Assignee: Karthik Kambatla >Priority: Minor > Attachments: yarn-2797-1.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2797) TestWorkPreservingRMRestart should use ParametrizedSchedulerTestBase
[ https://issues.apache.org/jira/browse/YARN-2797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14261090#comment-14261090 ] Rohith commented on YARN-2797: -- Makes sense, +1 for the approach. > TestWorkPreservingRMRestart should use ParametrizedSchedulerTestBase > > > Key: YARN-2797 > URL: https://issues.apache.org/jira/browse/YARN-2797 > Project: Hadoop YARN > Issue Type: Improvement >Affects Versions: 2.5.1 >Reporter: Karthik Kambatla >Assignee: Karthik Kambatla >Priority: Minor > Attachments: yarn-2797-1.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2991) TestRMRestart.testDecomissionedNMsMetricsOnRMRestart intermittently fails on trunk
[ https://issues.apache.org/jira/browse/YARN-2991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14261088#comment-14261088 ] Rohith commented on YARN-2991: -- Hi [~zjshen], kindly review the analysis and the patch. > TestRMRestart.testDecomissionedNMsMetricsOnRMRestart intermittently fails on > trunk > -- > > Key: YARN-2991 > URL: https://issues.apache.org/jira/browse/YARN-2991 > Project: Hadoop YARN > Issue Type: Test >Reporter: Zhijie Shen >Assignee: Rohith >Priority: Blocker > Attachments: 0001-YARN-2991.patch > > > {code} > Error Message > test timed out after 6 milliseconds > Stacktrace > java.lang.Exception: test timed out after 6 milliseconds > at java.lang.Object.wait(Native Method) > at java.lang.Thread.join(Thread.java:1281) > at java.lang.Thread.join(Thread.java:1355) > at > org.apache.hadoop.yarn.event.AsyncDispatcher.serviceStop(AsyncDispatcher.java:150) > at > org.apache.hadoop.service.AbstractService.stop(AbstractService.java:221) > at > org.apache.hadoop.service.ServiceOperations.stop(ServiceOperations.java:52) > at > org.apache.hadoop.service.ServiceOperations.stopQuietly(ServiceOperations.java:80) > at > org.apache.hadoop.service.CompositeService.stop(CompositeService.java:157) > at > org.apache.hadoop.service.CompositeService.serviceStop(CompositeService.java:131) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceStop(ResourceManager.java:1106) > at > org.apache.hadoop.service.AbstractService.stop(AbstractService.java:221) > at > org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart.testDecomissionedNMsMetricsOnRMRestart(TestRMRestart.java:1873) > {code} > It happened twice this month: > https://builds.apache.org/job/PreCommit-YARN-Build/6096/ > https://builds.apache.org/job/PreCommit-YARN-Build/6182/ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2938) Fix new findbugs warnings in hadoop-yarn-resourcemanager and hadoop-yarn-applicationhistoryservice
[ https://issues.apache.org/jira/browse/YARN-2938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14261040#comment-14261040 ] Hudson commented on YARN-2938: -- FAILURE: Integrated in Hadoop-Yarn-trunk #791 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/791/]) YARN-2938. Fixed new findbugs warnings in hadoop-yarn-resourcemanager and hadoop-yarn-applicationhistoryservice. Contributed by Varun Saxena. (zjshen: rev 241d3b3a50c6af92f023d8b2c24598f4813f4674) * hadoop-yarn-project/hadoop-yarn/dev-support/findbugs-exclude.xml * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/timeline/security/TimelineAuthenticationFilterInitializer.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmnode/RMNodeImpl.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/timeline/LeveldbTimelineStore.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/timeline/MemoryTimelineStore.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/ZKRMStateStore.java * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/NodesPage.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/SchedulingPolicy.java > Fix new findbugs warnings in hadoop-yarn-resourcemanager and > hadoop-yarn-applicationhistoryservice > -- > > Key: YARN-2938 > URL: 
https://issues.apache.org/jira/browse/YARN-2938 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Varun Saxena >Assignee: Varun Saxena > Fix For: 2.7.0 > > Attachments: FindBugs Report.html, YARN-2938.001.patch, > YARN-2938.002.patch, YARN-2938.003.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2938) Fix new findbugs warnings in hadoop-yarn-resourcemanager and hadoop-yarn-applicationhistoryservice
[ https://issues.apache.org/jira/browse/YARN-2938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14261027#comment-14261027 ] Hudson commented on YARN-2938: -- FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #57 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/57/]) YARN-2938. Fixed new findbugs warnings in hadoop-yarn-resourcemanager and hadoop-yarn-applicationhistoryservice. Contributed by Varun Saxena. (zjshen: rev 241d3b3a50c6af92f023d8b2c24598f4813f4674) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/timeline/security/TimelineAuthenticationFilterInitializer.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/ZKRMStateStore.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/NodesPage.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/timeline/LeveldbTimelineStore.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/timeline/MemoryTimelineStore.java * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/dev-support/findbugs-exclude.xml * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/SchedulingPolicy.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmnode/RMNodeImpl.java > Fix new findbugs warnings in hadoop-yarn-resourcemanager and > hadoop-yarn-applicationhistoryservice > -- > > Key: 
YARN-2938 > URL: https://issues.apache.org/jira/browse/YARN-2938 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Varun Saxena >Assignee: Varun Saxena > Fix For: 2.7.0 > > Attachments: FindBugs Report.html, YARN-2938.001.patch, > YARN-2938.002.patch, YARN-2938.003.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-149) ResourceManager (RM) High-Availability (HA)
[ https://issues.apache.org/jira/browse/YARN-149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wang Haoran updated YARN-149: - Affects Version/s: (was: 2.4.0) > ResourceManager (RM) High-Availability (HA) > --- > > Key: YARN-149 > URL: https://issues.apache.org/jira/browse/YARN-149 > Project: Hadoop YARN > Issue Type: New Feature > Components: resourcemanager >Reporter: Harsh J > Labels: patch > Attachments: YARN ResourceManager Automatic > Failover-rev-07-21-13.pdf, YARN ResourceManager Automatic > Failover-rev-08-04-13.pdf, rm-ha-phase1-approach-draft1.pdf, > rm-ha-phase1-draft2.pdf > > > This jira tracks work needed to be done to support one RM instance failing > over to another RM instance so that we can have RM HA. Work includes leader > election, transfer of control to leader and client re-direction to new leader. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2943) Add a node-labels page in RM web UI
[ https://issues.apache.org/jira/browse/YARN-2943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14260903#comment-14260903 ] Hadoop QA commented on YARN-2943: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12689445/YARN-2943.6.patch against trunk revision 249cc90. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 3 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: org.apache.hadoop.yarn.server.resourcemanager.TestWorkPreservingRMRestart The following test timeouts occurred in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerAllocation Test results: https://builds.apache.org/job/PreCommit-YARN-Build/6214//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6214//console This message is automatically generated. 
> Add a node-labels page in RM web UI > --- > > Key: YARN-2943 > URL: https://issues.apache.org/jira/browse/YARN-2943 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Reporter: Wangda Tan >Assignee: Wangda Tan > Attachments: Node-labels-page.png, Nodes-page-with-label-filter.png, > YARN-2943.1.patch, YARN-2943.2.patch, YARN-2943.3.patch, YARN-2943.4.patch, > YARN-2943.5.patch, YARN-2943.6.patch > > > Now we have node labels in the system, but there's no very convenient way to > get information like "how many active NM(s) are assigned to a given label?", "how > much total resource is there for a given label?", "For a given label, which queues can > access it?", etc. > It would be better to add a node-labels page in the RM web UI, so that users/admins can > have a centralized view of such information. -- This message was sent by Atlassian JIRA (v6.3.4#6332)