[jira] [Updated] (YARN-2701) Potential race condition in startLocalizer when using LinuxContainerExecutor
[ https://issues.apache.org/jira/browse/YARN-2701?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuan Gong updated YARN-2701: Attachment: YARN-2701.3.patch Potential race condition in startLocalizer when using LinuxContainerExecutor -- Key: YARN-2701 URL: https://issues.apache.org/jira/browse/YARN-2701 Project: Hadoop YARN Issue Type: Bug Reporter: Xuan Gong Assignee: Xuan Gong Priority: Blocker Attachments: YARN-2701.1.patch, YARN-2701.2.patch, YARN-2701.3.patch When LinuxContainerExecutor runs startLocalizer, it uses the native code in container-executor.c. {code} if (stat(npath, &sb) != 0) { if (mkdir(npath, perm) != 0) { {code} This uses a check-then-create sequence to create the appDir under /usercache. If two containers try to do this at the same time, a race condition may occur. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
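For illustration only (this is not taken from any of the YARN-2701 patches): one common way to avoid a check-then-create race is to call mkdir() unconditionally and treat EEXIST as success once the existing path is confirmed to be a directory. The helper name mkdir_if_absent below is hypothetical and does not appear in container-executor.c; the ownership and permission checks a real setuid executable would need are left out.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>

/*
 * Hypothetical helper (not from container-executor.c): create `path`
 * with mode `perm`, tolerating another process creating it first.
 * Returns 0 if the directory exists when the call returns, -1 otherwise.
 */
int mkdir_if_absent(const char *path, mode_t perm) {
  assert(path != NULL);

  /* Try to create first; there is no window between a check and the create. */
  if (mkdir(path, perm) == 0) {
    return 0;
  }

  /* Any error other than "already exists" is a real failure. */
  if (errno != EEXIST) {
    fprintf(stderr, "mkdir %s failed: %s\n", path, strerror(errno));
    return -1;
  }

  /* Something already exists at `path`; accept it only if it is a directory. */
  struct stat sb;
  if (stat(path, &sb) != 0 || !S_ISDIR(sb.st_mode)) {
    fprintf(stderr, "%s exists but is not a directory\n", path);
    return -1;
  }
  return 0;
}
```

With this shape, two callers racing on the same path can both return 0: one creates the directory, and the other sees EEXIST and confirms that a directory is there.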
[jira] [Commented] (YARN-2701) Potential race condition in startLocalizer when using LinuxContainerExecutor
[ https://issues.apache.org/jira/browse/YARN-2701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14175907#comment-14175907 ] Xuan Gong commented on YARN-2701: - Thanks for the review. Addressed all comments. Potential race condition in startLocalizer when using LinuxContainerExecutor -- Key: YARN-2701 URL: https://issues.apache.org/jira/browse/YARN-2701 Project: Hadoop YARN Issue Type: Bug Reporter: Xuan Gong Assignee: Xuan Gong Priority: Blocker Attachments: YARN-2701.1.patch, YARN-2701.2.patch, YARN-2701.3.patch When LinuxContainerExecutor runs startLocalizer, it uses the native code in container-executor.c. {code} if (stat(npath, &sb) != 0) { if (mkdir(npath, perm) != 0) { {code} This uses a check-then-create sequence to create the appDir under /usercache. If two containers try to do this at the same time, a race condition may occur. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2710) RM HA tests failed intermittently on trunk
[ https://issues.apache.org/jira/browse/YARN-2710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14175916#comment-14175916 ] Jian He commented on YARN-2710: --- [~leftnoteasy], it actually passes on my side. Could you double-check, please? Thanks. RM HA tests failed intermittently on trunk -- Key: YARN-2710 URL: https://issues.apache.org/jira/browse/YARN-2710 Project: Hadoop YARN Issue Type: Bug Components: client Reporter: Wangda Tan Failures like the following can happen in TestApplicationClientProtocolOnHA, TestResourceTrackerOnHA, etc. {code} org.apache.hadoop.yarn.client.TestApplicationClientProtocolOnHA testGetApplicationAttemptsOnHA(org.apache.hadoop.yarn.client.TestApplicationClientProtocolOnHA) Time elapsed: 9.491 sec ERROR! java.net.ConnectException: Call From asf905.gq1.ygridcore.net/67.195.81.149 to asf905.gq1.ygridcore.net:28032 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:599) at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206) at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:529) at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:493) at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:607) at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:705) at org.apache.hadoop.ipc.Client$Connection.access$2800(Client.java:368) at org.apache.hadoop.ipc.Client.getConnection(Client.java:1521) at org.apache.hadoop.ipc.Client.call(Client.java:1438) at org.apache.hadoop.ipc.Client.call(Client.java:1399) at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232) at com.sun.proxy.$Proxy17.getApplicationAttempts(Unknown Source) at 
org.apache.hadoop.yarn.api.impl.pb.client.ApplicationClientProtocolPBClientImpl.getApplicationAttempts(ApplicationClientProtocolPBClientImpl.java:372) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:186) at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:101) at com.sun.proxy.$Proxy18.getApplicationAttempts(Unknown Source) at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.getApplicationAttempts(YarnClientImpl.java:583) at org.apache.hadoop.yarn.client.TestApplicationClientProtocolOnHA.testGetApplicationAttemptsOnHA(TestApplicationClientProtocolOnHA.java:137) {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2701) Potential race condition in startLocalizer when using LinuxContainerExecutor
[ https://issues.apache.org/jira/browse/YARN-2701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14175917#comment-14175917 ] Hadoop QA commented on YARN-2701: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12675661/YARN-2701.3.patch against trunk revision 39063cd. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/5453//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/5453//console This message is automatically generated. Potential race condition in startLocalizer when using LinuxContainerExecutor -- Key: YARN-2701 URL: https://issues.apache.org/jira/browse/YARN-2701 Project: Hadoop YARN Issue Type: Bug Reporter: Xuan Gong Assignee: Xuan Gong Priority: Blocker Attachments: YARN-2701.1.patch, YARN-2701.2.patch, YARN-2701.3.patch When LinuxContainerExecutor runs startLocalizer, it uses the native code in container-executor.c. 
{code} if (stat(npath, &sb) != 0) { if (mkdir(npath, perm) != 0) { {code} This uses a check-then-create sequence to create the appDir under /usercache. If two containers try to do this at the same time, a race condition may occur. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2701) Potential race condition in startLocalizer when using LinuxContainerExecutor
[ https://issues.apache.org/jira/browse/YARN-2701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14175919#comment-14175919 ] Hadoop QA commented on YARN-2701: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12675661/YARN-2701.3.patch against trunk revision 39063cd. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/5454//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/5454//console This message is automatically generated. Potential race condition in startLocalizer when using LinuxContainerExecutor -- Key: YARN-2701 URL: https://issues.apache.org/jira/browse/YARN-2701 Project: Hadoop YARN Issue Type: Bug Reporter: Xuan Gong Assignee: Xuan Gong Priority: Blocker Attachments: YARN-2701.1.patch, YARN-2701.2.patch, YARN-2701.3.patch When LinuxContainerExecutor runs startLocalizer, it uses the native code in container-executor.c. 
{code} if (stat(npath, &sb) != 0) { if (mkdir(npath, perm) != 0) { {code} This uses a check-then-create sequence to create the appDir under /usercache. If two containers try to do this at the same time, a race condition may occur. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2588) Standby RM does not transitionToActive if previous transitionToActive is failed with ZK exception.
[ https://issues.apache.org/jira/browse/YARN-2588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14175951#comment-14175951 ] Hudson commented on YARN-2588: -- SUCCESS: Integrated in Hadoop-Yarn-trunk #716 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/716/]) YARN-2588. Standby RM fails to transitionToActive if previous transitionToActive failed with ZK exception. Contributed by Rohith Sharmaks (jianhe: rev a6aa6e42cacdbfcc1c2b7c19e7239204fe9ff654) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestRMHA.java * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceManager.java Standby RM does not transitionToActive if previous transitionToActive is failed with ZK exception. -- Key: YARN-2588 URL: https://issues.apache.org/jira/browse/YARN-2588 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 3.0.0, 2.6.0, 2.5.1 Reporter: Rohith Assignee: Rohith Fix For: 2.6.0 Attachments: YARN-2588.1.patch, YARN-2588.2.patch, YARN-2588.patch Consider a scenario where the standby RM fails to transition to Active because of a ZK exception (ConnectionLoss or SessionExpired). After that, any further transition to Active for the same RM does not move the RM to the Active state. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2699) Fix test timeout in TestResourceTrackerOnHA#testResourceTrackerOnHA
[ https://issues.apache.org/jira/browse/YARN-2699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14175957#comment-14175957 ] Hudson commented on YARN-2699: -- SUCCESS: Integrated in Hadoop-Yarn-trunk #716 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/716/]) YARN-2699. Fixed a bug in CommonNodeLabelsManager that caused tests to fail when using ephemeral ports on NodeIDs. Contributed by Wangda Tan. (vinodkv: rev abae63caf9c53b404f2f2db7d482555484eaeaf8) * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/nodelabels/CommonNodeLabelsManager.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/nodelabels/TestRMNodeLabelsManager.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/nodelabels/RMNodeLabelsManager.java Fix test timeout in TestResourceTrackerOnHA#testResourceTrackerOnHA --- Key: YARN-2699 URL: https://issues.apache.org/jira/browse/YARN-2699 Project: Hadoop YARN Issue Type: Sub-task Components: client Reporter: Wangda Tan Assignee: Wangda Tan Priority: Blocker Fix For: 2.6.0 Attachments: YARN-2699-20141016-1.patch, YARN-2699-20141016-2.patch, YARN-2699-20141016-3.patch, YARN-2699-20141016-4.patch Because of the changes in YARN-2500/YARN-2496/YARN-2494, registering a node manager with port=0 is no longer allowed. TestResourceTrackerOnHA#testResourceTrackerOnHA fails because it registers a node manager with port=0. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2676) Timeline authentication filter should add support for proxy user
[ https://issues.apache.org/jira/browse/YARN-2676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14175959#comment-14175959 ] Hudson commented on YARN-2676: -- SUCCESS: Integrated in Hadoop-Yarn-trunk #716 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/716/]) YARN-2676. Enhanced Timeline auth-filter to support proxy users. Contributed by Zhijie Shen. (vinodkv: rev 39063cd36f96e351e4a6bf0bc2b6185711d4b059) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/test/java/org/apache/hadoop/yarn/server/timeline/security/TestTimelineAuthenticationFilterInitializer.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/timeline/security/TimelineAuthenticationFilterInitializer.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/timeline/security/TimelineAuthenticationFilter.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/test/java/org/apache/hadoop/yarn/server/applicationhistoryservice/TestApplicationHistoryServer.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/webapp/AHSWebApp.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/timeline/security/TimelineDelegationTokenSecretManagerService.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/metrics/TestSystemMetricsPublisher.java * 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/test/java/org/apache/hadoop/yarn/server/timeline/webapp/TestTimelineWebServicesWithSSL.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests/src/test/java/org/apache/hadoop/yarn/server/MiniYARNCluster.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/test/java/org/apache/hadoop/yarn/server/applicationhistoryservice/TestApplicationHistoryClientService.java * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/client/api/impl/TimelineClientImpl.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/timeline/security/TimelineClientAuthenticationService.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/pom.xml * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/test/java/org/apache/hadoop/yarn/server/timeline/security/TestTimelineAuthenticationFilter.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/security/client/YARNDelegationTokenIdentifier.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/ApplicationHistoryServer.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/test/java/org/apache/hadoop/yarn/server/timeline/webapp/TestTimelineWebServices.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/client/api/impl/TimelineAuthenticator.java * 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/client/api/impl/TestTimelineAuthenticator.java Timeline authentication filter should add support for proxy user Key: YARN-2676 URL: https://issues.apache.org/jira/browse/YARN-2676 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Reporter: Zhijie Shen Assignee: Zhijie Shen Fix For: 2.6.0 Attachments: YARN-2676.1.patch, YARN-2676.2.patch, YARN-2676.3.patch, YARN-2676.4.patch, YARN-2676.5.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2705) Changes of RM node label manager default configuration
[ https://issues.apache.org/jira/browse/YARN-2705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14175956#comment-14175956 ] Hudson commented on YARN-2705: -- SUCCESS: Integrated in Hadoop-Yarn-trunk #716 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/716/]) YARN-2705. Fixed bugs in ResourceManager node-label manager that were causing test-failures: added a dummy in-memory labels-manager. Contributed by Wangda Tan. (vinodkv: rev e9c66e8fd2ccb658db2848e1ab911f1502de4de5) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestCapacityScheduler.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/nodelabels/MemoryRMNodeLabelsManager.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestContainerAllocation.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/nodelabels/TestRMNodeLabelsManager.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/MockRM.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/nodelabels/FileSystemNodeLabelsStore.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceManager.java * 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/nodelabels/TestFileSystemNodeLabelsStore.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/nodelabels/DummyRMNodeLabelsManager.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/TestRMWebApp.java * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/nodelabels/CommonNodeLabelsManager.java Changes of RM node label manager default configuration -- Key: YARN-2705 URL: https://issues.apache.org/jira/browse/YARN-2705 Project: Hadoop YARN Issue Type: Sub-task Components: api, client, resourcemanager Reporter: Wangda Tan Assignee: Wangda Tan Priority: Critical Fix For: 2.6.0 Attachments: YARN-2705-20141017-1.patch, YARN-2705-20141017-2.patch 1) Add yarn.node-labels.manager-class; by default it will not store anything to the file system. 2) Use the above at least in some places: RMNodeLabelsManager, RMAdminCLI. Convert {{DummyNodeLabelsManager}} into a {{MemoryNodeLabelsManager}}. 3) Document that RM configs and client configs for yarn.node-labels.manager-class should match. 4) fs-store.uri -> fs-store.root-dir. 5) Similarly for FS_NODE_LABELS_STORE_URI. 6) For the default value of fs-store.uri, put it in /tmp, but create /tmp/hadoop-yarn-${user}/node-labels. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-1879) Mark Idempotent/AtMostOnce annotations to ApplicationMasterProtocol for RM fail over
[ https://issues.apache.org/jira/browse/YARN-1879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14175950#comment-14175950 ] Hudson commented on YARN-1879: -- SUCCESS: Integrated in Hadoop-Yarn-trunk #716 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/716/]) YARN-1879. Marked Idempotent/AtMostOnce annotations to ApplicationMasterProtocol for RM fail over. Contributed by Tsuyoshi OZAWA (jianhe: rev c3de2412eb7633ff16c67e71e73bbe27a982d984) * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestWorkPreservingRMRestart.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestApplicationMasterService.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/ApplicationMasterProtocol.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ApplicationMasterService.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/TestApplicationMasterServiceOnHA.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/ProtocolHATestBase.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/MockAM.java Mark Idempotent/AtMostOnce annotations to ApplicationMasterProtocol for RM fail over Key: YARN-1879 URL: https://issues.apache.org/jira/browse/YARN-1879 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Reporter: Jian He Assignee: Tsuyoshi OZAWA Priority: Critical Fix For: 2.6.0 Attachments: YARN-1879.1.patch, YARN-1879.1.patch, YARN-1879.11.patch, YARN-1879.12.patch, 
YARN-1879.13.patch, YARN-1879.14.patch, YARN-1879.15.patch, YARN-1879.16.patch, YARN-1879.17.patch, YARN-1879.18.patch, YARN-1879.19.patch, YARN-1879.2-wip.patch, YARN-1879.2.patch, YARN-1879.20.patch, YARN-1879.21.patch, YARN-1879.22.patch, YARN-1879.23.patch, YARN-1879.23.patch, YARN-1879.24.patch, YARN-1879.25.patch, YARN-1879.26.patch, YARN-1879.27.patch, YARN-1879.28.patch, YARN-1879.29.patch, YARN-1879.3.patch, YARN-1879.4.patch, YARN-1879.5.patch, YARN-1879.6.patch, YARN-1879.7.patch, YARN-1879.8.patch, YARN-1879.9.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2588) Standby RM does not transitionToActive if previous transitionToActive is failed with ZK exception.
[ https://issues.apache.org/jira/browse/YARN-2588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14175996#comment-14175996 ] Hudson commented on YARN-2588: -- FAILURE: Integrated in Hadoop-Hdfs-trunk #1905 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1905/]) YARN-2588. Standby RM fails to transitionToActive if previous transitionToActive failed with ZK exception. Contributed by Rohith Sharmaks (jianhe: rev a6aa6e42cacdbfcc1c2b7c19e7239204fe9ff654) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestRMHA.java * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceManager.java Standby RM does not transitionToActive if previous transitionToActive is failed with ZK exception. -- Key: YARN-2588 URL: https://issues.apache.org/jira/browse/YARN-2588 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 3.0.0, 2.6.0, 2.5.1 Reporter: Rohith Assignee: Rohith Fix For: 2.6.0 Attachments: YARN-2588.1.patch, YARN-2588.2.patch, YARN-2588.patch Consider a scenario where the standby RM fails to transition to Active because of a ZK exception (ConnectionLoss or SessionExpired). After that, any further transition to Active for the same RM does not move the RM to the Active state. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-1879) Mark Idempotent/AtMostOnce annotations to ApplicationMasterProtocol for RM fail over
[ https://issues.apache.org/jira/browse/YARN-1879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14175995#comment-14175995 ] Hudson commented on YARN-1879: -- FAILURE: Integrated in Hadoop-Hdfs-trunk #1905 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1905/]) YARN-1879. Marked Idempotent/AtMostOnce annotations to ApplicationMasterProtocol for RM fail over. Contributed by Tsuyoshi OZAWA (jianhe: rev c3de2412eb7633ff16c67e71e73bbe27a982d984) * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ApplicationMasterService.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/MockAM.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/TestApplicationMasterServiceOnHA.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestApplicationMasterService.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/ApplicationMasterProtocol.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestWorkPreservingRMRestart.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/ProtocolHATestBase.java Mark Idempotent/AtMostOnce annotations to ApplicationMasterProtocol for RM fail over Key: YARN-1879 URL: https://issues.apache.org/jira/browse/YARN-1879 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Reporter: Jian He Assignee: Tsuyoshi OZAWA Priority: Critical Fix For: 2.6.0 Attachments: YARN-1879.1.patch, YARN-1879.1.patch, YARN-1879.11.patch, YARN-1879.12.patch, 
YARN-1879.13.patch, YARN-1879.14.patch, YARN-1879.15.patch, YARN-1879.16.patch, YARN-1879.17.patch, YARN-1879.18.patch, YARN-1879.19.patch, YARN-1879.2-wip.patch, YARN-1879.2.patch, YARN-1879.20.patch, YARN-1879.21.patch, YARN-1879.22.patch, YARN-1879.23.patch, YARN-1879.23.patch, YARN-1879.24.patch, YARN-1879.25.patch, YARN-1879.26.patch, YARN-1879.27.patch, YARN-1879.28.patch, YARN-1879.29.patch, YARN-1879.3.patch, YARN-1879.4.patch, YARN-1879.5.patch, YARN-1879.6.patch, YARN-1879.7.patch, YARN-1879.8.patch, YARN-1879.9.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2705) Changes of RM node label manager default configuration
[ https://issues.apache.org/jira/browse/YARN-2705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14176001#comment-14176001 ] Hudson commented on YARN-2705: -- FAILURE: Integrated in Hadoop-Hdfs-trunk #1905 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1905/]) YARN-2705. Fixed bugs in ResourceManager node-label manager that were causing test-failures: added a dummy in-memory labels-manager. Contributed by Wangda Tan. (vinodkv: rev e9c66e8fd2ccb658db2848e1ab911f1502de4de5) * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestCapacityScheduler.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/MockRM.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/nodelabels/FileSystemNodeLabelsStore.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/nodelabels/CommonNodeLabelsManager.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/TestRMWebApp.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/nodelabels/TestRMNodeLabelsManager.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/nodelabels/MemoryRMNodeLabelsManager.java * 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestContainerAllocation.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceManager.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/nodelabels/TestFileSystemNodeLabelsStore.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/nodelabels/DummyRMNodeLabelsManager.java Changes of RM node label manager default configuration -- Key: YARN-2705 URL: https://issues.apache.org/jira/browse/YARN-2705 Project: Hadoop YARN Issue Type: Sub-task Components: api, client, resourcemanager Reporter: Wangda Tan Assignee: Wangda Tan Priority: Critical Fix For: 2.6.0 Attachments: YARN-2705-20141017-1.patch, YARN-2705-20141017-2.patch 1) Add yarn.node-labels.manager-class; by default it will not store anything to the file system. 2) Use the above at least in some places: RMNodeLabelsManager, RMAdminCLI. Convert {{DummyNodeLabelsManager}} into a {{MemoryNodeLabelsManager}}. 3) Document that RM configs and client configs for yarn.node-labels.manager-class should match. 4) fs-store.uri -> fs-store.root-dir. 5) Similarly for FS_NODE_LABELS_STORE_URI. 6) For the default value of fs-store.uri, put it in /tmp, but create /tmp/hadoop-yarn-${user}/node-labels. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2699) Fix test timeout in TestResourceTrackerOnHA#testResourceTrackerOnHA
[ https://issues.apache.org/jira/browse/YARN-2699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14176002#comment-14176002 ] Hudson commented on YARN-2699: -- FAILURE: Integrated in Hadoop-Hdfs-trunk #1905 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1905/]) YARN-2699. Fixed a bug in CommonNodeLabelsManager that caused tests to fail when using ephemeral ports on NodeIDs. Contributed by Wangda Tan. (vinodkv: rev abae63caf9c53b404f2f2db7d482555484eaeaf8) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/nodelabels/CommonNodeLabelsManager.java * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/nodelabels/TestRMNodeLabelsManager.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/nodelabels/RMNodeLabelsManager.java Fix test timeout in TestResourceTrackerOnHA#testResourceTrackerOnHA --- Key: YARN-2699 URL: https://issues.apache.org/jira/browse/YARN-2699 Project: Hadoop YARN Issue Type: Sub-task Components: client Reporter: Wangda Tan Assignee: Wangda Tan Priority: Blocker Fix For: 2.6.0 Attachments: YARN-2699-20141016-1.patch, YARN-2699-20141016-2.patch, YARN-2699-20141016-3.patch, YARN-2699-20141016-4.patch Because of the changes in YARN-2500/YARN-2496/YARN-2494, registering a node manager with port=0 is no longer allowed. TestResourceTrackerOnHA#testResourceTrackerOnHA fails because it registers a node manager with port=0. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2676) Timeline authentication filter should add support for proxy user
[ https://issues.apache.org/jira/browse/YARN-2676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14176004#comment-14176004 ] Hudson commented on YARN-2676: -- FAILURE: Integrated in Hadoop-Hdfs-trunk #1905 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1905/]) YARN-2676. Enhanced Timeline auth-filter to support proxy users. Contributed by Zhijie Shen. (vinodkv: rev 39063cd36f96e351e4a6bf0bc2b6185711d4b059) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/client/api/impl/TimelineAuthenticator.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/test/java/org/apache/hadoop/yarn/server/timeline/webapp/TestTimelineWebServices.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/test/java/org/apache/hadoop/yarn/server/timeline/security/TestTimelineAuthenticationFilterInitializer.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/test/java/org/apache/hadoop/yarn/server/applicationhistoryservice/TestApplicationHistoryClientService.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/ApplicationHistoryServer.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/timeline/security/TimelineDelegationTokenSecretManagerService.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/timeline/security/TimelineAuthenticationFilterInitializer.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/test/java/org/apache/hadoop/yarn/server/timeline/webapp/TestTimelineWebServicesWithSSL.java 
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/pom.xml * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/test/java/org/apache/hadoop/yarn/server/applicationhistoryservice/TestApplicationHistoryServer.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/timeline/security/TimelineClientAuthenticationService.java * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/client/api/impl/TimelineClientImpl.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/client/api/impl/TestTimelineAuthenticator.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/webapp/AHSWebApp.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/security/client/YARNDelegationTokenIdentifier.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests/src/test/java/org/apache/hadoop/yarn/server/MiniYARNCluster.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/test/java/org/apache/hadoop/yarn/server/timeline/security/TestTimelineAuthenticationFilter.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/metrics/TestSystemMetricsPublisher.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/timeline/security/TimelineAuthenticationFilter.java Timeline authentication filter should add support for proxy user Key: YARN-2676 URL: https://issues.apache.org/jira/browse/YARN-2676 
Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Reporter: Zhijie Shen Assignee: Zhijie Shen Fix For: 2.6.0 Attachments: YARN-2676.1.patch, YARN-2676.2.patch, YARN-2676.3.patch, YARN-2676.4.patch, YARN-2676.5.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-2710) RM HA tests failed intermittently on trunk
[ https://issues.apache.org/jira/browse/YARN-2710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-2710: - Attachment: org.apache.hadoop.yarn.client.TestResourceTrackerOnHA-output.txt Attached the log from a local run. [~jianhe], could you verify this? RM HA tests failed intermittently on trunk -- Key: YARN-2710 URL: https://issues.apache.org/jira/browse/YARN-2710 Project: Hadoop YARN Issue Type: Bug Components: client Reporter: Wangda Tan Attachments: org.apache.hadoop.yarn.client.TestResourceTrackerOnHA-output.txt Failures like the following can happen in TestApplicationClientProtocolOnHA, TestResourceTrackerOnHA, etc.
{code}
org.apache.hadoop.yarn.client.TestApplicationClientProtocolOnHA
testGetApplicationAttemptsOnHA(org.apache.hadoop.yarn.client.TestApplicationClientProtocolOnHA) Time elapsed: 9.491 sec ERROR!
java.net.ConnectException: Call From asf905.gq1.ygridcore.net/67.195.81.149 to asf905.gq1.ygridcore.net:28032 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
  at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
  at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:599)
  at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
  at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:529)
  at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:493)
  at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:607)
  at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:705)
  at org.apache.hadoop.ipc.Client$Connection.access$2800(Client.java:368)
  at org.apache.hadoop.ipc.Client.getConnection(Client.java:1521)
  at org.apache.hadoop.ipc.Client.call(Client.java:1438)
  at org.apache.hadoop.ipc.Client.call(Client.java:1399)
  at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
  at com.sun.proxy.$Proxy17.getApplicationAttempts(Unknown Source)
  at org.apache.hadoop.yarn.api.impl.pb.client.ApplicationClientProtocolPBClientImpl.getApplicationAttempts(ApplicationClientProtocolPBClientImpl.java:372)
  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
  at java.lang.reflect.Method.invoke(Method.java:597)
  at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:186)
  at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:101)
  at com.sun.proxy.$Proxy18.getApplicationAttempts(Unknown Source)
  at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.getApplicationAttempts(YarnClientImpl.java:583)
  at org.apache.hadoop.yarn.client.TestApplicationClientProtocolOnHA.testGetApplicationAttemptsOnHA(TestApplicationClientProtocolOnHA.java:137)
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2699) Fix test timeout in TestResourceTrackerOnHA#testResourceTrackerOnHA
[ https://issues.apache.org/jira/browse/YARN-2699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14176033#comment-14176033 ] Hudson commented on YARN-2699: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #1930 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1930/]) YARN-2699. Fixed a bug in CommonNodeLabelsManager that caused tests to fail when using ephemeral ports on NodeIDs. Contributed by Wangda Tan. (vinodkv: rev abae63caf9c53b404f2f2db7d482555484eaeaf8) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/nodelabels/TestRMNodeLabelsManager.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/nodelabels/CommonNodeLabelsManager.java * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/nodelabels/RMNodeLabelsManager.java Fix test timeout in TestResourceTrackerOnHA#testResourceTrackerOnHA --- Key: YARN-2699 URL: https://issues.apache.org/jira/browse/YARN-2699 Project: Hadoop YARN Issue Type: Sub-task Components: client Reporter: Wangda Tan Assignee: Wangda Tan Priority: Blocker Fix For: 2.6.0 Attachments: YARN-2699-20141016-1.patch, YARN-2699-20141016-2.patch, YARN-2699-20141016-3.patch, YARN-2699-20141016-4.patch Because of the changes in YARN-2500/YARN-2496/YARN-2494, registering a node manager with port=0 is no longer allowed. TestResourceTrackerOnHA#testResourceTrackerOnHA fails because it registers a node manager with port = 0. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2705) Changes of RM node label manager default configuration
[ https://issues.apache.org/jira/browse/YARN-2705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14176032#comment-14176032 ] Hudson commented on YARN-2705: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #1930 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1930/]) YARN-2705. Fixed bugs in ResourceManager node-label manager that were causing test-failures: added a dummy in-memory labels-manager. Contributed by Wangda Tan. (vinodkv: rev e9c66e8fd2ccb658db2848e1ab911f1502de4de5) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/nodelabels/MemoryRMNodeLabelsManager.java * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/nodelabels/DummyRMNodeLabelsManager.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/nodelabels/FileSystemNodeLabelsStore.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/TestRMWebApp.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestContainerAllocation.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/nodelabels/CommonNodeLabelsManager.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestCapacityScheduler.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/nodelabels/TestRMNodeLabelsManager.java * 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/MockRM.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/nodelabels/TestFileSystemNodeLabelsStore.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceManager.java Changes of RM node label manager default configuration -- Key: YARN-2705 URL: https://issues.apache.org/jira/browse/YARN-2705 Project: Hadoop YARN Issue Type: Sub-task Components: api, client, resourcemanager Reporter: Wangda Tan Assignee: Wangda Tan Priority: Critical Fix For: 2.6.0 Attachments: YARN-2705-20141017-1.patch, YARN-2705-20141017-2.patch
1) Add yarn.node-labels.manager-class; by default it will not store anything to the file system
2) Use the above at least in some places: RMNodeLabelsManager, RMAdminCLI. Convert {{DummyNodeLabelsManager}} into a {{MemoryNodeLabelsManager}}
3) Document that RM configs and client configs for yarn.node-labels.manager-class should match
4) Rename fs-store.uri to fs-store.root-dir
5) Similarly for FS_NODE_LABELS_STORE_URI
6) For the default value of fs-store.uri, put it under /tmp, but create /tmp/hadoop-yarn-$\{user\}/node-labels
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2588) Standby RM does not transitionToActive if previous transitionToActive is failed with ZK exception.
[ https://issues.apache.org/jira/browse/YARN-2588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14176030#comment-14176030 ] Hudson commented on YARN-2588: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #1930 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1930/]) YARN-2588. Standby RM fails to transitionToActive if previous transitionToActive failed with ZK exception. Contributed by Rohith Sharmaks (jianhe: rev a6aa6e42cacdbfcc1c2b7c19e7239204fe9ff654) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestRMHA.java * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceManager.java Standby RM does not transitionToActive if previous transitionToActive is failed with ZK exception. -- Key: YARN-2588 URL: https://issues.apache.org/jira/browse/YARN-2588 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 3.0.0, 2.6.0, 2.5.1 Reporter: Rohith Assignee: Rohith Fix For: 2.6.0 Attachments: YARN-2588.1.patch, YARN-2588.2.patch, YARN-2588.patch Consider the scenario where the standby RM fails to transition to Active because of a ZK exception (ConnectionLoss or SessionExpired). After that, any further transition to Active for the same RM does not move the RM to the Active state. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2676) Timeline authentication filter should add support for proxy user
[ https://issues.apache.org/jira/browse/YARN-2676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14176035#comment-14176035 ] Hudson commented on YARN-2676: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #1930 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1930/]) YARN-2676. Enhanced Timeline auth-filter to support proxy users. Contributed by Zhijie Shen. (vinodkv: rev 39063cd36f96e351e4a6bf0bc2b6185711d4b059) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/client/api/impl/TimelineClientImpl.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/security/client/YARNDelegationTokenIdentifier.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/timeline/security/TimelineDelegationTokenSecretManagerService.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/test/java/org/apache/hadoop/yarn/server/timeline/webapp/TestTimelineWebServices.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/timeline/security/TimelineClientAuthenticationService.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/test/java/org/apache/hadoop/yarn/server/timeline/security/TestTimelineAuthenticationFilterInitializer.java * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/client/api/impl/TestTimelineAuthenticator.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/test/java/org/apache/hadoop/yarn/server/applicationhistoryservice/TestApplicationHistoryServer.java * 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/pom.xml * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/timeline/security/TimelineAuthenticationFilterInitializer.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/test/java/org/apache/hadoop/yarn/server/timeline/webapp/TestTimelineWebServicesWithSSL.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/test/java/org/apache/hadoop/yarn/server/timeline/security/TestTimelineAuthenticationFilter.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/client/api/impl/TimelineAuthenticator.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/timeline/security/TimelineAuthenticationFilter.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/metrics/TestSystemMetricsPublisher.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/webapp/AHSWebApp.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/test/java/org/apache/hadoop/yarn/server/applicationhistoryservice/TestApplicationHistoryClientService.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/ApplicationHistoryServer.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests/src/test/java/org/apache/hadoop/yarn/server/MiniYARNCluster.java Timeline authentication filter should add 
support for proxy user Key: YARN-2676 URL: https://issues.apache.org/jira/browse/YARN-2676 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Reporter: Zhijie Shen Assignee: Zhijie Shen Fix For: 2.6.0 Attachments: YARN-2676.1.patch, YARN-2676.2.patch, YARN-2676.3.patch, YARN-2676.4.patch, YARN-2676.5.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-1879) Mark Idempotent/AtMostOnce annotations to ApplicationMasterProtocol for RM fail over
[ https://issues.apache.org/jira/browse/YARN-1879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14176029#comment-14176029 ] Hudson commented on YARN-1879: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #1930 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1930/]) YARN-1879. Marked Idempotent/AtMostOnce annotations to ApplicationMasterProtocol for RM fail over. Contributed by Tsuyoshi OZAWA (jianhe: rev c3de2412eb7633ff16c67e71e73bbe27a982d984) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestApplicationMasterService.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/MockAM.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/ApplicationMasterProtocol.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/ProtocolHATestBase.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ApplicationMasterService.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/TestApplicationMasterServiceOnHA.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestWorkPreservingRMRestart.java * hadoop-yarn-project/CHANGES.txt Mark Idempotent/AtMostOnce annotations to ApplicationMasterProtocol for RM fail over Key: YARN-1879 URL: https://issues.apache.org/jira/browse/YARN-1879 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Reporter: Jian He Assignee: Tsuyoshi OZAWA Priority: Critical Fix For: 2.6.0 Attachments: YARN-1879.1.patch, YARN-1879.1.patch, YARN-1879.11.patch, 
YARN-1879.12.patch, YARN-1879.13.patch, YARN-1879.14.patch, YARN-1879.15.patch, YARN-1879.16.patch, YARN-1879.17.patch, YARN-1879.18.patch, YARN-1879.19.patch, YARN-1879.2-wip.patch, YARN-1879.2.patch, YARN-1879.20.patch, YARN-1879.21.patch, YARN-1879.22.patch, YARN-1879.23.patch, YARN-1879.23.patch, YARN-1879.24.patch, YARN-1879.25.patch, YARN-1879.26.patch, YARN-1879.27.patch, YARN-1879.28.patch, YARN-1879.29.patch, YARN-1879.3.patch, YARN-1879.4.patch, YARN-1879.5.patch, YARN-1879.6.patch, YARN-1879.7.patch, YARN-1879.8.patch, YARN-1879.9.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2504) Support get/add/remove/change labels in RM admin CLI
[ https://issues.apache.org/jira/browse/YARN-2504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14176083#comment-14176083 ] Vinod Kumar Vavilapalli commented on YARN-2504: --- +1, looks good. Checking this in. Support get/add/remove/change labels in RM admin CLI - Key: YARN-2504 URL: https://issues.apache.org/jira/browse/YARN-2504 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Reporter: Wangda Tan Assignee: Wangda Tan Priority: Critical Attachments: YARN-2504-20141015-1.patch, YARN-2504-20141016-1.patch, YARN-2504-20141016-2.patch, YARN-2504-20141016-3.patch, YARN-2504-20141017-1.patch, YARN-2504-20141017-2.patch, YARN-2504-20141017-3.patch, YARN-2504-20141017-4.patch, YARN-2504-20141017-4.patch, YARN-2504.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2504) Support get/add/remove/change labels in RM admin CLI
[ https://issues.apache.org/jira/browse/YARN-2504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14176085#comment-14176085 ] Hudson commented on YARN-2504: -- FAILURE: Integrated in Hadoop-trunk-Commit #6288 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/6288/]) YARN-2504. Enhanced RM Admin CLI to support management of node-labels. Contributed by Wangda Tan. (vinodkv: rev 82567664988b673f1b819a42a4baf31cb0dcb331) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/server/api/ResourceManagerAdministrationProtocol.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/proto/server/resourcemanager_administration_protocol.proto * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/nodelabels/DummyCommonNodeLabelsManager.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/AdminService.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/server/api/impl/pb/service/ResourceManagerAdministrationProtocolPBServiceImpl.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/server/api/impl/pb/client/ResourceManagerAdministrationProtocolPBClientImpl.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/nodelabels/CommonNodeLabelsManager.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/TestRMAdminCLI.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/cli/RMAdminCLI.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/cli/TestRMAdminCLI.java * hadoop-yarn-project/CHANGES.txt Support get/add/remove/change labels in RM admin CLI - Key: YARN-2504 URL: 
https://issues.apache.org/jira/browse/YARN-2504 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Reporter: Wangda Tan Assignee: Wangda Tan Priority: Critical Fix For: 2.6.0 Attachments: YARN-2504-20141015-1.patch, YARN-2504-20141016-1.patch, YARN-2504-20141016-2.patch, YARN-2504-20141016-3.patch, YARN-2504-20141017-1.patch, YARN-2504-20141017-2.patch, YARN-2504-20141017-3.patch, YARN-2504-20141017-4.patch, YARN-2504-20141017-4.patch, YARN-2504.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2701) Potential race condition in startLocalizer when using LinuxContainerExecutor
[ https://issues.apache.org/jira/browse/YARN-2701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14176202#comment-14176202 ] zhihai xu commented on YARN-2701: - Thanks [~xgong]. I want to discuss check_permission: do we need to check the directory permission at all? Before the YARN-2161 patch was committed, the code didn't check the permission.
{code}
if (mkdirat(cwd, token, perm) != 0) {
  /* EEXIST is fine: the directory is already there */
  if (errno != EEXIST) {
    fprintf(LOGFILE, "Can't create directory %s in %s - %s\n", token, path, strerror(errno));
    close(cwd);
    free(buffer);
    return -1;
  }
}
{code}
If we do want to check the permission, can we change it when it doesn't match?
{code}
int check_permission(char* path, mode_t filePerm, mode_t desired) {
  /* compare only the rwx bits for user, group and other */
  int filePermInt = filePerm & (S_IRWXU | S_IRWXG | S_IRWXO);
  int desiredInt = desired & (S_IRWXU | S_IRWXG | S_IRWXO);
  if (filePermInt != desiredInt) {
    /* keep the existing setuid/setgid/sticky bits while fixing the rwx bits */
    if (chmod(path, desiredInt | (filePerm & (S_ISUID | S_ISGID | S_ISVTX))) != 0) {
      return -1;
    }
  }
  return 0;
}
{code}
Potential race condition in startLocalizer when using LinuxContainerExecutor -- Key: YARN-2701 URL: https://issues.apache.org/jira/browse/YARN-2701 Project: Hadoop YARN Issue Type: Bug Reporter: Xuan Gong Assignee: Xuan Gong Priority: Blocker Attachments: YARN-2701.1.patch, YARN-2701.2.patch, YARN-2701.3.patch When LinuxContainerExecutor runs startLocalizer, it uses the native code in container-executor.c.
{code}
if (stat(npath, &sb) != 0) {
  if (mkdir(npath, perm) != 0) {
{code}
This is a check-then-create approach to creating the appDir under /usercache. If two containers try to do this at the same time, a race condition may occur: both can see that the directory is missing, and the one whose mkdir runs second fails.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
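The check-then-create race discussed in YARN-2701 can be illustrated with a minimal C sketch. This is not the code from container-executor.c or from any of the attached patches; ensure_dir is a hypothetical helper introduced here only for illustration. It assumes the fix direction suggested in the comment above: call mkdir first and treat EEXIST as success, so that two containers racing to create the same directory both succeed.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>

/*
 * Hypothetical helper (an assumption, not part of container-executor.c):
 * create `path` with mode `perm`, treating "already exists as a directory"
 * as success. Returns 0 on success, -1 on failure.
 */
int ensure_dir(const char *path, mode_t perm) {
  struct stat sb;

  /* Create first instead of stat-then-mkdir: there is no window between
   * a check and the creation for another process to slip into. */
  if (mkdir(path, perm) == 0) {
    return 0; /* this caller created the directory */
  }
  if (errno != EEXIST) {
    fprintf(stderr, "Can't create directory %s - %s\n", path, strerror(errno));
    return -1;
  }
  /* Another caller won the race; confirm the existing entry is a directory. */
  if (stat(path, &sb) != 0 || !S_ISDIR(sb.st_mode)) {
    fprintf(stderr, "%s exists but is not a usable directory\n", path);
    return -1;
  }
  return 0;
}
```

Calling ensure_dir twice on the same path returns 0 both times, which is the behaviour two concurrent localizers would need. The mode actually applied by mkdir is still filtered by the process umask, so a permission check or chmod such as the check_permission idea above would remain a separate step.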
[jira] [Commented] (YARN-2190) Provide a Windows container executor that can limit memory and CPU
[ https://issues.apache.org/jira/browse/YARN-2190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14176209#comment-14176209 ] Ivan Mitic commented on YARN-2190: -- Thanks Chuan, the latest patch looks good to me. A few conflicting changes went in; please resolve them when you get a chance and we can move forward with the patch. Provide a Windows container executor that can limit memory and CPU -- Key: YARN-2190 URL: https://issues.apache.org/jira/browse/YARN-2190 Project: Hadoop YARN Issue Type: New Feature Components: nodemanager Reporter: Chuan Liu Assignee: Chuan Liu Attachments: YARN-2190-prototype.patch, YARN-2190.1.patch, YARN-2190.2.patch, YARN-2190.3.patch, YARN-2190.4.patch, YARN-2190.5.patch, YARN-2190.6.patch The default YARN container executor on Windows currently does not set resource limits on containers. The memory limit is enforced by a separate monitoring thread. The container implementation on Windows currently uses Job Objects. The latest Windows (8 or later) API allows CPU and memory limits on job objects. We want to create a Windows container executor that sets the limits on job objects and thus provides resource enforcement at the OS level. http://msdn.microsoft.com/en-us/library/windows/desktop/ms686216(v=vs.85).aspx -- This message was sent by Atlassian JIRA (v6.3.4#6332)