[jira] [Commented] (YARN-2710) RM HA tests failed intermittently on trunk
[ https://issues.apache.org/jira/browse/YARN-2710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14176288#comment-14176288 ] Tsuyoshi OZAWA commented on YARN-2710: -- [~leftnoteasy], it passes on my local machine too. I checked the log you attached: it failed because an EOFException occurred. An EOFException can happen when different protobuf formats are mixed. Could you retry the test after {{mvn clean}}? It sometimes resolves the problem. RM HA tests failed intermittently on trunk -- Key: YARN-2710 URL: https://issues.apache.org/jira/browse/YARN-2710 Project: Hadoop YARN Issue Type: Bug Components: client Reporter: Wangda Tan Attachments: org.apache.hadoop.yarn.client.TestResourceTrackerOnHA-output.txt Failures like the following can happen in TestApplicationClientProtocolOnHA, TestResourceTrackerOnHA, etc. {code} org.apache.hadoop.yarn.client.TestApplicationClientProtocolOnHA testGetApplicationAttemptsOnHA(org.apache.hadoop.yarn.client.TestApplicationClientProtocolOnHA) Time elapsed: 9.491 sec ERROR! 
java.net.ConnectException: Call From asf905.gq1.ygridcore.net/67.195.81.149 to asf905.gq1.ygridcore.net:28032 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:599) at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206) at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:529) at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:493) at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:607) at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:705) at org.apache.hadoop.ipc.Client$Connection.access$2800(Client.java:368) at org.apache.hadoop.ipc.Client.getConnection(Client.java:1521) at org.apache.hadoop.ipc.Client.call(Client.java:1438) at org.apache.hadoop.ipc.Client.call(Client.java:1399) at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232) at com.sun.proxy.$Proxy17.getApplicationAttempts(Unknown Source) at org.apache.hadoop.yarn.api.impl.pb.client.ApplicationClientProtocolPBClientImpl.getApplicationAttempts(ApplicationClientProtocolPBClientImpl.java:372) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:186) at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:101) at com.sun.proxy.$Proxy18.getApplicationAttempts(Unknown Source) at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.getApplicationAttempts(YarnClientImpl.java:583) at 
org.apache.hadoop.yarn.client.TestApplicationClientProtocolOnHA.testGetApplicationAttemptsOnHA(TestApplicationClientProtocolOnHA.java:137) {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2504) Support get/add/remove/change labels in RM admin CLI
[ https://issues.apache.org/jira/browse/YARN-2504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14176291#comment-14176291 ] Hudson commented on YARN-2504: -- SUCCESS: Integrated in Hadoop-Yarn-trunk #717 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/717/]) YARN-2504. Enhanced RM Admin CLI to support management of node-labels. Contributed by Wangda Tan. (vinodkv: rev 82567664988b673f1b819a42a4baf31cb0dcb331) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/cli/TestRMAdminCLI.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/nodelabels/DummyCommonNodeLabelsManager.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/TestRMAdminCLI.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/cli/RMAdminCLI.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/nodelabels/CommonNodeLabelsManager.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/server/api/impl/pb/service/ResourceManagerAdministrationProtocolPBServiceImpl.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/proto/server/resourcemanager_administration_protocol.proto * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/AdminService.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/server/api/ResourceManagerAdministrationProtocol.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/server/api/impl/pb/client/ResourceManagerAdministrationProtocolPBClientImpl.java * hadoop-yarn-project/CHANGES.txt Support get/add/remove/change labels in RM admin CLI - Key: YARN-2504 URL: 
https://issues.apache.org/jira/browse/YARN-2504 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Reporter: Wangda Tan Assignee: Wangda Tan Priority: Critical Fix For: 2.6.0 Attachments: YARN-2504-20141015-1.patch, YARN-2504-20141016-1.patch, YARN-2504-20141016-2.patch, YARN-2504-20141016-3.patch, YARN-2504-20141017-1.patch, YARN-2504-20141017-2.patch, YARN-2504-20141017-3.patch, YARN-2504-20141017-4.patch, YARN-2504-20141017-4.patch, YARN-2504.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2504) Support get/add/remove/change labels in RM admin CLI
[ https://issues.apache.org/jira/browse/YARN-2504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14176318#comment-14176318 ] Hudson commented on YARN-2504: -- FAILURE: Integrated in Hadoop-Hdfs-trunk #1906 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1906/]) YARN-2504. Enhanced RM Admin CLI to support management of node-labels. Contributed by Wangda Tan. (vinodkv: rev 82567664988b673f1b819a42a4baf31cb0dcb331) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/AdminService.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/server/api/ResourceManagerAdministrationProtocol.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/nodelabels/DummyCommonNodeLabelsManager.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/server/api/impl/pb/client/ResourceManagerAdministrationProtocolPBClientImpl.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/cli/TestRMAdminCLI.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/nodelabels/CommonNodeLabelsManager.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/cli/RMAdminCLI.java * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/server/api/impl/pb/service/ResourceManagerAdministrationProtocolPBServiceImpl.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/proto/server/resourcemanager_administration_protocol.proto * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/TestRMAdminCLI.java Support get/add/remove/change labels in RM admin CLI - Key: YARN-2504 URL: 
https://issues.apache.org/jira/browse/YARN-2504 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Reporter: Wangda Tan Assignee: Wangda Tan Priority: Critical Fix For: 2.6.0 Attachments: YARN-2504-20141015-1.patch, YARN-2504-20141016-1.patch, YARN-2504-20141016-2.patch, YARN-2504-20141016-3.patch, YARN-2504-20141017-1.patch, YARN-2504-20141017-2.patch, YARN-2504-20141017-3.patch, YARN-2504-20141017-4.patch, YARN-2504-20141017-4.patch, YARN-2504.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-1879) Mark Idempotent/AtMostOnce annotations to ApplicationMasterProtocol for RM fail over
[ https://issues.apache.org/jira/browse/YARN-1879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14176320#comment-14176320 ] Tsuyoshi OZAWA commented on YARN-1879: -- Thanks Jian, Karthik, Vinod, Xuan and Anubhav for reviews and comments! Mark Idempotent/AtMostOnce annotations to ApplicationMasterProtocol for RM fail over Key: YARN-1879 URL: https://issues.apache.org/jira/browse/YARN-1879 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Reporter: Jian He Assignee: Tsuyoshi OZAWA Priority: Critical Fix For: 2.6.0 Attachments: YARN-1879.1.patch, YARN-1879.1.patch, YARN-1879.11.patch, YARN-1879.12.patch, YARN-1879.13.patch, YARN-1879.14.patch, YARN-1879.15.patch, YARN-1879.16.patch, YARN-1879.17.patch, YARN-1879.18.patch, YARN-1879.19.patch, YARN-1879.2-wip.patch, YARN-1879.2.patch, YARN-1879.20.patch, YARN-1879.21.patch, YARN-1879.22.patch, YARN-1879.23.patch, YARN-1879.23.patch, YARN-1879.24.patch, YARN-1879.25.patch, YARN-1879.26.patch, YARN-1879.27.patch, YARN-1879.28.patch, YARN-1879.29.patch, YARN-1879.3.patch, YARN-1879.4.patch, YARN-1879.5.patch, YARN-1879.6.patch, YARN-1879.7.patch, YARN-1879.8.patch, YARN-1879.9.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2504) Support get/add/remove/change labels in RM admin CLI
[ https://issues.apache.org/jira/browse/YARN-2504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14176323#comment-14176323 ] Hudson commented on YARN-2504: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #1931 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1931/]) YARN-2504. Enhanced RM Admin CLI to support management of node-labels. Contributed by Wangda Tan. (vinodkv: rev 82567664988b673f1b819a42a4baf31cb0dcb331) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/nodelabels/CommonNodeLabelsManager.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/AdminService.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/cli/RMAdminCLI.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/server/api/impl/pb/client/ResourceManagerAdministrationProtocolPBClientImpl.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/server/api/impl/pb/service/ResourceManagerAdministrationProtocolPBServiceImpl.java * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/proto/server/resourcemanager_administration_protocol.proto * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/server/api/ResourceManagerAdministrationProtocol.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/cli/TestRMAdminCLI.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/nodelabels/DummyCommonNodeLabelsManager.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/TestRMAdminCLI.java Support get/add/remove/change labels in RM admin CLI - Key: YARN-2504 URL: 
https://issues.apache.org/jira/browse/YARN-2504 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Reporter: Wangda Tan Assignee: Wangda Tan Priority: Critical Fix For: 2.6.0 Attachments: YARN-2504-20141015-1.patch, YARN-2504-20141016-1.patch, YARN-2504-20141016-2.patch, YARN-2504-20141016-3.patch, YARN-2504-20141017-1.patch, YARN-2504-20141017-2.patch, YARN-2504-20141017-3.patch, YARN-2504-20141017-4.patch, YARN-2504-20141017-4.patch, YARN-2504.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2710) RM HA tests failed intermittently on trunk
[ https://issues.apache.org/jira/browse/YARN-2710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14176324#comment-14176324 ] haosdent commented on YARN-2710: This case also passes on my local machine. I ran {{mvn clean test -Dtest=TestApplicationClientProtocolOnHA}}. RM HA tests failed intermittently on trunk -- Key: YARN-2710 URL: https://issues.apache.org/jira/browse/YARN-2710 Project: Hadoop YARN Issue Type: Bug Components: client Reporter: Wangda Tan Attachments: org.apache.hadoop.yarn.client.TestResourceTrackerOnHA-output.txt -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2669) FairScheduler: queueName shouldn't allow periods the allocation.xml
[ https://issues.apache.org/jira/browse/YARN-2669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14176327#comment-14176327 ] bc Wong commented on YARN-2669: --- Thanks for the patch, Wei! What if the username has a period in it, and FS is configured to take the username as the queue name? Is there a separate jira tracking that? FairScheduler: queueName shouldn't allow periods the allocation.xml --- Key: YARN-2669 URL: https://issues.apache.org/jira/browse/YARN-2669 Project: Hadoop YARN Issue Type: Improvement Reporter: Wei Yan Assignee: Wei Yan Priority: Minor Attachments: YARN-2669-1.patch For an allocation file like: {noformat} <allocations> <queue name="root.q1"> <minResources>4096mb,4vcores</minResources> </queue> </allocations> {noformat} Users may wish to configure minResources for a queue with full path root.q1. However, right now, the fair scheduler will apply this configuration to the queue with full name root.root.q1. We need to print out a warning message to notify users about this. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
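The naming behaviour described in YARN-2669 can be illustrated with a small sketch. It assumes, as the description suggests, that a queue's full name is formed by joining the parent's full name and the configured name with a period. The class QueueNameSketch and its methods fullName and hasPeriod are hypothetical names used only for illustration; they are not FairScheduler APIs and not the contents of YARN-2669-1.patch.

```java
public class QueueNameSketch {

    // Hypothetical: join the parent's full name and the configured name with a period.
    static String fullName(String parentFullName, String configuredName) {
        return parentFullName + "." + configuredName;
    }

    // Hypothetical check that a loader could run before accepting a configured name.
    static boolean hasPeriod(String configuredName) {
        return configuredName.indexOf('.') >= 0;
    }

    public static void main(String[] args) {
        String configured = "root.q1";
        if (hasPeriod(configured)) {
            // Under the assumption above, "root.q1" under parent "root" becomes "root.root.q1".
            System.err.println("WARN: queue name '" + configured
                + "' contains a period and would be read as '"
                + fullName("root", configured) + "'");
        }
    }
}
```

A check of this kind would let the loader warn (or reject the file) before the unexpected queue root.root.q1 is created.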
[jira] [Updated] (YARN-2706) Math.abs() is called on random integer in DefaultContainerExecutor#getWorkingDir()
[ https://issues.apache.org/jira/browse/YARN-2706?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] haosdent updated YARN-2706: --- Attachment: YARN-2706.patch Math.abs() is called on random integer in DefaultContainerExecutor#getWorkingDir() -- Key: YARN-2706 URL: https://issues.apache.org/jira/browse/YARN-2706 Project: Hadoop YARN Issue Type: Bug Reporter: Ted Yu Priority: Minor Attachments: YARN-2706.patch Here is the code: {code} long randomPosition = Math.abs(r.nextLong()) % totalAvailable; {code} See http://stackoverflow.com/questions/7567350/findbugs-rv-absolute-value-of-random-int-warning -- This message was sent by Atlassian JIRA (v6.3.4#6332)
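The problem reported in YARN-2706 comes from the fact that Math.abs(Long.MIN_VALUE) is still negative, so the remainder can be negative as well. The sketch below shows the edge case and one possible way to avoid it by clearing the sign bit. The class RandomPositionSketch and the method randomPosition are hypothetical names for illustration; this is not the content of YARN-2706.patch.

```java
import java.util.Random;

public class RandomPositionSketch {

    // Hypothetical helper: returns a value in [0, totalAvailable).
    static long randomPosition(Random r, long totalAvailable) {
        if (totalAvailable <= 0) {
            throw new IllegalArgumentException("totalAvailable must be positive");
        }
        // Clearing the sign bit gives a value in [0, Long.MAX_VALUE],
        // so the remainder can never be negative.
        return (r.nextLong() & Long.MAX_VALUE) % totalAvailable;
    }

    public static void main(String[] args) {
        // |Long.MIN_VALUE| does not fit in a long, so Math.abs returns it unchanged.
        System.out.println(Math.abs(Long.MIN_VALUE) < 0); // prints "true"

        Random r = new Random();
        System.out.println(randomPosition(r, 100L)); // some value in [0, 100)
    }
}
```

On Java 8 and later, Math.floorMod(r.nextLong(), totalAvailable) would be another option; the mask is used here because it works on older Java versions too.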
[jira] [Updated] (YARN-2697) RMAuthenticationHandler is no longer useful
[ https://issues.apache.org/jira/browse/YARN-2697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] haosdent updated YARN-2697: --- Attachment: YARN-2697.patch RMAuthenticationHandler is no longer useful --- Key: YARN-2697 URL: https://issues.apache.org/jira/browse/YARN-2697 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Reporter: Zhijie Shen Attachments: YARN-2697.patch After YARN-2656, RMAuthenticationHandler is no longer useful, because authentication mechanism is reusing the common DT auth filter stack. It should be safe to remove this unused code. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2706) Math.abs() is called on random integer in DefaultContainerExecutor#getWorkingDir()
[ https://issues.apache.org/jira/browse/YARN-2706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14176347#comment-14176347 ] Hadoop QA commented on YARN-2706: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12675726/YARN-2706.patch against trunk revision 8256766. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/5455//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/5455//console This message is automatically generated. 
Math.abs() is called on random integer in DefaultContainerExecutor#getWorkingDir() -- Key: YARN-2706 URL: https://issues.apache.org/jira/browse/YARN-2706 Project: Hadoop YARN Issue Type: Bug Reporter: Ted Yu Assignee: haosdent Priority: Minor Attachments: YARN-2706.patch Here is the code: {code} long randomPosition = Math.abs(r.nextLong()) % totalAvailable; {code} See http://stackoverflow.com/questions/7567350/findbugs-rv-absolute-value-of-random-int-warning -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2681) Support bandwidth enforcement for containers while reading from HDFS
[ https://issues.apache.org/jira/browse/YARN-2681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14176350#comment-14176350 ] haosdent commented on YARN-2681: If data is transferred over a domain socket, we can't use tc. Support bandwidth enforcement for containers while reading from HDFS Key: YARN-2681 URL: https://issues.apache.org/jira/browse/YARN-2681 Project: Hadoop YARN Issue Type: New Feature Components: capacityscheduler, nodemanager, resourcemanager Affects Versions: 2.4.0 Environment: Linux Reporter: cntic Attachments: Traffic Control Design.png To read/write data from HDFS on a data node, applications establish TCP/IP connections with the datanode. The HDFS read can be controlled by setting up the Linux Traffic Control (TC) subsystem on the data node to make filters on appropriate connections. The current cgroups net_cls concept cannot be applied on the node where the container is launched, nor on the data node, since: - TC handles outgoing bandwidth only, so it cannot be set on the container node (HDFS read = incoming data for the container) - Since the HDFS data node is handled by only one process, it is not possible to use net_cls to separate connections from different containers to the datanode. Tasks: 1) Extend Resource model to define bandwidth enforcement rate 2) Monitor TCP/IP connections established by the container handling process and its child processes 3) Set Linux Traffic Control rules on the data node based on address:port pairs in order to enforce bandwidth of outgoing data -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2669) FairScheduler: queueName shouldn't allow periods the allocation.xml
[ https://issues.apache.org/jira/browse/YARN-2669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14176355#comment-14176355 ] Wei Yan commented on YARN-2669: --- [~bcwalrus], I don't think we have a jira handling periods in usernames. Currently, if a username contains a period, like A.B, FS treats it as two queues, root.A and root.A.B. I will fix it in this jira. FairScheduler: queueName shouldn't allow periods the allocation.xml --- Key: YARN-2669 URL: https://issues.apache.org/jira/browse/YARN-2669 Project: Hadoop YARN Issue Type: Improvement Reporter: Wei Yan Assignee: Wei Yan Priority: Minor Attachments: YARN-2669-1.patch For an allocation file like: {noformat} <allocations> <queue name="root.q1"> <minResources>4096mb,4vcores</minResources> </queue> </allocations> {noformat} Users may wish to configure minResources for a queue with full path root.q1. However, right now, the fair scheduler will apply this configuration to the queue with full name root.root.q1. We need to print out a warning message to notify users about this. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2697) RMAuthenticationHandler is no longer useful
[ https://issues.apache.org/jira/browse/YARN-2697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14176372#comment-14176372 ] Hadoop QA commented on YARN-2697: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12675727/YARN-2697.patch against trunk revision 8256766. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/5456//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/5456//console This message is automatically generated. 
RMAuthenticationHandler is no longer useful --- Key: YARN-2697 URL: https://issues.apache.org/jira/browse/YARN-2697 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Reporter: Zhijie Shen Assignee: haosdent Attachments: YARN-2697.patch After YARN-2656, RMAuthenticationHandler is no longer useful, because authentication mechanism is reusing the common DT auth filter stack. It should be safe to remove this unused code. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2710) RM HA tests failed intermittently on trunk
[ https://issues.apache.org/jira/browse/YARN-2710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14176421#comment-14176421 ] Tsuyoshi OZAWA commented on YARN-2710: -- BTW, YARN-2398 is addressing intermittent failure of TestResourceTrackerOnHA. RM HA tests failed intermittently on trunk -- Key: YARN-2710 URL: https://issues.apache.org/jira/browse/YARN-2710 Project: Hadoop YARN Issue Type: Bug Components: client Reporter: Wangda Tan Attachments: org.apache.hadoop.yarn.client.TestResourceTrackerOnHA-output.txt -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (YARN-2710) RM HA tests failed intermittently on trunk
[ https://issues.apache.org/jira/browse/YARN-2710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsuyoshi OZAWA resolved YARN-2710. -- Resolution: Duplicate RM HA tests failed intermittently on trunk -- Key: YARN-2710 URL: https://issues.apache.org/jira/browse/YARN-2710 Project: Hadoop YARN Issue Type: Bug Components: client Reporter: Wangda Tan Attachments: org.apache.hadoop.yarn.client.TestResourceTrackerOnHA-output.txt -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2710) RM HA tests failed intermittently on trunk
[ https://issues.apache.org/jira/browse/YARN-2710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14176424#comment-14176424 ] Tsuyoshi OZAWA commented on YARN-2710: -- Closing this issue as dup of YARN-2398. RM HA tests failed intermittently on trunk -- Key: YARN-2710 URL: https://issues.apache.org/jira/browse/YARN-2710 Project: Hadoop YARN Issue Type: Bug Components: client Reporter: Wangda Tan Attachments: org.apache.hadoop.yarn.client.TestResourceTrackerOnHA-output.txt -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-2673) Add retry for timeline client put APIs
[ https://issues.apache.org/jira/browse/YARN-2673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Li Lu updated YARN-2673:

Attachment: YARN-2673-101914.patch

Hi [~zjshen], thanks for your review! I addressed your comments and rebased the patch onto the latest trunk. If you have time, please feel free to take a look. Thanks!

Add retry for timeline client put APIs

Key: YARN-2673
URL: https://issues.apache.org/jira/browse/YARN-2673
Project: Hadoop YARN
Issue Type: Sub-task
Reporter: Li Lu
Assignee: Li Lu
Attachments: YARN-2673-101414-1.patch, YARN-2673-101414-2.patch, YARN-2673-101414.patch, YARN-2673-101714.patch, YARN-2673-101914.patch

The timeline client currently does not handle a server outage gracefully. Jobs from distributed shell may fail due to an ATS restart. We may need to add some retry mechanisms to the client.
[jira] [Commented] (YARN-2673) Add retry for timeline client put APIs
[ https://issues.apache.org/jira/browse/YARN-2673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14176523#comment-14176523 ]

Hadoop QA commented on YARN-2673:

{color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12675753/YARN-2673-101914.patch against trunk revision 7bbda6e.

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 javadoc{color}. There were no new javadoc warning messages.
{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common.
{color:green}+1 contrib tests{color}. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-YARN-Build/5457//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/5457//console

This message is automatically generated.

Add retry for timeline client put APIs

Key: YARN-2673
URL: https://issues.apache.org/jira/browse/YARN-2673
Project: Hadoop YARN
Issue Type: Sub-task
Reporter: Li Lu
Assignee: Li Lu
Attachments: YARN-2673-101414-1.patch, YARN-2673-101414-2.patch, YARN-2673-101414.patch, YARN-2673-101714.patch, YARN-2673-101914.patch

The timeline client currently does not handle a server outage gracefully. Jobs from distributed shell may fail due to an ATS restart. We may need to add some retry mechanisms to the client.
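
A retry mechanism of the kind the YARN-2673 description asks for could look roughly like the minimal Java sketch below. It is not taken from any of the attached patches: the class `RetrySketch`, the method `callWithRetries`, and the parameters `maxRetries` and `retryIntervalMs` are hypothetical names chosen for illustration. The sketch assumes that a server outage surfaces to the client as an `IOException` (for example `java.net.ConnectException`) and that a fixed sleep between attempts is acceptable.

```java
import java.io.IOException;
import java.util.concurrent.Callable;

/** Hypothetical helper, not part of Hadoop: bounded retry with a fixed delay. */
final class RetrySketch {

    private RetrySketch() {}

    /**
     * Runs op once and, if it throws an IOException, retries it up to
     * maxRetries more times, sleeping retryIntervalMs between attempts.
     * Any other exception type is propagated immediately without a retry.
     */
    static <T> T callWithRetries(Callable<T> op, int maxRetries, long retryIntervalMs)
            throws Exception {
        if (maxRetries < 0) {
            throw new IllegalArgumentException("maxRetries must be >= 0");
        }
        IOException last = null;
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            try {
                return op.call();
            } catch (IOException e) {
                // Assumption: connection failures arrive as IOException
                // (java.net.ConnectException is a subclass).
                last = e;
                if (attempt < maxRetries) {
                    Thread.sleep(retryIntervalMs);
                }
            }
        }
        // All attempts failed; rethrow the most recent failure.
        throw last;
    }
}
```

A caller would wrap the put call in the `Callable`, so that a short server restart is absorbed by the retries while a persistent outage still fails with the last exception. A fixed interval keeps the sketch short; a real client might prefer a configurable or growing delay.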
[jira] [Commented] (YARN-2704) Localization and log-aggregation will fail if hdfs delegation token expired after token-max-life-time
[ https://issues.apache.org/jira/browse/YARN-2704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14176600#comment-14176600 ]

Jian He commented on YARN-2704:

bq. this is really a YARN responsibility and so it should automatically get tokens that are needed

Agreed, we should let YARN automatically request a new HDFS token on behalf of the user for localization and log aggregation. I'm creating a patch based on this idea.

Localization and log-aggregation will fail if hdfs delegation token expired after token-max-life-time

Key: YARN-2704
URL: https://issues.apache.org/jira/browse/YARN-2704
Project: Hadoop YARN
Issue Type: Sub-task
Reporter: Jian He
Assignee: Jian He

In secure mode, YARN requires the HDFS delegation token to do localization and log aggregation on behalf of the user. However, the HDFS delegation token eventually expires after token-max-life-time, so localization and log aggregation will fail once the token has expired.