[jira] [Commented] (YARN-1482) WebApplicationProxy should be always-on w.r.t HA even if it is embedded in the RM
[ https://issues.apache.org/jira/browse/YARN-1482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13864008#comment-13864008 ] Hadoop QA commented on YARN-1482: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12621760/YARN-1482.6.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-web-proxy. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/2811//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2811//console This message is automatically generated. 
WebApplicationProxy should be always-on w.r.t HA even if it is embedded in the RM - Key: YARN-1482 URL: https://issues.apache.org/jira/browse/YARN-1482 Project: Hadoop YARN Issue Type: Sub-task Reporter: Vinod Kumar Vavilapalli Assignee: Xuan Gong Attachments: YARN-1482.1.patch, YARN-1482.2.patch, YARN-1482.3.patch, YARN-1482.4.patch, YARN-1482.4.patch, YARN-1482.5.patch, YARN-1482.5.patch, YARN-1482.6.patch This way, even if an RM goes to standby mode, we can effect a redirect to the active. And more importantly, users will not suddenly see all their links stop working. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1560) TestYarnClient#testAMMRTokens fails with null AMRM token
[ https://issues.apache.org/jira/browse/YARN-1560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13864101#comment-13864101 ] Hudson commented on YARN-1560: -- SUCCESS: Integrated in Hadoop-Yarn-trunk #445 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/445/]) YARN-1560. Fixed TestYarnClient#testAMMRTokens failure with null AMRM token. (Contributed by Ted Yu) (jianhe: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1555975) * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/api/impl/TestYarnClient.java TestYarnClient#testAMMRTokens fails with null AMRM token Key: YARN-1560 URL: https://issues.apache.org/jira/browse/YARN-1560 Project: Hadoop YARN Issue Type: Test Reporter: Ted Yu Assignee: Ted Yu Attachments: yarn-1560-v1.txt, yarn-1560-v2.txt The following can be reproduced locally: {code} testAMMRTokens(org.apache.hadoop.yarn.client.api.impl.TestYarnClient) Time elapsed: 3.341 sec FAILURE! junit.framework.AssertionFailedError: null at junit.framework.Assert.fail(Assert.java:48) at junit.framework.Assert.assertTrue(Assert.java:20) at junit.framework.Assert.assertNotNull(Assert.java:218) at junit.framework.Assert.assertNotNull(Assert.java:211) at org.apache.hadoop.yarn.client.api.impl.TestYarnClient.testAMMRTokens(TestYarnClient.java:382) {code} This test didn't appear in https://builds.apache.org/job/Hadoop-Yarn-trunk/442/consoleFull -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1029) Allow embedding leader election into the RM
[ https://issues.apache.org/jira/browse/YARN-1029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13864106#comment-13864106 ] Hudson commented on YARN-1029: -- SUCCESS: Integrated in Hadoop-Yarn-trunk #445 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/445/]) YARN-1029. Added embedded leader election in the ResourceManager. Contributed by Karthik Kambatla. (vinodkv: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1556103) * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/ActiveStandbyElector.java * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/HAUtil.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/proto/server/yarn_server_resourcemanager_service_protos.proto * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/pom.xml * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/TestRMFailover.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/client/RMHAServiceTarget.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/yarn-default.xml * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/AdminService.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/EmbeddedElectorService.java * 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMFatalEvent.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMFatalEventType.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceManager.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/RMStateStore.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/RMStateStoreOperationFailedEvent.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/RMStateStoreOperationFailedEventType.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/ZKRMStateStore.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/MockRM.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestRMHA.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/TestZKRMStateStore.java * 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/TestZKRMStateStoreZKClientConnections.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests/src/test/java/org/apache/hadoop/yarn/server/MiniYARNCluster.java Allow embedding leader election into the RM --- Key: YARN-1029 URL: https://issues.apache.org/jira/browse/YARN-1029 Project: Hadoop YARN Issue Type: Sub-task Reporter: Bikas Saha Assignee: Karthik Kambatla Fix For: 2.4.0 Attachments: embedded-zkfc-approach.patch, yarn-1029-0.patch, yarn-1029-0.patch, yarn-1029-1.patch, yarn-1029-10.patch, yarn-1029-2.patch, yarn-1029-3.patch, yarn-1029-4.patch, yarn-1029-5.patch, yarn-1029-6.patch, yarn-1029-7.patch, yarn-1029-7.patch, yarn-1029-8.patch, yarn-1029-9.patch,
[jira] [Commented] (YARN-1559) Race between ServerRMProxy and ClientRMProxy setting RMProxy#INSTANCE
[ https://issues.apache.org/jira/browse/YARN-1559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13864103#comment-13864103 ] Hudson commented on YARN-1559: -- SUCCESS: Integrated in Hadoop-Yarn-trunk #445 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/445/]) YARN-1559. Race between ServerRMProxy and ClientRMProxy setting RMProxy#INSTANCE. (kasha and vinodkv via kasha) (kasha: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1555970) * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/dev-support/findbugs-exclude.xml * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/ClientRMProxy.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/client/RMProxy.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/api/ServerRMProxy.java Race between ServerRMProxy and ClientRMProxy setting RMProxy#INSTANCE - Key: YARN-1559 URL: https://issues.apache.org/jira/browse/YARN-1559 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Affects Versions: 2.4.0 Reporter: Karthik Kambatla Assignee: Karthik Kambatla Priority: Blocker Fix For: 2.4.0 Attachments: YARN-1559-20140105.txt, yarn-1559-1.patch, yarn-1559-2.patch, yarn-1559-3.patch RMProxy#INSTANCE is a non-final static field and both ServerRMProxy and ClientRMProxy set it. This leads to races as witnessed on - YARN-1482. 
Sample trace: {noformat} java.lang.IllegalArgumentException: RM does not support this client protocol at com.google.common.base.Preconditions.checkArgument(Preconditions.java:88) at org.apache.hadoop.yarn.client.ClientRMProxy.checkAllowedProtocols(ClientRMProxy.java:119) at org.apache.hadoop.yarn.client.ConfiguredRMFailoverProxyProvider.init(ConfiguredRMFailoverProxyProvider.java:58) at org.apache.hadoop.yarn.client.RMProxy.createRMFailoverProxyProvider(RMProxy.java:158) at org.apache.hadoop.yarn.client.RMProxy.createRMProxy(RMProxy.java:88) at org.apache.hadoop.yarn.server.api.ServerRMProxy.createRMProxy(ServerRMProxy.java:56) {noformat} -- This message was sent by Atlassian JIRA (v6.1.5#6160)
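The race described above boils down to a mutable static singleton assigned by two different subclasses: whichever caller runs second silently clobbers the first caller's proxy. A minimal, self-contained Java model of that failure mode follows; the class names are illustrative stand-ins, not the actual Hadoop sources.

```java
// Hypothetical model of the YARN-1559 bug shape: a non-final static
// field in a base class, written by the static factories of two
// different subclasses. Last writer wins, so unrelated callers can
// observe the wrong proxy type.
class RMProxyModel {
    static RMProxyModel INSTANCE;          // non-final, shared: the bug
    String allowedProtocol() { return "none"; }
}

class ClientProxyModel extends RMProxyModel {
    static RMProxyModel create() {
        INSTANCE = new ClientProxyModel(); // overwrites whatever was there
        return INSTANCE;
    }
    @Override String allowedProtocol() { return "client"; }
}

class ServerProxyModel extends RMProxyModel {
    static RMProxyModel create() {
        INSTANCE = new ServerProxyModel(); // clobbers the client's INSTANCE
        return INSTANCE;
    }
    @Override String allowedProtocol() { return "server"; }
}

public class RaceDemo {
    public static void main(String[] args) {
        ClientProxyModel.create();
        ServerProxyModel.create();
        // Any code that consults the shared INSTANCE now sees the server
        // proxy, even though the client created its proxy first.
        System.out.println(RMProxyModel.INSTANCE.allowedProtocol()); // server
    }
}
```

The committed fix (per the changed files listed above) moves the state out of the shared static field; the sketch only shows why the shared field was unsafe.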
[jira] [Updated] (YARN-1287) Consolidate MockClocks
[ https://issues.apache.org/jira/browse/YARN-1287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Wong updated YARN-1287: - Attachment: (was: YARN-1287-2.patch) Consolidate MockClocks -- Key: YARN-1287 URL: https://issues.apache.org/jira/browse/YARN-1287 Project: Hadoop YARN Issue Type: Improvement Reporter: Sandy Ryza Assignee: Sebastian Wong Labels: newbie Attachments: YARN-1287-3.patch A bunch of different tests have near-identical implementations of MockClock. TestFairScheduler, TestFSSchedulerApp, and TestCgroupsLCEResourcesHandler for example. They should be consolidated into a single MockClock. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (YARN-1287) Consolidate MockClocks
[ https://issues.apache.org/jira/browse/YARN-1287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Wong updated YARN-1287: - Attachment: YARN-1287-3.patch Consolidate MockClocks -- Key: YARN-1287 URL: https://issues.apache.org/jira/browse/YARN-1287 Project: Hadoop YARN Issue Type: Improvement Reporter: Sandy Ryza Assignee: Sebastian Wong Labels: newbie Attachments: YARN-1287-3.patch A bunch of different tests have near-identical implementations of MockClock. TestFairScheduler, TestFSSchedulerApp, and TestCgroupsLCEResourcesHandler for example. They should be consolidated into a single MockClock. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (YARN-1287) Consolidate MockClocks
[ https://issues.apache.org/jira/browse/YARN-1287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Wong updated YARN-1287: - Attachment: (was: YARN-1287-3.patch) Consolidate MockClocks -- Key: YARN-1287 URL: https://issues.apache.org/jira/browse/YARN-1287 Project: Hadoop YARN Issue Type: Improvement Reporter: Sandy Ryza Assignee: Sebastian Wong Labels: newbie A bunch of different tests have near-identical implementations of MockClock. TestFairScheduler, TestFSSchedulerApp, and TestCgroupsLCEResourcesHandler for example. They should be consolidated into a single MockClock. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (YARN-1287) Consolidate MockClocks
[ https://issues.apache.org/jira/browse/YARN-1287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Wong updated YARN-1287: - Attachment: YARN-1287-3.patch Consolidate MockClocks -- Key: YARN-1287 URL: https://issues.apache.org/jira/browse/YARN-1287 Project: Hadoop YARN Issue Type: Improvement Reporter: Sandy Ryza Assignee: Sebastian Wong Labels: newbie Attachments: YARN-1287-3.patch A bunch of different tests have near-identical implementations of MockClock. TestFairScheduler, TestFSSchedulerApp, and TestCgroupsLCEResourcesHandler for example. They should be consolidated into a single MockClock. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1287) Consolidate MockClocks
[ https://issues.apache.org/jira/browse/YARN-1287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13864161#comment-13864161 ] Hadoop QA commented on YARN-1287: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12621778/YARN-1287-3.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 4 new or modified test files. {color:red}-1 javac{color}. The patch appears to cause the build to fail. Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2812//console This message is automatically generated. Consolidate MockClocks -- Key: YARN-1287 URL: https://issues.apache.org/jira/browse/YARN-1287 Project: Hadoop YARN Issue Type: Improvement Reporter: Sandy Ryza Assignee: Sebastian Wong Labels: newbie Attachments: YARN-1287-3.patch A bunch of different tests have near-identical implementations of MockClock. TestFairScheduler, TestFSSchedulerApp, and TestCgroupsLCEResourcesHandler for example. They should be consolidated into a single MockClock. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1287) Consolidate MockClocks
[ https://issues.apache.org/jira/browse/YARN-1287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13864170#comment-13864170 ] Hadoop QA commented on YARN-1287: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12621781/YARN-1287-3.patch against trunk revision . {color:red}-1 patch{color}. Trunk compilation may be broken. Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2813//console This message is automatically generated. Consolidate MockClocks -- Key: YARN-1287 URL: https://issues.apache.org/jira/browse/YARN-1287 Project: Hadoop YARN Issue Type: Improvement Reporter: Sandy Ryza Assignee: Sebastian Wong Labels: newbie Attachments: YARN-1287-3.patch A bunch of different tests have near-identical implementations of MockClock. TestFairScheduler, TestFSSchedulerApp, and TestCgroupsLCEResourcesHandler for example. They should be consolidated into a single MockClock. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
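The consolidation YARN-1287 asks for, replacing several near-identical per-test clocks with one shared mock, can be sketched in a few lines. This is a hypothetical version assuming only the shape of YARN's Clock interface (a single getTime() method); the actual patch may differ.

```java
// Hypothetical consolidated MockClock in the spirit of YARN-1287: one
// settable clock that TestFairScheduler, TestFSSchedulerApp, etc. could
// share instead of each defining their own copy.
interface Clock {
    long getTime();
}

class MockClock implements Clock {
    private long time;

    MockClock(long startTime) {
        this.time = startTime;
    }

    @Override
    public long getTime() {
        return time;
    }

    // Advance the clock deterministically from a test.
    void tick(long millis) {
        time += millis;
    }
}

public class MockClockDemo {
    public static void main(String[] args) {
        MockClock clock = new MockClock(0);
        clock.tick(1000);
        System.out.println(clock.getTime()); // 1000
    }
}
```

A single mock like this keeps time deterministic across test suites and removes the drift that creeps in when each test maintains its own copy.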
[jira] [Commented] (YARN-896) Roll up for long-lived services in YARN
[ https://issues.apache.org/jira/browse/YARN-896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13864209#comment-13864209 ] Steve Loughran commented on YARN-896: - Link to YARN-1489: umbrella JIRA for work preserving AM restart Roll up for long-lived services in YARN --- Key: YARN-896 URL: https://issues.apache.org/jira/browse/YARN-896 Project: Hadoop YARN Issue Type: New Feature Reporter: Robert Joseph Evans YARN is intended to be general purpose, but it is missing some features to be able to truly support long lived applications and long lived containers. This ticket is intended to # discuss what is needed to support long lived processes # track the resulting JIRA. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1489) [Umbrella] Work-preserving ApplicationMaster restart
[ https://issues.apache.org/jira/browse/YARN-1489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13864212#comment-13864212 ] Steve Loughran commented on YARN-1489: -- regarding the rebinding problem, YARN-913 proposes some registry where we restrict the names of services and apps, and require uniqueness. This lets us register something like (hoya, stevel, accumulo5) and then let a client app look it up. Today we have the list of running apps, and you can find and bind to one, but # there's nothing to stop a single user having >1 instance of the same name # there's no way for an AM to enumerate this as the list operation isn't in the AMRM protocol [Umbrella] Work-preserving ApplicationMaster restart Key: YARN-1489 URL: https://issues.apache.org/jira/browse/YARN-1489 Project: Hadoop YARN Issue Type: Bug Reporter: Vinod Kumar Vavilapalli Assignee: Vinod Kumar Vavilapalli Attachments: Work preserving AM restart.pdf Today if AMs go down, - RM kills all the containers of that ApplicationAttempt - New ApplicationAttempt doesn't know where the previous containers are running - Old running containers don't know where the new AM is running. We need to fix this to enable work-preserving AM restart. The later two potentially can be done at the app level, but it is good to have a common solution for all apps where-ever possible. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
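The registry idea sketched in the comment above, register a service under a unique (app, user, instance) key and reject duplicates, can be modeled as a map with atomic insert-if-absent semantics. Everything below (class name, key layout, API) is illustrative and not the YARN-913 design.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of a service registry enforcing name uniqueness,
// in the spirit of the YARN-913 proposal referenced above. A real
// registry would live in ZooKeeper or similar, not in-process.
class ServiceRegistry {
    private final Map<String, String> entries = new ConcurrentHashMap<>();

    // Returns true if registered; false if the name is already taken.
    // putIfAbsent makes the uniqueness check atomic, so a single user
    // cannot end up with two live instances under the same name.
    boolean register(String app, String user, String instance, String amAddress) {
        String key = app + "/" + user + "/" + instance;
        return entries.putIfAbsent(key, amAddress) == null;
    }

    // Client-side lookup: resolve a registered name to its AM address.
    String lookup(String app, String user, String instance) {
        return entries.get(app + "/" + user + "/" + instance);
    }
}

public class RegistryDemo {
    public static void main(String[] args) {
        ServiceRegistry reg = new ServiceRegistry();
        System.out.println(reg.register("hoya", "stevel", "accumulo5", "host1:8032")); // true
        System.out.println(reg.register("hoya", "stevel", "accumulo5", "host2:8032")); // false
        System.out.println(reg.lookup("hoya", "stevel", "accumulo5"));                 // host1:8032
    }
}
```

This also addresses the enumeration gap: a client or AM can look up a well-known name directly instead of listing all running apps through a protocol that does not offer a list operation.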
[jira] [Commented] (YARN-1560) TestYarnClient#testAMMRTokens fails with null AMRM token
[ https://issues.apache.org/jira/browse/YARN-1560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13864218#comment-13864218 ] Hudson commented on YARN-1560: -- FAILURE: Integrated in Hadoop-Hdfs-trunk #1637 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1637/]) YARN-1560. Fixed TestYarnClient#testAMMRTokens failure with null AMRM token. (Contributed by Ted Yu) (jianhe: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1555975) * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/api/impl/TestYarnClient.java TestYarnClient#testAMMRTokens fails with null AMRM token Key: YARN-1560 URL: https://issues.apache.org/jira/browse/YARN-1560 Project: Hadoop YARN Issue Type: Test Reporter: Ted Yu Assignee: Ted Yu Attachments: yarn-1560-v1.txt, yarn-1560-v2.txt The following can be reproduced locally: {code} testAMMRTokens(org.apache.hadoop.yarn.client.api.impl.TestYarnClient) Time elapsed: 3.341 sec FAILURE! junit.framework.AssertionFailedError: null at junit.framework.Assert.fail(Assert.java:48) at junit.framework.Assert.assertTrue(Assert.java:20) at junit.framework.Assert.assertNotNull(Assert.java:218) at junit.framework.Assert.assertNotNull(Assert.java:211) at org.apache.hadoop.yarn.client.api.impl.TestYarnClient.testAMMRTokens(TestYarnClient.java:382) {code} This test didn't appear in https://builds.apache.org/job/Hadoop-Yarn-trunk/442/consoleFull -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1559) Race between ServerRMProxy and ClientRMProxy setting RMProxy#INSTANCE
[ https://issues.apache.org/jira/browse/YARN-1559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13864220#comment-13864220 ] Hudson commented on YARN-1559: -- FAILURE: Integrated in Hadoop-Hdfs-trunk #1637 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1637/]) YARN-1559. Race between ServerRMProxy and ClientRMProxy setting RMProxy#INSTANCE. (kasha and vinodkv via kasha) (kasha: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1555970) * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/dev-support/findbugs-exclude.xml * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/ClientRMProxy.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/client/RMProxy.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/api/ServerRMProxy.java Race between ServerRMProxy and ClientRMProxy setting RMProxy#INSTANCE - Key: YARN-1559 URL: https://issues.apache.org/jira/browse/YARN-1559 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Affects Versions: 2.4.0 Reporter: Karthik Kambatla Assignee: Karthik Kambatla Priority: Blocker Fix For: 2.4.0 Attachments: YARN-1559-20140105.txt, yarn-1559-1.patch, yarn-1559-2.patch, yarn-1559-3.patch RMProxy#INSTANCE is a non-final static field and both ServerRMProxy and ClientRMProxy set it. This leads to races as witnessed on - YARN-1482. 
Sample trace: {noformat} java.lang.IllegalArgumentException: RM does not support this client protocol at com.google.common.base.Preconditions.checkArgument(Preconditions.java:88) at org.apache.hadoop.yarn.client.ClientRMProxy.checkAllowedProtocols(ClientRMProxy.java:119) at org.apache.hadoop.yarn.client.ConfiguredRMFailoverProxyProvider.init(ConfiguredRMFailoverProxyProvider.java:58) at org.apache.hadoop.yarn.client.RMProxy.createRMFailoverProxyProvider(RMProxy.java:158) at org.apache.hadoop.yarn.client.RMProxy.createRMProxy(RMProxy.java:88) at org.apache.hadoop.yarn.server.api.ServerRMProxy.createRMProxy(ServerRMProxy.java:56) {noformat} -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1029) Allow embedding leader election into the RM
[ https://issues.apache.org/jira/browse/YARN-1029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13864223#comment-13864223 ] Hudson commented on YARN-1029: -- FAILURE: Integrated in Hadoop-Hdfs-trunk #1637 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1637/]) YARN-1029. Added embedded leader election in the ResourceManager. Contributed by Karthik Kambatla. (vinodkv: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1556103) * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/ActiveStandbyElector.java * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/HAUtil.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/proto/server/yarn_server_resourcemanager_service_protos.proto * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/pom.xml * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/TestRMFailover.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/client/RMHAServiceTarget.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/yarn-default.xml * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/AdminService.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/EmbeddedElectorService.java * 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMFatalEvent.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMFatalEventType.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceManager.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/RMStateStore.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/RMStateStoreOperationFailedEvent.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/RMStateStoreOperationFailedEventType.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/ZKRMStateStore.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/MockRM.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestRMHA.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/TestZKRMStateStore.java * 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/TestZKRMStateStoreZKClientConnections.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests/src/test/java/org/apache/hadoop/yarn/server/MiniYARNCluster.java Allow embedding leader election into the RM --- Key: YARN-1029 URL: https://issues.apache.org/jira/browse/YARN-1029 Project: Hadoop YARN Issue Type: Sub-task Reporter: Bikas Saha Assignee: Karthik Kambatla Fix For: 2.4.0 Attachments: embedded-zkfc-approach.patch, yarn-1029-0.patch, yarn-1029-0.patch, yarn-1029-1.patch, yarn-1029-10.patch, yarn-1029-2.patch, yarn-1029-3.patch, yarn-1029-4.patch, yarn-1029-5.patch, yarn-1029-6.patch, yarn-1029-7.patch, yarn-1029-7.patch, yarn-1029-8.patch, yarn-1029-9.patch,
[jira] [Commented] (YARN-1490) RM should optionally not kill all containers when an ApplicationMaster exits
[ https://issues.apache.org/jira/browse/YARN-1490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13864237#comment-13864237 ] Steve Loughran commented on YARN-1490: -- How will the AM get notified of its existing containers? I can't seem to see this in the code. I can see the AM needing to know the following # that it has been restarted with containers retained # the list of the container allocations {{List<Container> liveContainers}}. # the list of containers that failed during the outage. {{List<Container> completedContainers}}. From that I can rebuild my model of the world (using container priorities to map to allocated roles) RM should optionally not kill all containers when an ApplicationMaster exits Key: YARN-1490 URL: https://issues.apache.org/jira/browse/YARN-1490 Project: Hadoop YARN Issue Type: Sub-task Reporter: Vinod Kumar Vavilapalli Assignee: Jian He Attachments: YARN-1490.1.patch, YARN-1490.2.patch, YARN-1490.3.patch This is needed to enable work-preserving AM restart. Some apps can chose to reconnect with old running containers, some may not want to. This should be an option. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
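The rebuild step Steve describes, mapping retained containers back to their roles via the priority they were requested at, amounts to a simple grouping over the liveContainers list. The types below are simplified stand-ins for the YARN API, not the actual classes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of an AM rebuilding its world model after a
// work-preserving restart: request priority encodes the role a
// container was allocated for, so grouping by priority recovers roles.
class ContainerInfo {
    final String id;
    final int priority; // the priority used when the container was requested
    ContainerInfo(String id, int priority) {
        this.id = id;
        this.priority = priority;
    }
}

public class RebuildModelDemo {
    // Group the retained (live) containers by request priority.
    static Map<Integer, List<ContainerInfo>> byRole(List<ContainerInfo> liveContainers) {
        Map<Integer, List<ContainerInfo>> roles = new HashMap<>();
        for (ContainerInfo c : liveContainers) {
            roles.computeIfAbsent(c.priority, k -> new ArrayList<>()).add(c);
        }
        return roles;
    }

    public static void main(String[] args) {
        List<ContainerInfo> live = Arrays.asList(
                new ContainerInfo("c1", 1),
                new ContainerInfo("c2", 2),
                new ContainerInfo("c3", 1));
        // Two containers were allocated at priority 1, one at priority 2.
        System.out.println(byRole(live).get(1).size()); // 2
    }
}
```

The completedContainers list would be processed the same way to decide which roles need replacement requests after the outage.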
[jira] [Commented] (YARN-1560) TestYarnClient#testAMMRTokens fails with null AMRM token
[ https://issues.apache.org/jira/browse/YARN-1560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13864282#comment-13864282 ] Hudson commented on YARN-1560: -- SUCCESS: Integrated in Hadoop-Mapreduce-trunk #1662 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1662/]) YARN-1560. Fixed TestYarnClient#testAMMRTokens failure with null AMRM token. (Contributed by Ted Yu) (jianhe: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1555975) * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/api/impl/TestYarnClient.java TestYarnClient#testAMMRTokens fails with null AMRM token Key: YARN-1560 URL: https://issues.apache.org/jira/browse/YARN-1560 Project: Hadoop YARN Issue Type: Test Reporter: Ted Yu Assignee: Ted Yu Attachments: yarn-1560-v1.txt, yarn-1560-v2.txt The following can be reproduced locally: {code} testAMMRTokens(org.apache.hadoop.yarn.client.api.impl.TestYarnClient) Time elapsed: 3.341 sec FAILURE! junit.framework.AssertionFailedError: null at junit.framework.Assert.fail(Assert.java:48) at junit.framework.Assert.assertTrue(Assert.java:20) at junit.framework.Assert.assertNotNull(Assert.java:218) at junit.framework.Assert.assertNotNull(Assert.java:211) at org.apache.hadoop.yarn.client.api.impl.TestYarnClient.testAMMRTokens(TestYarnClient.java:382) {code} This test didn't appear in https://builds.apache.org/job/Hadoop-Yarn-trunk/442/consoleFull -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1559) Race between ServerRMProxy and ClientRMProxy setting RMProxy#INSTANCE
[ https://issues.apache.org/jira/browse/YARN-1559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13864284#comment-13864284 ] Hudson commented on YARN-1559: -- SUCCESS: Integrated in Hadoop-Mapreduce-trunk #1662 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1662/]) YARN-1559. Race between ServerRMProxy and ClientRMProxy setting RMProxy#INSTANCE. (kasha and vinodkv via kasha) (kasha: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1555970) * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/dev-support/findbugs-exclude.xml * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/ClientRMProxy.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/client/RMProxy.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/api/ServerRMProxy.java Race between ServerRMProxy and ClientRMProxy setting RMProxy#INSTANCE - Key: YARN-1559 URL: https://issues.apache.org/jira/browse/YARN-1559 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Affects Versions: 2.4.0 Reporter: Karthik Kambatla Assignee: Karthik Kambatla Priority: Blocker Fix For: 2.4.0 Attachments: YARN-1559-20140105.txt, yarn-1559-1.patch, yarn-1559-2.patch, yarn-1559-3.patch RMProxy#INSTANCE is a non-final static field and both ServerRMProxy and ClientRMProxy set it. This leads to races as witnessed on - YARN-1482. 
Sample trace:
{noformat}
java.lang.IllegalArgumentException: RM does not support this client protocol
	at com.google.common.base.Preconditions.checkArgument(Preconditions.java:88)
	at org.apache.hadoop.yarn.client.ClientRMProxy.checkAllowedProtocols(ClientRMProxy.java:119)
	at org.apache.hadoop.yarn.client.ConfiguredRMFailoverProxyProvider.init(ConfiguredRMFailoverProxyProvider.java:58)
	at org.apache.hadoop.yarn.client.RMProxy.createRMFailoverProxyProvider(RMProxy.java:158)
	at org.apache.hadoop.yarn.client.RMProxy.createRMProxy(RMProxy.java:88)
	at org.apache.hadoop.yarn.server.api.ServerRMProxy.createRMProxy(ServerRMProxy.java:56)
{noformat}
-- This message was sent by Atlassian JIRA (v6.1.5#6160)
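The trace above comes from a classic check-then-act race on a shared mutable static. The sketch below is a minimal, hypothetical model of that shape (it is not Hadoop's actual RMProxy code): both the "client" and "server" factories publish into one non-final static field, and the protocol check then reads that field back, so a concurrent caller of the other kind can make the check see the wrong proxy. The fix in the patch direction is to stop sharing the field across subclasses.

```java
// Hedged sketch of the YARN-1559 race pattern; class and field names are
// stand-ins, not the real org.apache.hadoop.yarn.client classes.
public class RMProxyRaceSketch {

    // Analogue of the non-final static RMProxy#INSTANCE, written by both
    // the client-side and server-side factories.
    static String instanceKind;

    static String createProxy(String kind) {
        instanceKind = kind;             // step 1: publish this caller's kind
        // ... a concurrent caller of the other kind can overwrite the
        // field right here ...
        return checkAllowedProtocols();  // step 2: validate using the field
    }

    static String checkAllowedProtocols() {
        // In the real bug this check threw IllegalArgumentException
        // ("RM does not support this client protocol") when it observed
        // the other subclass's instance.
        return instanceKind;
    }

    public static void main(String[] args) {
        System.out.println(createProxy("client"));
        System.out.println(createProxy("server"));
    }
}
```

Single-threaded, the bug is invisible (each call sees its own kind); only an interleaving of the two callers produces the mismatch, which is why giving each subclass its own effectively-final instance removes the race.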
[jira] [Commented] (YARN-1029) Allow embedding leader election into the RM
[ https://issues.apache.org/jira/browse/YARN-1029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13864287#comment-13864287 ] Hudson commented on YARN-1029: -- SUCCESS: Integrated in Hadoop-Mapreduce-trunk #1662 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1662/]) YARN-1029. Added embedded leader election in the ResourceManager. Contributed by Karthik Kambatla. (vinodkv: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1556103) * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ha/ActiveStandbyElector.java * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/HAUtil.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/proto/server/yarn_server_resourcemanager_service_protos.proto * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/pom.xml * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/TestRMFailover.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/client/RMHAServiceTarget.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/yarn-default.xml * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/AdminService.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/EmbeddedElectorService.java * 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMFatalEvent.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMFatalEventType.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceManager.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/RMStateStore.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/RMStateStoreOperationFailedEvent.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/RMStateStoreOperationFailedEventType.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/ZKRMStateStore.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/MockRM.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestRMHA.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/TestZKRMStateStore.java * 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/TestZKRMStateStoreZKClientConnections.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests/src/test/java/org/apache/hadoop/yarn/server/MiniYARNCluster.java Allow embedding leader election into the RM --- Key: YARN-1029 URL: https://issues.apache.org/jira/browse/YARN-1029 Project: Hadoop YARN Issue Type: Sub-task Reporter: Bikas Saha Assignee: Karthik Kambatla Fix For: 2.4.0 Attachments: embedded-zkfc-approach.patch, yarn-1029-0.patch, yarn-1029-0.patch, yarn-1029-1.patch, yarn-1029-10.patch, yarn-1029-2.patch, yarn-1029-3.patch, yarn-1029-4.patch, yarn-1029-5.patch, yarn-1029-6.patch, yarn-1029-7.patch, yarn-1029-7.patch, yarn-1029-8.patch,
[jira] [Commented] (YARN-1489) [Umbrella] Work-preserving ApplicationMaster restart
[ https://issues.apache.org/jira/browse/YARN-1489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13864365#comment-13864365 ] Steve Loughran commented on YARN-1489: -- Actually, the simplest way for an AM to work with a restarted cluster would be if there were a blocking operation to list active containers. At startup it could get that list and use it to init its data structures - on a first start the list would be empty. Alternatively, the restart information could be passed down in {{RegisterApplicationMasterResponse}} - which would avoid adding any new RPC calls. [Umbrella] Work-preserving ApplicationMaster restart Key: YARN-1489 URL: https://issues.apache.org/jira/browse/YARN-1489 Project: Hadoop YARN Issue Type: Bug Reporter: Vinod Kumar Vavilapalli Assignee: Vinod Kumar Vavilapalli Attachments: Work preserving AM restart.pdf Today if AMs go down, - RM kills all the containers of that ApplicationAttempt - New ApplicationAttempt doesn't know where the previous containers are running - Old running containers don't know where the new AM is running. We need to fix this to enable work-preserving AM restart. The latter two can potentially be done at the app level, but it is good to have a common solution for all apps wherever possible. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1489) [Umbrella] Work-preserving ApplicationMaster restart
[ https://issues.apache.org/jira/browse/YARN-1489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13864382#comment-13864382 ] Bikas Saha commented on YARN-1489: -- The plan of record (POR) is for the attempt's AM-RM register RPC to return the currently running containers for that app. So when the attempt makes its initial sync with the RM, it will get all that info. [Umbrella] Work-preserving ApplicationMaster restart Key: YARN-1489 URL: https://issues.apache.org/jira/browse/YARN-1489 Project: Hadoop YARN Issue Type: Bug Reporter: Vinod Kumar Vavilapalli Assignee: Vinod Kumar Vavilapalli Attachments: Work preserving AM restart.pdf -- This message was sent by Atlassian JIRA (v6.1.5#6160)
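The flow both comments describe can be sketched as follows. This is a hedged illustration only: {{Container}} and {{RegisterResponse}} below are simplified stand-ins for the real org.apache.hadoop.yarn.api records, and the field name `previousAttemptContainers` is an assumption about the eventual API shape, not the actual one.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Sketch: the AM's first sync with the RM returns the containers still
// running from the previous attempt, and the AM seeds its bookkeeping
// from that list instead of needing a separate listing RPC.
public class AmRestartSketch {

    static class Container {
        final int id;
        Container(int id) { this.id = id; }
    }

    static class RegisterResponse {
        final List<Container> previousAttemptContainers; // assumed field name
        RegisterResponse(List<Container> containers) {
            this.previousAttemptContainers = containers;
        }
    }

    final List<Container> liveContainers = new ArrayList<>();

    // Called once after registration: on a first start the list is empty;
    // after an AM restart it preserves the old containers' work.
    void onRegistered(RegisterResponse response) {
        liveContainers.addAll(response.previousAttemptContainers);
    }

    public static void main(String[] args) {
        AmRestartSketch am = new AmRestartSketch();
        am.onRegistered(new RegisterResponse(Collections.emptyList()));
        System.out.println("first start, live containers: "
            + am.liveContainers.size());
    }
}
```

Folding the list into the register response, as the comments suggest, keeps the protocol unchanged for first starts (the list is simply empty) and avoids a new blocking RPC.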
[jira] [Commented] (YARN-1531) Update yarn command document
[ https://issues.apache.org/jira/browse/YARN-1531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13864390#comment-13864390 ] Karthik Kambatla commented on YARN-1531: [~ajisakaa], thanks for taking this up. Even though formatting the code to 80 chars per line is a good thing, it is probably better to limit those formatting changes to the actual text being changed. We can create a separate JIRA just for the formatting. Update yarn command document Key: YARN-1531 URL: https://issues.apache.org/jira/browse/YARN-1531 Project: Hadoop YARN Issue Type: Bug Components: documentation Reporter: Akira AJISAKA Assignee: Akira AJISAKA Labels: documentaion Attachments: YARN-1531.patch There are some options which are not written in the Yarn Command document. For example, the yarn rmadmin command options are as follows:
{code}
Usage: yarn rmadmin
   -refreshQueues
   -refreshNodes
   -refreshSuperUserGroupsConfiguration
   -refreshUserToGroupsMappings
   -refreshAdminAcls
   -refreshServiceAcl
   -getGroups [username]
   -help [cmd]
   -transitionToActive <serviceId>
   -transitionToStandby <serviceId>
   -failover [--forcefence] [--forceactive] <serviceId> <serviceId>
   -getServiceState <serviceId>
   -checkHealth <serviceId>
{code}
But some of the new options such as -getGroups, -transitionToActive, and -transitionToStandby are not documented. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1409) NonAggregatingLogHandler can throw RejectedExecutionException
[ https://issues.apache.org/jira/browse/YARN-1409?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13864392#comment-13864392 ] Jason Lowe commented on YARN-1409: -- +1 lgtm, committing this. A minor nit is that the org.junit.Assert import that was added to the test is unnecessary. Will clean this up during the commit. NonAggregatingLogHandler can throw RejectedExecutionException - Key: YARN-1409 URL: https://issues.apache.org/jira/browse/YARN-1409 Project: Hadoop YARN Issue Type: Bug Reporter: Tsuyoshi OZAWA Assignee: Tsuyoshi OZAWA Attachments: YARN-1409.1.patch, YARN-1409.2.patch, YARN-1409.3.patch This problem is caused by handling APPLICATION_FINISHED events after calling sched.shutdown() in NonAggregatingLogHandler#serviceStop(). org.apache.hadoop.mapred.TestJobCleanup can fail because of a RejectedExecutionException from NonAggregatingLogHandler.
{code}
2013-11-13 10:53:06,970 FATAL [AsyncDispatcher event handler] event.AsyncDispatcher (AsyncDispatcher.java:dispatch(166)) - Error in dispatcher thread
java.util.concurrent.RejectedExecutionException: Task java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask@d51df63 rejected from java.util.concurrent.ScheduledThreadPoolExecutor@7a20e369[Shutting down, pool size = 4, active threads = 0, queued tasks = 7, completed tasks = 0]
	at java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2048)
	at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:821)
	at java.util.concurrent.ScheduledThreadPoolExecutor.delayedExecute(ScheduledThreadPoolExecutor.java:325)
	at java.util.concurrent.ScheduledThreadPoolExecutor.schedule(ScheduledThreadPoolExecutor.java:530)
	at org.apache.hadoop.yarn.server.nodemanager.containermanager.loghandler.NonAggregatingLogHandler.handle(NonAggregatingLogHandler.java:121)
	at org.apache.hadoop.yarn.server.nodemanager.containermanager.loghandler.NonAggregatingLogHandler.handle(NonAggregatingLogHandler.java:49)
	at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:159)
	at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:95)
	at java.lang.Thread.run(Thread.java:724)
{code}
-- This message was sent by Atlassian JIRA (v6.1.5#6160)
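The failure mode in the trace is inherent to {{ScheduledThreadPoolExecutor}}: once {{shutdown()}} has been called, a later {{schedule()}} is rejected via {{AbortPolicy}} and throws {{RejectedExecutionException}}, which here escaped into the dispatcher thread. The sketch below shows the general guard for late events; it illustrates the pattern, not necessarily the exact shape of the committed patch.

```java
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Sketch: tolerate log-deletion events that arrive after serviceStop()
// has shut the scheduler down, instead of letting the rejection kill
// the dispatcher thread.
public class LogDeletionSketch {

    // Returns true if the deletion task was accepted, false if the
    // executor was already shut down.
    static boolean scheduleDeletion(ScheduledThreadPoolExecutor sched,
                                    Runnable task, long delayMs) {
        try {
            sched.schedule(task, delayMs, TimeUnit.MILLISECONDS);
            return true;
        } catch (RejectedExecutionException e) {
            // An APPLICATION_FINISHED event raced with shutdown(); drop it.
            return false;
        }
    }

    public static void main(String[] args) {
        ScheduledThreadPoolExecutor sched = new ScheduledThreadPoolExecutor(1);
        System.out.println(scheduleDeletion(sched, () -> {}, 1)); // true
        sched.shutdown();
        System.out.println(scheduleDeletion(sched, () -> {}, 1)); // false
    }
}
```

Catching the exception (rather than only checking {{isShutdown()}} first) matters because shutdown can race in between a check and the {{schedule()}} call.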
[jira] [Commented] (YARN-1409) NonAggregatingLogHandler can throw RejectedExecutionException
[ https://issues.apache.org/jira/browse/YARN-1409?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13864417#comment-13864417 ] Hudson commented on YARN-1409: -- SUCCESS: Integrated in Hadoop-trunk-Commit #4967 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/4967/]) YARN-1409. NonAggregatingLogHandler can throw RejectedExecutionException. Contributed by Tsuyoshi OZAWA (jlowe: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1556282) * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/loghandler/NonAggregatingLogHandler.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/loghandler/TestNonAggregatingLogHandler.java NonAggregatingLogHandler can throw RejectedExecutionException - Key: YARN-1409 URL: https://issues.apache.org/jira/browse/YARN-1409 Project: Hadoop YARN Issue Type: Bug Reporter: Tsuyoshi OZAWA Assignee: Tsuyoshi OZAWA Fix For: 2.4.0 Attachments: YARN-1409.1.patch, YARN-1409.2.patch, YARN-1409.3.patch This problem is caused by handling APPLICATION_FINISHED events after calling sched.shutdown() in NonAggregatingLogHandler#serviceStop(). org.apache.hadoop.mapred.TestJobCleanup can fail because of a RejectedExecutionException from NonAggregatingLogHandler.
-- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1409) NonAggregatingLogHandler can throw RejectedExecutionException
[ https://issues.apache.org/jira/browse/YARN-1409?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13864447#comment-13864447 ] Tsuyoshi OZAWA commented on YARN-1409: -- Thank you, Jason! NonAggregatingLogHandler can throw RejectedExecutionException - Key: YARN-1409 URL: https://issues.apache.org/jira/browse/YARN-1409 Project: Hadoop YARN Issue Type: Bug Reporter: Tsuyoshi OZAWA Assignee: Tsuyoshi OZAWA Fix For: 2.4.0 Attachments: YARN-1409.1.patch, YARN-1409.2.patch, YARN-1409.3.patch This problem is caused by handling APPLICATION_FINISHED events after calling sched.shutdown() in NonAggregatingLogHandler#serviceStop(). org.apache.hadoop.mapred.TestJobCleanup can fail because of a RejectedExecutionException from NonAggregatingLogHandler. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1293) TestContainerLaunch.testInvalidEnvSyntaxDiagnostics fails on trunk
[ https://issues.apache.org/jira/browse/YARN-1293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13864481#comment-13864481 ] Tsuyoshi OZAWA commented on YARN-1293: -- Thank you for the comment, Akira. [~jianhe], can you merge the latest patch? TestContainerLaunch.testInvalidEnvSyntaxDiagnostics fails on trunk -- Key: YARN-1293 URL: https://issues.apache.org/jira/browse/YARN-1293 Project: Hadoop YARN Issue Type: Bug Environment: linux Reporter: Tsuyoshi OZAWA Assignee: Tsuyoshi OZAWA Fix For: 2.3.0 Attachments: YARN-1293.1.patch
{quote}
-------------------------------------------------------------------------------
Test set: org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.TestContainerLaunch
-------------------------------------------------------------------------------
Tests run: 8, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 12.655 sec <<< FAILURE! - in org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.TestContainerLaunch
testInvalidEnvSyntaxDiagnostics(org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.TestContainerLaunch) Time elapsed: 0.114 sec <<< FAILURE!
junit.framework.AssertionFailedError: null
	at junit.framework.Assert.fail(Assert.java:48)
	at junit.framework.Assert.assertTrue(Assert.java:20)
	at junit.framework.Assert.assertTrue(Assert.java:27)
	at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.TestContainerLaunch.testInvalidEnvSyntaxDiagnostics(TestContainerLaunch.java:273)
{quote}
-- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1326) RM should log using RMStore at startup time
[ https://issues.apache.org/jira/browse/YARN-1326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13864473#comment-13864473 ] Tsuyoshi OZAWA commented on YARN-1326: -- I assume that a user can forget to configure the RMStateStore and end up using an unexpected RMStateStore, because MemoryRMStateStore is the default value in RMStateStoreFactory#getStore().
{code}
public class RMStateStoreFactory {
  public static RMStateStore getStore(Configuration conf) {
    RMStateStore store = ReflectionUtils.newInstance(
        conf.getClass(YarnConfiguration.RM_STORE,
            MemoryRMStateStore.class, RMStateStore.class),
        conf);
    return store;
  }
}
{code}
RM should log using RMStore at startup time --- Key: YARN-1326 URL: https://issues.apache.org/jira/browse/YARN-1326 Project: Hadoop YARN Issue Type: Sub-task Reporter: Tsuyoshi OZAWA Assignee: Tsuyoshi OZAWA Attachments: YARN-1326.1.patch Original Estimate: 3h Remaining Estimate: 3h Currently there is no way to know which RMStateStore the RM uses. It's useful to log this information at the RM's startup time. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
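The concern in the comment is that a reflective factory with a class-valued default fails silently: an unset key simply yields the in-memory store, so the only visible signal would be a startup log line naming the concrete class, which is what this JIRA asks for. A minimal sketch of that behaviour (the key and class names below are illustrative stand-ins, not the real YARN configuration constants):

```java
import java.util.HashMap;
import java.util.Map;

// Sketch: mirrors conf.getClass(YarnConfiguration.RM_STORE,
// MemoryRMStateStore.class, ...) -- an unset key silently selects the
// default, so the chosen class should be logged at startup.
public class StoreFactorySketch {

    static String getStoreClass(Map<String, String> conf) {
        return conf.getOrDefault("rm.store.class", "MemoryRMStateStore");
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        // What YARN-1326 proposes: make the choice visible to operators.
        System.out.println("Using RMStateStore: " + getStoreClass(conf));
        conf.put("rm.store.class", "ZKRMStateStore");
        System.out.println("Using RMStateStore: " + getStoreClass(conf));
    }
}
```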
[jira] [Commented] (YARN-1410) Handle client failover during 2 step client API's like app submission
[ https://issues.apache.org/jira/browse/YARN-1410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13864516#comment-13864516 ] Xuan Gong commented on YARN-1410: - Had an offline discussion with Bikas and Vinod. The approach we will use is to make the RM accept the appId in the context. Assume that RM1 assigns an applicationId, say Application_12345_1, and the failover happens before the app is accepted. Now RM2 becomes active, and RM2 will re-use the same applicationId Application_12345_1 (instead of assigning a new appId) for submitApplication. Handle client failover during 2 step client API's like app submission - Key: YARN-1410 URL: https://issues.apache.org/jira/browse/YARN-1410 Project: Hadoop YARN Issue Type: Sub-task Reporter: Bikas Saha Assignee: Xuan Gong Attachments: YARN-1410.1.patch App submission involves 1) creating an appId and 2) using that appId to submit an ApplicationSubmissionContext to the RM. The client may have obtained an appId from an RM, the RM may have failed over, and the client may submit the app to the new RM. Since the new RM has a different notion of cluster timestamp (used to create app ids), the new RM may reject the app submission, resulting in unexpected failure on the client side. The same may happen for other 2-step client API operations. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
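The agreed two-step flow can be sketched as follows. This is a hedged model of the idea only: the {{RM}} class and its method names below are simplified stand-ins for the real protocol records, not the actual YARN API.

```java
import java.util.HashSet;
import java.util.Set;

// Sketch: the client carries the appId it got from RM1 into the
// submission context, and the newly active RM2 accepts that id even
// though RM2 did not mint it (its own clusterTimestamp differs).
public class SubmitFailoverSketch {

    static class RM {
        final long clusterTimestamp;   // embedded in every appId this RM mints
        int sequence = 0;
        final Set<String> accepted = new HashSet<>();

        RM(long clusterTimestamp) { this.clusterTimestamp = clusterTimestamp; }

        String getNewApplicationId() {            // step 1
            return "application_" + clusterTimestamp + "_" + (++sequence);
        }

        // The fix: accept the id from the context instead of rejecting
        // ids whose embedded timestamp is not this RM's own.
        void submitApplication(String appId) {    // step 2
            accepted.add(appId);
        }
    }

    public static void main(String[] args) {
        RM rm1 = new RM(12345L);
        String appId = rm1.getNewApplicationId(); // step 1 against RM1
        RM rm2 = new RM(99999L);                  // failover before step 2
        rm2.submitApplication(appId);             // step 2 against RM2
        System.out.println(rm2.accepted.contains(appId)); // true
    }
}
```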
[jira] [Commented] (YARN-1293) TestContainerLaunch.testInvalidEnvSyntaxDiagnostics fails on trunk
[ https://issues.apache.org/jira/browse/YARN-1293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13864530#comment-13864530 ] Jian He commented on YARN-1293: --- Thanks Akira for verifying, +1, committing it. TestContainerLaunch.testInvalidEnvSyntaxDiagnostics fails on trunk -- Key: YARN-1293 URL: https://issues.apache.org/jira/browse/YARN-1293 Project: Hadoop YARN Issue Type: Bug Environment: linux Reporter: Tsuyoshi OZAWA Assignee: Tsuyoshi OZAWA Fix For: 2.3.0 Attachments: YARN-1293.1.patch -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1410) Handle client failover during 2 step client API's like app submission
[ https://issues.apache.org/jira/browse/YARN-1410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13864540#comment-13864540 ] Karthik Kambatla commented on YARN-1410: That makes sense. I am on board too. Handle client failover during 2 step client API's like app submission - Key: YARN-1410 URL: https://issues.apache.org/jira/browse/YARN-1410 Project: Hadoop YARN Issue Type: Sub-task Reporter: Bikas Saha Assignee: Xuan Gong Attachments: YARN-1410.1.patch -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1492) truly shared cache for jars (jobjar/libjar)
[ https://issues.apache.org/jira/browse/YARN-1492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13864557#comment-13864557 ] Sangjin Lee commented on YARN-1492: --- Thanks for the comments [~kkambatl]! bq. In the client protocol, if a cleaner instance (or run) starts after R2 and before R2', the client wouldn't know of this cleaner's existence. That's why step R1 exists. Since the client lock is dropped *before* the client inspects the cleaner lock, even if the cleaner starts between R2 and R2' the cleaner simply skips this entry in favor of the client. Having said that, we are currently looking at the design again to better address the issue of security and other aspects. So it is likely some of these design choices may be revisited. truly shared cache for jars (jobjar/libjar) --- Key: YARN-1492 URL: https://issues.apache.org/jira/browse/YARN-1492 Project: Hadoop YARN Issue Type: New Feature Affects Versions: 2.0.4-alpha Reporter: Sangjin Lee Assignee: Sangjin Lee Attachments: shared_cache_design.pdf, shared_cache_design_v2.pdf, shared_cache_design_v3.pdf, shared_cache_design_v4.pdf Currently there is the distributed cache that enables you to cache jars and files so that attempts from the same job can reuse them. However, sharing is limited with the distributed cache because it is normally on a per-job basis. On a large cluster, sometimes copying of jobjars and libjars becomes so prevalent that it consumes a large portion of the network bandwidth, not to speak of defeating the purpose of bringing compute to where data is. This is wasteful because in most cases code doesn't change much across many jobs. I'd like to propose and discuss feasibility of introducing a truly shared cache so that multiple jobs from multiple users can share and cache jars. This JIRA is to open the discussion. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1293) TestContainerLaunch.testInvalidEnvSyntaxDiagnostics fails on trunk
[ https://issues.apache.org/jira/browse/YARN-1293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13864560#comment-13864560 ] Jian He commented on YARN-1293: --- Committed to trunk and branch-2, thanks Tsuyoshi! TestContainerLaunch.testInvalidEnvSyntaxDiagnostics fails on trunk -- Key: YARN-1293 URL: https://issues.apache.org/jira/browse/YARN-1293 Project: Hadoop YARN Issue Type: Bug Environment: linux Reporter: Tsuyoshi OZAWA Assignee: Tsuyoshi OZAWA Fix For: 2.3.0 Attachments: YARN-1293.1.patch -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (YARN-1520) update capacity scheduler docs to include necessary parameters
[ https://issues.apache.org/jira/browse/YARN-1520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chen He updated YARN-1520: -- Attachment: yarn-1520 update capacity scheduler docs to include necessary parameters -- Key: YARN-1520 URL: https://issues.apache.org/jira/browse/YARN-1520 Project: Hadoop YARN Issue Type: Bug Affects Versions: 3.0.0, 2.1.0-beta, 0.23.9 Reporter: Chen He Assignee: Chen He Labels: documentation, newbie Attachments: yarn-1520 -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1293) TestContainerLaunch.testInvalidEnvSyntaxDiagnostics fails on trunk
[ https://issues.apache.org/jira/browse/YARN-1293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13864566#comment-13864566 ] Hudson commented on YARN-1293: -- SUCCESS: Integrated in Hadoop-trunk-Commit #4968 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/4968/]) YARN-1293. Fixed TestContainerLaunch#testInvalidEnvSyntaxDiagnostics failure caused by non-English system locale. Contributed by Tsuyoshi OZAWA. (jianhe: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1556318) * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/launcher/TestContainerLaunch.java TestContainerLaunch.testInvalidEnvSyntaxDiagnostics fails on trunk -- Key: YARN-1293 URL: https://issues.apache.org/jira/browse/YARN-1293 Project: Hadoop YARN Issue Type: Bug Environment: linux Reporter: Tsuyoshi OZAWA Assignee: Tsuyoshi OZAWA Fix For: 2.3.0 Attachments: YARN-1293.1.patch -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1490) RM should optionally not kill all containers when an ApplicationMaster exits
[ https://issues.apache.org/jira/browse/YARN-1490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13864571#comment-13864571 ] Jian He commented on YARN-1490: --- bq. the list of containers that failed during the outage. List<Container> completedContainers. The RMAppImpl.AttemptFailedTransition transition is retrieving those. bq. the list of the container allocations List<Container> liveContainers. SchedulerApplicationAttempt.recover() Beyond this patch, there's a further AM protocol change patch; I have a local patch and will upload it once this gets in. RM should optionally not kill all containers when an ApplicationMaster exits Key: YARN-1490 URL: https://issues.apache.org/jira/browse/YARN-1490 Project: Hadoop YARN Issue Type: Sub-task Reporter: Vinod Kumar Vavilapalli Assignee: Jian He Attachments: YARN-1490.1.patch, YARN-1490.2.patch, YARN-1490.3.patch This is needed to enable work-preserving AM restart. Some apps can choose to reconnect with old running containers; some may not want to. This should be an option. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
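The split the comment points at can be sketched as follows. This is a hedged illustration: the types and the `recover` method below are simplified stand-ins for RMAppImpl / SchedulerApplicationAttempt, not the real classes.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: when an attempt fails, containers still running stay "live"
// for the next attempt (work preserved), while containers that finished
// during the outage are surfaced as completed.
public class AttemptRecoverySketch {

    static class Container {
        final int id;
        final boolean finished;
        Container(int id, boolean finished) {
            this.id = id;
            this.finished = finished;
        }
    }

    final List<Container> liveContainers = new ArrayList<>();      // kept running
    final List<Container> completedContainers = new ArrayList<>(); // reported back

    void recover(List<Container> previousAttempt) {
        for (Container c : previousAttempt) {
            if (c.finished) {
                completedContainers.add(c);
            } else {
                liveContainers.add(c);
            }
        }
    }

    public static void main(String[] args) {
        AttemptRecoverySketch app = new AttemptRecoverySketch();
        app.recover(java.util.Arrays.asList(
            new Container(1, false), new Container(2, true), new Container(3, false)));
        System.out.println("live=" + app.liveContainers.size()
            + " completed=" + app.completedContainers.size()); // live=2 completed=1
    }
}
```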
[jira] [Commented] (YARN-1293) TestContainerLaunch.testInvalidEnvSyntaxDiagnostics fails on trunk
[ https://issues.apache.org/jira/browse/YARN-1293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13864574#comment-13864574 ] Tsuyoshi OZAWA commented on YARN-1293: -- Thanks Jian! TestContainerLaunch.testInvalidEnvSyntaxDiagnostics fails on trunk -- Key: YARN-1293 URL: https://issues.apache.org/jira/browse/YARN-1293 Project: Hadoop YARN Issue Type: Bug Environment: linux Reporter: Tsuyoshi OZAWA Assignee: Tsuyoshi OZAWA Fix For: 2.3.0 Attachments: YARN-1293.1.patch -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1410) Handle client failover during 2 step client API's like app submission
[ https://issues.apache.org/jira/browse/YARN-1410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13864578#comment-13864578 ] Jian He commented on YARN-1410: --- Is it possible for RM2 to already have an existing conflicting applicationId compared to the one from RM1? Handle client failover during 2 step client API's like app submission - Key: YARN-1410 URL: https://issues.apache.org/jira/browse/YARN-1410 Project: Hadoop YARN Issue Type: Sub-task Reporter: Bikas Saha Assignee: Xuan Gong Attachments: YARN-1410.1.patch App submission involves 1) creating appId 2) using that appId to submit an ApplicationSubmissionContext to the RM. The client may have obtained an appId from an RM, the RM may have failed over, and the client may submit the app to the new RM. Since the new RM has a different notion of cluster timestamp (used to create app id) the new RM may reject the app submission resulting in unexpected failure on the client side. The same may happen for other 2 step client API operations. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
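Jian He's question hinges on how application IDs are built: an ID pairs the RM's cluster timestamp with a per-RM sequence number, so two RMs started at different times cannot mint equal IDs — the failover problem is rejection of an unrecognized timestamp, not collision. The class below is a simplified, illustrative stand-in for YARN's `ApplicationId`, not the real implementation.

```java
import java.util.Objects;

// Simplified stand-in for org.apache.hadoop.yarn.api.records.ApplicationId:
// an app id is the pair (clusterTimestamp, sequence number).
final class AppId {
    final long clusterTimestamp; // derived from the RM's start time
    final int id;                // per-RM monotonically increasing sequence

    AppId(long clusterTimestamp, int id) {
        this.clusterTimestamp = clusterTimestamp;
        this.id = id;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof AppId)) return false;
        AppId other = (AppId) o;
        return clusterTimestamp == other.clusterTimestamp && id == other.id;
    }

    @Override
    public int hashCode() {
        return Objects.hash(clusterTimestamp, id);
    }
}

public class AppIdDemo {
    public static void main(String[] args) {
        AppId fromRm1 = new AppId(1000L, 1); // minted by RM1 before failover
        AppId fromRm2 = new AppId(2000L, 1); // same sequence number on RM2
        // Different cluster timestamps keep the ids distinct, so RM2 cannot
        // already hold a conflicting id -- but it also won't recognize RM1's.
        System.out.println(fromRm1.equals(fromRm2)); // prints false
    }
}
```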
[jira] [Commented] (YARN-1520) update capacity scheduler docs to include necessary parameters
[ https://issues.apache.org/jira/browse/YARN-1520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13864590#comment-13864590 ] Hadoop QA commented on YARN-1520: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12621835/yarn-1520 against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+0 tests included{color}. The patch appears to be a documentation patch that doesn't require tests. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/2814//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2814//console This message is automatically generated. update capacity scheduler docs to include necessary parameters -- Key: YARN-1520 URL: https://issues.apache.org/jira/browse/YARN-1520 Project: Hadoop YARN Issue Type: Bug Affects Versions: 3.0.0, 2.1.0-beta, 0.23.9 Reporter: Chen He Assignee: Chen He Labels: documentation, newbie Attachments: yarn-1520 -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1461) RM API and RM changes to handle tags for running jobs
[ https://issues.apache.org/jira/browse/YARN-1461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13864733#comment-13864733 ] Karthik Kambatla commented on YARN-1461: Thanks for taking a look, [~zjshen]. bq. How about making the two constants configurable? As discussed earlier on YARN-1399, I think we should leave them as constants for now and create configs when we really think we need them. bq. Should ApplicationSubmissionContext#newInstance have String[] tags as well? Same for ApplicationReport and GetApplicationsRequest. Or you didn't do it on purpose for sake of compatibility? If so, I'm just feeling we're going to have more newInstance methods that cannot cover all the fields the objects should have. Intentionally left them out. IMO, there should be a single newInstance method to create the instance, and then setters should be used to actually set the fields (builder pattern). bq. Should we consider both case-sensitive and -insensitive, and both AND and OR logic? It would be unnecessarily complicating things. Again, as people have suggested on YARN-1399, case-insensitive matching and OR logic should address most cases, at least as a first cut; users can handle the AND themselves. We can support AND in the future. RM API and RM changes to handle tags for running jobs - Key: YARN-1461 URL: https://issues.apache.org/jira/browse/YARN-1461 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Affects Versions: 2.2.0 Reporter: Karthik Kambatla Assignee: Karthik Kambatla Attachments: yarn-1461-1.patch, yarn-1461-2.patch, yarn-1461-3.patch, yarn-1461-4.patch, yarn-1461-5.patch, yarn-1461-6.patch, yarn-1461-6.patch, yarn-1461-7.patch -- This message was sent by Atlassian JIRA (v6.1.5#6160)
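The matching semantics Karthik describes — case-insensitive, OR logic — amount to checking for a non-empty intersection of lower-cased tag sets. A minimal sketch of that behavior; the class and method names here are illustrative, not YARN's actual filtering code.

```java
import java.util.Locale;
import java.util.Set;
import java.util.stream.Collectors;

public class TagMatcher {
    // OR semantics: an app matches if it carries at least one of the
    // requested tags; comparison is case-insensitive.
    public static boolean matches(Set<String> appTags, Set<String> queryTags) {
        Set<String> lowered = appTags.stream()
            .map(t -> t.toLowerCase(Locale.ROOT))
            .collect(Collectors.toSet());
        return queryTags.stream()
            .map(t -> t.toLowerCase(Locale.ROOT))
            .anyMatch(lowered::contains);
    }
}
```

A caller wanting AND semantics can simply intersect the results of several OR queries client-side, which is why deferring native AND support is workable.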
[jira] [Updated] (YARN-1461) RM API and RM changes to handle tags for running jobs
[ https://issues.apache.org/jira/browse/YARN-1461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla updated YARN-1461: --- Attachment: yarn-1461-8.patch New patch includes a new field in GetApplicationsRequest - Scope - to capture the scope of apps to be returned, the default value being OWN apps. For compatibility with 2.2, I have updated newInstance() methods to set it to ALL apps. RM API and RM changes to handle tags for running jobs - Key: YARN-1461 URL: https://issues.apache.org/jira/browse/YARN-1461 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Affects Versions: 2.2.0 Reporter: Karthik Kambatla Assignee: Karthik Kambatla Attachments: yarn-1461-1.patch, yarn-1461-2.patch, yarn-1461-3.patch, yarn-1461-4.patch, yarn-1461-5.patch, yarn-1461-6.patch, yarn-1461-6.patch, yarn-1461-7.patch, yarn-1461-8.patch -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-888) clean up POM dependencies
[ https://issues.apache.org/jira/browse/YARN-888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13864748#comment-13864748 ] Alejandro Abdelnur commented on YARN-888: - [~rvs], mind if I take this JIRA from you? (besides being a cool JIRA number to own) The current dependencies in the YARN POMs are breaking IntelliJ integration, which is kind of driving me crazy, so I took a stab at it this morning and have a working patch. clean up POM dependencies - Key: YARN-888 URL: https://issues.apache.org/jira/browse/YARN-888 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.1.0-beta Reporter: Alejandro Abdelnur Assignee: Roman Shaposhnik Intermediate 'pom' modules define dependencies inherited by leaf modules. This is causing issues in the IntelliJ IDE. We should normalize the leaf modules like in common, hdfs and tools, where all dependencies are defined in each leaf module and the intermediate 'pom' modules do not define any dependency. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1482) WebApplicationProxy should be always-on w.r.t HA even if it is embedded in the RM
[ https://issues.apache.org/jira/browse/YARN-1482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13864754#comment-13864754 ] Vinod Kumar Vavilapalli commented on YARN-1482: --- +1, looks good. Checking this in. WebApplicationProxy should be always-on w.r.t HA even if it is embedded in the RM - Key: YARN-1482 URL: https://issues.apache.org/jira/browse/YARN-1482 Project: Hadoop YARN Issue Type: Sub-task Reporter: Vinod Kumar Vavilapalli Assignee: Xuan Gong Attachments: YARN-1482.1.patch, YARN-1482.2.patch, YARN-1482.3.patch, YARN-1482.4.patch, YARN-1482.4.patch, YARN-1482.5.patch, YARN-1482.5.patch, YARN-1482.6.patch This way, even if an RM goes to standby mode, we can effect a redirect to the active. And more importantly, users will not suddenly see all their links stop working. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1506) Replace set resource change on RMNode/SchedulerNode directly with event notification.
[ https://issues.apache.org/jira/browse/YARN-1506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13864759#comment-13864759 ] Jian He commented on YARN-1506: --- - Instead of the check here, I think we can send the event and make RMNodeTransition ignore this event. This can prevent the case where isUnusable returns true right before the node is about to become usable, since the events will be processed sequentially. {code} else if (node.getState().isUnusable()) { LOG.warn("Resource update get failed on an unusable node: " + nodeId); {code} - Do we have an overall test for AdminService sending the request and verifying that the RMNode and SchedulerNode are changed accordingly? Patch looks mostly good to me; [~bikassaha]/ [~vinodkv], you may also want to take a look. Replace set resource change on RMNode/SchedulerNode directly with event notification. - Key: YARN-1506 URL: https://issues.apache.org/jira/browse/YARN-1506 Project: Hadoop YARN Issue Type: Sub-task Components: nodemanager, scheduler Reporter: Junping Du Assignee: Junping Du Priority: Blocker Attachments: YARN-1506-v1.patch, YARN-1506-v2.patch, YARN-1506-v3.patch, YARN-1506-v4.patch, YARN-1506-v5.patch According to Vinod's comments on YARN-312 (https://issues.apache.org/jira/browse/YARN-312?focusedCommentId=13846087page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13846087), we should replace RMNode.setResourceOption() with some resource change event. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
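Jian He's argument is that because the RM dispatcher processes events one at a time, moving the "is the node unusable?" decision into the node's own transition removes the race between checking state and applying the update. A toy single-threaded dispatcher illustrates the idea; all names here are illustrative, not the actual RMNode code.

```java
import java.util.ArrayDeque;
import java.util.Queue;

public class NodeEventDemo {
    enum State { RUNNING, UNUSABLE }

    static class Node {
        State state = State.RUNNING;
        int memoryMb = 4096;

        // The transition itself decides whether to apply the update, so the
        // decision is always made against the node's state at handling time,
        // not at the time the event was sent.
        void onResourceUpdate(int newMemoryMb) {
            if (state == State.UNUSABLE) {
                System.out.println("ignoring resource update on unusable node");
                return;
            }
            memoryMb = newMemoryMb;
        }
    }

    public static void main(String[] args) {
        Node node = new Node();
        Queue<Runnable> dispatcher = new ArrayDeque<>(); // events run in order
        dispatcher.add(() -> node.state = State.UNUSABLE); // node goes bad first
        dispatcher.add(() -> node.onResourceUpdate(8192)); // then update arrives
        while (!dispatcher.isEmpty()) dispatcher.poll().run();
        System.out.println(node.memoryMb); // still 4096: update was ignored
    }
}
```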
[jira] [Commented] (YARN-1482) WebApplicationProxy should be always-on w.r.t HA even if it is embedded in the RM
[ https://issues.apache.org/jira/browse/YARN-1482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13864763#comment-13864763 ] Hudson commented on YARN-1482: -- SUCCESS: Integrated in Hadoop-trunk-Commit #4970 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/4970/]) YARN-1482. Modified WebApplicationProxy to make it work across ResourceManager fail-over. Contributed by Xuan Gong. (vinodkv: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1556380) * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/ClientRMProxy.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/TestRMFailover.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/client/ClientRMProxy.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceManager.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-web-proxy/src/main/java/org/apache/hadoop/yarn/server/webproxy/AppReportFetcher.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-web-proxy/src/main/java/org/apache/hadoop/yarn/server/webproxy/WebAppProxy.java WebApplicationProxy should be always-on w.r.t HA even if it is embedded in the RM - Key: YARN-1482 URL: https://issues.apache.org/jira/browse/YARN-1482 Project: Hadoop YARN Issue Type: Sub-task Reporter: Vinod Kumar Vavilapalli Assignee: Xuan Gong Fix For: 2.4.0 Attachments: YARN-1482.1.patch, YARN-1482.2.patch, YARN-1482.3.patch, YARN-1482.4.patch, YARN-1482.4.patch, YARN-1482.5.patch, YARN-1482.5.patch, YARN-1482.6.patch This way, even if an RM goes to standby mode, we can 
effect a redirect to the active. And more importantly, users will not suddenly see all their links stop working. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1399) Allow users to annotate an application with multiple tags
[ https://issues.apache.org/jira/browse/YARN-1399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13864769#comment-13864769 ] Sandy Ryza commented on YARN-1399: -- Adding a field to GetApplicationsRequest whose default limits what's returned by GetApplicationsRequest would be an incompatible change. -1 to that. Allow users to annotate an application with multiple tags - Key: YARN-1399 URL: https://issues.apache.org/jira/browse/YARN-1399 Project: Hadoop YARN Issue Type: Improvement Reporter: Zhijie Shen Assignee: Zhijie Shen Nowadays, when submitting an application, users can fill the applicationType field to facilitate searching it later. IMHO, it's good to accept multiple tags to allow users to describe their applications in multiple aspects, including the application type. Then, searching by tags may be more efficient for users to reach their desired application collection. It's pretty much like the tag system of online photo/video/music and etc. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1568) Rename clusterid to clusterId in ActiveRMInfoProto
[ https://issues.apache.org/jira/browse/YARN-1568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13864775#comment-13864775 ] Sandy Ryza commented on YARN-1568: -- +1 Rename clusterid to clusterId in ActiveRMInfoProto --- Key: YARN-1568 URL: https://issues.apache.org/jira/browse/YARN-1568 Project: Hadoop YARN Issue Type: Task Components: resourcemanager Affects Versions: 2.4.0 Reporter: Karthik Kambatla Assignee: Karthik Kambatla Priority: Trivial Attachments: yarn-1568-1.patch YARN-1029 introduces ActiveRMInfoProto - just realized it defines a field clusterid, which is inconsistent with other fields. Better to fix it immediately than leave the inconsistency. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (YARN-1496) Protocol additions to allow moving apps between queues
[ https://issues.apache.org/jira/browse/YARN-1496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sandy Ryza updated YARN-1496: - Attachment: YARN-1496-4.patch Protocol additions to allow moving apps between queues -- Key: YARN-1496 URL: https://issues.apache.org/jira/browse/YARN-1496 Project: Hadoop YARN Issue Type: Sub-task Components: scheduler Reporter: Sandy Ryza Assignee: Sandy Ryza Attachments: YARN-1496-1.patch, YARN-1496-2.patch, YARN-1496-3.patch, YARN-1496-4.patch, YARN-1496.patch -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1506) Replace set resource change on RMNode/SchedulerNode directly with event notification.
[ https://issues.apache.org/jira/browse/YARN-1506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13864806#comment-13864806 ] Jian He commented on YARN-1506: --- bq. REBOOT - RUNNING for a rebooted node come back as running for accepting RECONNECTED/CLEAN_CONTAINER/AP Not sure about this. A restarted node seems to only trigger the RECONNECT event on register, and RMNode stays in RUNNING when receiving this event. bq. DECOMMISSIONED - RUNNING for a decommissioned node be recommissioned again Simply because we are not supporting recommission? bq. LOST - NEW/UNHEALTHY/DECOMMISSIONED for an expired node heartbeat again From the code, I can see the node is actually gone from the RM's point of view once the node expires. bq. UNHEALTHY - RUNNING for an unhealthy node report to be healthy again This is handled. Replace set resource change on RMNode/SchedulerNode directly with event notification. - Key: YARN-1506 URL: https://issues.apache.org/jira/browse/YARN-1506 Project: Hadoop YARN Issue Type: Sub-task Components: nodemanager, scheduler Reporter: Junping Du Assignee: Junping Du Priority: Blocker Attachments: YARN-1506-v1.patch, YARN-1506-v2.patch, YARN-1506-v3.patch, YARN-1506-v4.patch, YARN-1506-v5.patch According to Vinod's comments on YARN-312 (https://issues.apache.org/jira/browse/YARN-312?focusedCommentId=13846087page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13846087), we should replace RMNode.setResourceOption() with some resource change event. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1461) RM API and RM changes to handle tags for running jobs
[ https://issues.apache.org/jira/browse/YARN-1461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13864822#comment-13864822 ] Hadoop QA commented on YARN-1461: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12621867/yarn-1461-8.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 7 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: org.apache.hadoop.yarn.server.resourcemanager.TestApplicationACLs {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/2815//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2815//console This message is automatically generated. 
RM API and RM changes to handle tags for running jobs - Key: YARN-1461 URL: https://issues.apache.org/jira/browse/YARN-1461 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Affects Versions: 2.2.0 Reporter: Karthik Kambatla Assignee: Karthik Kambatla Attachments: yarn-1461-1.patch, yarn-1461-2.patch, yarn-1461-3.patch, yarn-1461-4.patch, yarn-1461-5.patch, yarn-1461-6.patch, yarn-1461-6.patch, yarn-1461-7.patch, yarn-1461-8.patch -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1496) Protocol additions to allow moving apps between queues
[ https://issues.apache.org/jira/browse/YARN-1496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13864849#comment-13864849 ] Sandy Ryza commented on YARN-1496: -- bq. 'Move' doesn't seem informative enough. Good point. How does ChangeApplicationQueue sound to you? bq. Also, we should not mark APIs stable till they are, well, stable. Let's mark them unstable to begin with. APIs marked stable can still change before they are included in a release, right? By marking them stable I mean that once we include them in a release they shouldn't be able to change. Only committing to trunk at this time to ensure they're not included in a release accidentally. Protocol additions to allow moving apps between queues -- Key: YARN-1496 URL: https://issues.apache.org/jira/browse/YARN-1496 Project: Hadoop YARN Issue Type: Sub-task Components: scheduler Reporter: Sandy Ryza Assignee: Sandy Ryza Attachments: YARN-1496-1.patch, YARN-1496-2.patch, YARN-1496-3.patch, YARN-1496-4.patch, YARN-1496.patch -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Assigned] (YARN-888) clean up POM dependencies
[ https://issues.apache.org/jira/browse/YARN-888?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alejandro Abdelnur reassigned YARN-888: --- Assignee: Alejandro Abdelnur (was: Roman Shaposhnik) Thanks Roman, I'll be posting the patch momentarily. If you have time to review it, it would be great. clean up POM dependencies - Key: YARN-888 URL: https://issues.apache.org/jira/browse/YARN-888 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.1.0-beta Reporter: Alejandro Abdelnur Assignee: Alejandro Abdelnur Intermediate 'pom' modules define dependencies inherited by leaf modules. This is causing issues in intellij IDE. We should normalize the leaf modules like in common, hdfs and tools where all dependencies are defined in each leaf module and the intermediate 'pom' module do not define any dependency. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (YARN-888) clean up POM dependencies
[ https://issues.apache.org/jira/browse/YARN-888?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alejandro Abdelnur updated YARN-888: Attachment: YARN-888.patch The patch moves all the dependencies to the leaf projects, declaring explicitly what each module needs (I used the dependency:analyze plugin to zero in on that, and commented in the POMs the dependencies the plugin did not catch as used). I also did a DIST build and verified the JARs in the DIST are all the same (with the exception of the yarn-site JAR, which is gone now that the project is of type 'pom'). I also verified IntelliJ now works fine compiling and running testcases. clean up POM dependencies - Key: YARN-888 URL: https://issues.apache.org/jira/browse/YARN-888 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.1.0-beta Reporter: Alejandro Abdelnur Assignee: Alejandro Abdelnur Attachments: YARN-888.patch Intermediate 'pom' modules define dependencies inherited by leaf modules. This is causing issues in the IntelliJ IDE. We should normalize the leaf modules like in common, hdfs and tools, where all dependencies are defined in each leaf module and the intermediate 'pom' modules do not define any dependency. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1033) Expose RM active/standby state to web UI and metrics
[ https://issues.apache.org/jira/browse/YARN-1033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13864892#comment-13864892 ] Karthik Kambatla commented on YARN-1033: Hey [~nemon]. Are you still planning to work on this? Otherwise, I would like to take a stab at it. Expose RM active/standby state to web UI and metrics Key: YARN-1033 URL: https://issues.apache.org/jira/browse/YARN-1033 Project: Hadoop YARN Issue Type: Sub-task Affects Versions: 2.1.0-beta Reporter: Nemon Lou Assignee: Nemon Lou Both the active and standby RM shall expose their web server and show their current state (active or standby) on the web page. Cluster metrics also need this state for monitoring. Standby RM web services shall refuse client requests unless querying for RM state. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1568) Rename clusterid to clusterId in ActiveRMInfoProto
[ https://issues.apache.org/jira/browse/YARN-1568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13864902#comment-13864902 ] Hadoop QA commented on YARN-1568: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12621871/yarn-1568-1.patch against trunk revision . {color:red}-1 patch{color}. Trunk compilation may be broken. Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2817//console This message is automatically generated. Rename clusterid to clusterId in ActiveRMInfoProto --- Key: YARN-1568 URL: https://issues.apache.org/jira/browse/YARN-1568 Project: Hadoop YARN Issue Type: Task Components: resourcemanager Affects Versions: 2.4.0 Reporter: Karthik Kambatla Assignee: Karthik Kambatla Priority: Trivial Attachments: yarn-1568-1.patch YARN-1029 introduces ActiveRMInfoProto - just realized it defines a field clusterid, which is inconsistent with other fields. Better to fix it immediately than leave the inconsistency. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (YARN-1568) Rename clusterid to clusterId in ActiveRMInfoProto
[ https://issues.apache.org/jira/browse/YARN-1568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla updated YARN-1568: --- Attachment: yarn-1568-1.patch Re-submitting patch. Rename clusterid to clusterId in ActiveRMInfoProto --- Key: YARN-1568 URL: https://issues.apache.org/jira/browse/YARN-1568 Project: Hadoop YARN Issue Type: Task Components: resourcemanager Affects Versions: 2.4.0 Reporter: Karthik Kambatla Assignee: Karthik Kambatla Priority: Trivial Attachments: yarn-1568-1.patch, yarn-1568-1.patch YARN-1029 introduces ActiveRMInfoProto - just realized it defines a field clusterid, which is inconsistent with other fields. Better to fix it immediately than leave the inconsistency. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-888) clean up POM dependencies
[ https://issues.apache.org/jira/browse/YARN-888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13864909#comment-13864909 ] Hadoop QA commented on YARN-888: {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12621881/YARN-888.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. 
The patch failed these unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-unmanaged-am-launcher hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-web-proxy: org.apache.hadoop.yarn.server.nodemanager.webapp.TestNMWebServicesApps org.apache.hadoop.yarn.server.nodemanager.webapp.TestNMWebServicesContainers org.apache.hadoop.yarn.server.nodemanager.webapp.TestNMWebServices The test build failed in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-web-proxy hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-unmanaged-am-launcher hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/2816//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2816//console This message is automatically generated. 
clean up POM dependencies - Key: YARN-888 URL: https://issues.apache.org/jira/browse/YARN-888 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.1.0-beta Reporter: Alejandro Abdelnur Assignee: Alejandro Abdelnur Attachments: YARN-888.patch Intermediate 'pom' modules define dependencies inherited by leaf modules. This is causing issues in intellij IDE. We should normalize the leaf modules like in common, hdfs and tools where all dependencies are defined in each leaf module and the intermediate 'pom' module do not define any dependency. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (YARN-1410) Handle client failover during 2 step client API's like app submission
[ https://issues.apache.org/jira/browse/YARN-1410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhijie Shen updated YARN-1410: -- Remaining Estimate: 48h Original Estimate: 48h Handle client failover during 2 step client API's like app submission - Key: YARN-1410 URL: https://issues.apache.org/jira/browse/YARN-1410 Project: Hadoop YARN Issue Type: Sub-task Reporter: Bikas Saha Assignee: Xuan Gong Attachments: YARN-1410.1.patch Original Estimate: 48h Remaining Estimate: 48h App submission involves 1) creating appId 2) using that appId to submit an ApplicationSubmissionContext to the user. The client may have obtained an appId from an RM, the RM may have failed over, and the client may submit the app to the new RM. Since the new RM has a different notion of cluster timestamp (used to create app id) the new RM may reject the app submission resulting in unexpected failure on the client side. The same may happen for other 2 step client API operations. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (YARN-1041) RM to bind and notify a restarted AM of existing containers
[ https://issues.apache.org/jira/browse/YARN-1041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jian He updated YARN-1041: -- Attachment: YARN-1041.1.patch Uploaded a patch for changing the AM protocols to get the previously running containers on registration. The uploaded patch is based on YARN-1490 and may not apply cleanly for now; just to give an early view of the patch. RM to bind and notify a restarted AM of existing containers --- Key: YARN-1041 URL: https://issues.apache.org/jira/browse/YARN-1041 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Affects Versions: 3.0.0 Reporter: Steve Loughran Assignee: Jian He Attachments: YARN-1041.1.patch For long lived containers we don't want the AM to be a SPOF. When the RM restarts a (failed) AM, it should be given the list of containers it had already been allocated. The AM should then be able to contact the NMs to get details on them. NMs would also need to do any binding of the containers needed to handle a moved/restarted AM. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (YARN-1041) Protocol changes for RM to bind and notify a restarted AM of existing containers
[ https://issues.apache.org/jira/browse/YARN-1041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jian He updated YARN-1041: -- Summary: Protocol changes for RM to bind and notify a restarted AM of existing containers (was: RM to bind and notify a restarted AM of existing containers) Protocol changes for RM to bind and notify a restarted AM of existing containers Key: YARN-1041 URL: https://issues.apache.org/jira/browse/YARN-1041 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Affects Versions: 3.0.0 Reporter: Steve Loughran Assignee: Jian He Attachments: YARN-1041.1.patch For long lived containers we don't want the AM to be a SPOF. When the RM restarts a (failed) AM, it should be given the list of containers it had already been allocated. the AM should then be able to contact the NMs to get details on them. NMs would also need to do any binding of the containers needed to handle a moved/restarted AM. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1568) Rename clusterid to clusterId in ActiveRMInfoProto
[ https://issues.apache.org/jira/browse/YARN-1568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13864934#comment-13864934 ] Hadoop QA commented on YARN-1568: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12621887/yarn-1568-1.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/2818//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2818//console This message is automatically generated. 
Rename clusterid to clusterId in ActiveRMInfoProto --- Key: YARN-1568 URL: https://issues.apache.org/jira/browse/YARN-1568 Project: Hadoop YARN Issue Type: Task Components: resourcemanager Affects Versions: 2.4.0 Reporter: Karthik Kambatla Assignee: Karthik Kambatla Priority: Trivial Attachments: yarn-1568-1.patch, yarn-1568-1.patch YARN-1029 introduces ActiveRMInfoProto - just realized it defines a field clusterid, which is inconsistent with other fields. Better to fix it immediately than leave the inconsistency. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (YARN-1033) Expose RM active/standby state to web UI and metrics
[ https://issues.apache.org/jira/browse/YARN-1033?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nemon Lou updated YARN-1033: Assignee: Karthik Kambatla (was: Nemon Lou) Expose RM active/standby state to web UI and metrics Key: YARN-1033 URL: https://issues.apache.org/jira/browse/YARN-1033 Project: Hadoop YARN Issue Type: Sub-task Affects Versions: 2.1.0-beta Reporter: Nemon Lou Assignee: Karthik Kambatla Both the active and standby RM shall expose their web server and show their current state (active or standby) on the web page. Cluster metrics also need this state for monitoring. Standby RM web services shall refuse client requests unless the query is for the RM state. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1033) Expose RM active/standby state to web UI and metrics
[ https://issues.apache.org/jira/browse/YARN-1033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13864948#comment-13864948 ] Nemon Lou commented on YARN-1033: - Hi Karthik Kambatla, feel free to take it. :) Thanks Expose RM active/standby state to web UI and metrics Key: YARN-1033 URL: https://issues.apache.org/jira/browse/YARN-1033 Project: Hadoop YARN Issue Type: Sub-task Affects Versions: 2.1.0-beta Reporter: Nemon Lou Assignee: Nemon Lou Both the active and standby RM shall expose their web server and show their current state (active or standby) on the web page. Cluster metrics also need this state for monitoring. Standby RM web services shall refuse client requests unless the query is for the RM state. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1166) YARN 'appsFailed' metric should be of type 'counter'
[ https://issues.apache.org/jira/browse/YARN-1166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13864969#comment-13864969 ] Jian He commented on YARN-1166: --- patch looks good overall. - In FairScheduler, log if application == null? {code} private synchronized void removeApplication(ApplicationId applicationId, RMAppState finalState) { SchedulerApplication application = applications.get(applicationId); if (application == null){ return; } {code} - There are things other than queue metrics. For example, LeafQueue.activeApplications and pendingApplications. These two are actually recording the attempts. But I remember those two are exposed on the scheduler UI as schedulable and non-schedulable apps. Can you check if these two collections also need to be associated with the application? YARN 'appsFailed' metric should be of type 'counter' Key: YARN-1166 URL: https://issues.apache.org/jira/browse/YARN-1166 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 2.1.0-beta Reporter: Srimanth Gunturi Assignee: Zhijie Shen Priority: Blocker Attachments: YARN-1166.2.patch, YARN-1166.3.patch, YARN-1166.4.patch, YARN-1166.5.patch, YARN-1166.6.patch, YARN-1166.patch Currently in YARN's queue metrics, the cumulative metric 'appsFailed' is of type 'gauge' - which means the exact value will be reported. All other cumulative queue metrics (AppsSubmitted, AppsCompleted, AppsKilled) are of type 'counter' - meaning Ganglia will use the slope to provide deltas between time-points. To be consistent, the AppsFailed metric should also be of type 'counter'. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
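The review suggestion above (log when application == null instead of returning silently) can be sketched in isolation. The class, map, and string-buffer "logger" below are simplified stand-ins for the real FairScheduler code, not the actual YARN classes:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of the review suggestion: log before bailing out when the
// application is unknown, rather than returning silently. All names here
// are illustrative stand-ins, not the actual scheduler code.
public class RemoveApplicationSketch {
    private final Map<String, Object> applications = new HashMap<>();
    private final StringBuilder log = new StringBuilder(); // stand-in for a real logger

    public synchronized void removeApplication(String applicationId) {
        Object application = applications.get(applicationId);
        if (application == null) {
            // The reviewer asks for a log line here instead of a silent return.
            log.append("Unknown application ").append(applicationId)
               .append(" has completed!\n");
            return;
        }
        applications.remove(applicationId);
    }

    public String getLog() {
        return log.toString();
    }

    public static void main(String[] args) {
        RemoveApplicationSketch scheduler = new RemoveApplicationSketch();
        scheduler.removeApplication("app_001"); // never added, so the guard fires
        System.out.print(scheduler.getLog());
    }
}
```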
[jira] [Commented] (YARN-1568) Rename clusterid to clusterId in ActiveRMInfoProto
[ https://issues.apache.org/jira/browse/YARN-1568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13864987#comment-13864987 ] Karthik Kambatla commented on YARN-1568: Didn't add tests as the patch just changes a field name. Thanks for the review, Sandy. Will commit this later today. Rename clusterid to clusterId in ActiveRMInfoProto --- Key: YARN-1568 URL: https://issues.apache.org/jira/browse/YARN-1568 Project: Hadoop YARN Issue Type: Task Components: resourcemanager Affects Versions: 2.4.0 Reporter: Karthik Kambatla Assignee: Karthik Kambatla Priority: Trivial Attachments: yarn-1568-1.patch, yarn-1568-1.patch YARN-1029 introduces ActiveRMInfoProto - just realized it defines a field clusterid, which is inconsistent with other fields. Better to fix it immediately than leave the inconsistency. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1506) Replace set resource change on RMNode/SchedulerNode directly with event notification.
[ https://issues.apache.org/jira/browse/YARN-1506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13864989#comment-13864989 ] Junping Du commented on YARN-1506: -- Hi [~jianhe], Thanks again for your review and comments: bq. Instead of the check here, I think we can send the event and make RMNodeTransition to ignore this event. This can prevent the case that isUnusable return true right before the node is about to become usable, since the events will be processed sequentially. Good point. We should just let the event mechanism handle this concurrency issue. bq. Did we have an overall test for testing AdminService to send the request and verify RMNode and schedulerNode are changed accordingly? No system test with this patch yet, just some unit tests. However, I did some integration tests on previous patches in YARN-291 with a raw patch of YARN-313 (patch with admin CLI) and found it works well. More integration tests will come with YARN-313 (the next and last patch on YARN-291 targeted for the 2.4 branch). Make sense? bq. [REBOOT -> RUNNING] not sure about this. A restart node seems only trigger the RECONNECT event on register and RMNode stays on RUNNING when receiving this event. The interesting thing here is that DeactivateNodeTransition will be triggered on RUNNING -> REBOOT, so the node will be removed from RMContext.nodes and put into RMContext.inactiveNodes. So on the next registration, the event is sent as START instead of RECONNECT, and nothing happens because we have no state-machine transition triggered from REBOOT on a START event. We should fix it, shouldn't we? bq. [DECOMMISSIONED -> RUNNING] simply because we are not supporting recommission? Yes. IMO, recommission is a *must*-have if we claim YARN supports decommission. bq. [LOST -> NEW/UNHEALTHY/DECOMMISSIONED] from the code, I can see the node is actually gone from RM's point of view once the node expires The node just goes to RMContext.inactiveNodes.
But it is possible for the node to heartbeat with a status update again (in cases like a network outage that comes back, a node VM that was suspended or frozen, unsynchronized clocks, etc.) after its status is put into LOST, and we don't have any code to handle this. We should fix it, shouldn't we? It seems to me that many state transitions are missing in the cases discussed above; we can file a separate JIRA to address this. Thoughts? Replace set resource change on RMNode/SchedulerNode directly with event notification. - Key: YARN-1506 URL: https://issues.apache.org/jira/browse/YARN-1506 Project: Hadoop YARN Issue Type: Sub-task Components: nodemanager, scheduler Reporter: Junping Du Assignee: Junping Du Priority: Blocker Attachments: YARN-1506-v1.patch, YARN-1506-v2.patch, YARN-1506-v3.patch, YARN-1506-v4.patch, YARN-1506-v5.patch According to Vinod's comments on YARN-312 (https://issues.apache.org/jira/browse/YARN-312?focusedCommentId=13846087&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13846087), we should replace RMNode.setResourceOption() with some resource change event. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
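The design point being settled in this thread (post an event and let the dispatcher deliver it sequentially, instead of calling a setter directly) can be sketched with a toy dispatcher. All class names below are illustrative, not YARN's actual RMNode or event types:

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Toy sketch of replacing a direct setResourceOption()-style call with an
// event notification. The queue stands in for YARN's async dispatcher; the
// class names are illustrative, not the actual YARN types.
public class ResourceEventSketch {

    static class NodeResourceUpdateEvent {
        final int newMemoryMB;
        NodeResourceUpdateEvent(int newMemoryMB) { this.newMemoryMB = newMemoryMB; }
    }

    static class ToyRMNode {
        int memoryMB = 8192;
        // State changes only happen inside the event handler.
        void handle(NodeResourceUpdateEvent event) { memoryMB = event.newMemoryMB; }
    }

    public static void main(String[] args) {
        Queue<NodeResourceUpdateEvent> dispatcher = new ArrayDeque<>();
        ToyRMNode node = new ToyRMNode();

        // An admin request enqueues an event instead of mutating the node directly...
        dispatcher.add(new NodeResourceUpdateEvent(16384));

        // ...and events are processed one at a time, which avoids races with
        // concurrent state transitions (the point made in the comment above).
        while (!dispatcher.isEmpty()) {
            node.handle(dispatcher.poll());
        }
        System.out.println(node.memoryMB);
    }
}
```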
[jira] [Updated] (YARN-1569) For handle(SchedulerEvent) in FifoScheduler and CapacityScheduler, SchedulerEvent should get checked (instanceof) for appropriate type before casting
[ https://issues.apache.org/jira/browse/YARN-1569?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated YARN-1569: - Description: As per http://wiki.apache.org/hadoop/CodeReviewChecklist, we should always check for the appropriate type before casting. handle(SchedulerEvent) in FifoScheduler and CapacityScheduler doesn't check so far (no bug there now) but should be improved to check, as FairScheduler does. was:As following: http://wiki.apache.org/hadoop/CodeReviewChecklist, we should always check appropriate type before casting. handle(SchedulerEvent) in FifoScheduler and CapacityScheduler didn't check so far (no bug there now) but should be improved as FairScheduler. For handle(SchedulerEvent) in FifoScheduler and CapacityScheduler, SchedulerEvent should get checked (instanceof) for appropriate type before casting - Key: YARN-1569 URL: https://issues.apache.org/jira/browse/YARN-1569 Project: Hadoop YARN Issue Type: Improvement Components: scheduler Reporter: Junping Du Priority: Minor Labels: newbie As per http://wiki.apache.org/hadoop/CodeReviewChecklist, we should always check for the appropriate type before casting. handle(SchedulerEvent) in FifoScheduler and CapacityScheduler doesn't check so far (no bug there now) but should be improved to check, as FairScheduler does. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (YARN-1569) For handle(SchedulerEvent) in FifoScheduler and CapacityScheduler, SchedulerEvent should get checked (instanceof) for appropriate type before casting
[ https://issues.apache.org/jira/browse/YARN-1569?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated YARN-1569: - Summary: For handle(SchedulerEvent) in FifoScheduler and CapacityScheduler, SchedulerEvent should get checked (instanceof) for appropriate type before casting (was: For handl(SchedulerEvent) in FifoScheduler and CapacityScheduler, SchedulerEvent should get checked (instanceof) for appropriate type before casting) For handle(SchedulerEvent) in FifoScheduler and CapacityScheduler, SchedulerEvent should get checked (instanceof) for appropriate type before casting - Key: YARN-1569 URL: https://issues.apache.org/jira/browse/YARN-1569 Project: Hadoop YARN Issue Type: Improvement Components: scheduler Reporter: Junping Du Priority: Minor Labels: newbie As per http://wiki.apache.org/hadoop/CodeReviewChecklist, we should always check for the appropriate type before casting. handle(SchedulerEvent) in FifoScheduler and CapacityScheduler doesn't check so far (no bug there now) but should be improved to check, as FairScheduler does. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Created] (YARN-1569) For handl(SchedulerEvent) in FifoScheduler and CapacityScheduler, SchedulerEvent should get checked (instanceof) for appropriate type before casting
Junping Du created YARN-1569: Summary: For handl(SchedulerEvent) in FifoScheduler and CapacityScheduler, SchedulerEvent should get checked (instanceof) for appropriate type before casting Key: YARN-1569 URL: https://issues.apache.org/jira/browse/YARN-1569 Project: Hadoop YARN Issue Type: Improvement Components: scheduler Reporter: Junping Du Priority: Minor As per http://wiki.apache.org/hadoop/CodeReviewChecklist, we should always check for the appropriate type before casting. handle(SchedulerEvent) in FifoScheduler and CapacityScheduler doesn't check so far (no bug there now) but should be improved to check, as FairScheduler does. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
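The improvement this issue asks for can be illustrated with a minimal sketch of the instanceof-before-cast pattern. The event classes below are simplified stand-ins for YARN's actual scheduler event hierarchy:

```java
// Minimal sketch of the instanceof-before-cast pattern the issue asks for,
// modeled on how FairScheduler guards its casts. The event classes are
// simplified stand-ins, not YARN's actual scheduler events.
abstract class SchedulerEvent { }

class NodeAddedSchedulerEvent extends SchedulerEvent {
    final String nodeId;
    NodeAddedSchedulerEvent(String nodeId) { this.nodeId = nodeId; }
}

class NodeRemovedSchedulerEvent extends SchedulerEvent { }

public class InstanceofCheckSketch {
    static String handle(SchedulerEvent event) {
        // Verify the concrete type before casting instead of casting blindly.
        if (event instanceof NodeAddedSchedulerEvent) {
            NodeAddedSchedulerEvent nodeAdded = (NodeAddedSchedulerEvent) event;
            return "added " + nodeAdded.nodeId;
        }
        // An unexpected subtype is reported explicitly rather than surfacing
        // later as a ClassCastException.
        throw new RuntimeException("Unexpected event type: " + event.getClass());
    }

    public static void main(String[] args) {
        System.out.println(handle(new NodeAddedSchedulerEvent("node1:8041")));
    }
}
```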
[jira] [Commented] (YARN-1568) Rename clusterid to clusterId in ActiveRMInfoProto
[ https://issues.apache.org/jira/browse/YARN-1568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13865034#comment-13865034 ] Hudson commented on YARN-1568: -- SUCCESS: Integrated in Hadoop-trunk-Commit #4974 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/4974/]) YARN-1568. Rename clusterid to clusterId in ActiveRMInfoProto (kasha) (kasha: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1556435) * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/proto/server/yarn_server_resourcemanager_service_protos.proto * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/EmbeddedElectorService.java Rename clusterid to clusterId in ActiveRMInfoProto --- Key: YARN-1568 URL: https://issues.apache.org/jira/browse/YARN-1568 Project: Hadoop YARN Issue Type: Task Components: resourcemanager Affects Versions: 2.4.0 Reporter: Karthik Kambatla Assignee: Karthik Kambatla Priority: Trivial Fix For: 2.4.0 Attachments: yarn-1568-1.patch, yarn-1568-1.patch YARN-1029 introduces ActiveRMInfoProto - just realized it defines a field clusterid, which is inconsistent with other fields. Better to fix it immediately than leave the inconsistency. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1531) Update yarn command document
[ https://issues.apache.org/jira/browse/YARN-1531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13865050#comment-13865050 ] Akira AJISAKA commented on YARN-1531: - [~kkambatl], thanks for your comment! I'll split the patch. Update yarn command document Key: YARN-1531 URL: https://issues.apache.org/jira/browse/YARN-1531 Project: Hadoop YARN Issue Type: Bug Components: documentation Reporter: Akira AJISAKA Assignee: Akira AJISAKA Labels: documentaion Attachments: YARN-1531.patch There are some options which are not written to Yarn Command document. For example, yarn rmadmin command options are as follows: {code} Usage: yarn rmadmin -refreshQueues -refreshNodes -refreshSuperUserGroupsConfiguration -refreshUserToGroupsMappings -refreshAdminAcls -refreshServiceAcl -getGroups [username] -help [cmd] -transitionToActive serviceId -transitionToStandby serviceId -failover [--forcefence] [--forceactive] serviceId serviceId -getServiceState serviceId -checkHealth serviceId {code} But some of the new options such as -getGroups, -transitionToActive, and -transitionToStandby are not documented. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (YARN-1531) Update yarn command document
[ https://issues.apache.org/jira/browse/YARN-1531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akira AJISAKA updated YARN-1531: Attachment: YARN-1531.2.patch Attaching a patch with everything except the formatting changes. Update yarn command document Key: YARN-1531 URL: https://issues.apache.org/jira/browse/YARN-1531 Project: Hadoop YARN Issue Type: Bug Components: documentation Reporter: Akira AJISAKA Assignee: Akira AJISAKA Labels: documentaion Attachments: YARN-1531.2.patch, YARN-1531.patch There are some options which are not written to Yarn Command document. For example, yarn rmadmin command options are as follows: {code} Usage: yarn rmadmin -refreshQueues -refreshNodes -refreshSuperUserGroupsConfiguration -refreshUserToGroupsMappings -refreshAdminAcls -refreshServiceAcl -getGroups [username] -help [cmd] -transitionToActive serviceId -transitionToStandby serviceId -failover [--forcefence] [--forceactive] serviceId serviceId -getServiceState serviceId -checkHealth serviceId {code} But some of the new options such as -getGroups, -transitionToActive, and -transitionToStandby are not documented. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1531) Update yarn command document
[ https://issues.apache.org/jira/browse/YARN-1531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13865083#comment-13865083 ] Hadoop QA commented on YARN-1531: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12621919/YARN-1531.2.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+0 tests included{color}. The patch appears to be a documentation patch that doesn't require tests. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/2819//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2819//console This message is automatically generated. Update yarn command document Key: YARN-1531 URL: https://issues.apache.org/jira/browse/YARN-1531 Project: Hadoop YARN Issue Type: Bug Components: documentation Reporter: Akira AJISAKA Assignee: Akira AJISAKA Labels: documentaion Attachments: YARN-1531.2.patch, YARN-1531.patch There are some options which are not written to Yarn Command document. 
For example, yarn rmadmin command options are as follows: {code} Usage: yarn rmadmin -refreshQueues -refreshNodes -refreshSuperUserGroupsConfiguration -refreshUserToGroupsMappings -refreshAdminAcls -refreshServiceAcl -getGroups [username] -help [cmd] -transitionToActive serviceId -transitionToStandby serviceId -failover [--forcefence] [--forceactive] serviceId serviceId -getServiceState serviceId -checkHealth serviceId {code} But some of the new options such as -getGroups, -transitionToActive, and -transitionToStandby are not documented. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1293) TestContainerLaunch.testInvalidEnvSyntaxDiagnostics fails on trunk
[ https://issues.apache.org/jira/browse/YARN-1293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13865107#comment-13865107 ] Akira AJISAKA commented on YARN-1293: - Thanks [~jianhe] and [~ozawa]! TestContainerLaunch.testInvalidEnvSyntaxDiagnostics fails on trunk -- Key: YARN-1293 URL: https://issues.apache.org/jira/browse/YARN-1293 Project: Hadoop YARN Issue Type: Bug Environment: linux Reporter: Tsuyoshi OZAWA Assignee: Tsuyoshi OZAWA Fix For: 2.3.0 Attachments: YARN-1293.1.patch {quote} --- Test set: org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.TestContainerLaunch --- Tests run: 8, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 12.655 sec FAILURE! - in org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.TestContainerLaunch testInvalidEnvSyntaxDiagnostics(org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.TestContainerLaunch) Time elapsed: 0.114 sec FAILURE! junit.framework.AssertionFailedError: null at junit.framework.Assert.fail(Assert.java:48) at junit.framework.Assert.assertTrue(Assert.java:20) at junit.framework.Assert.assertTrue(Assert.java:27) at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.TestContainerLaunch.testInvalidEnvSyntaxDiagnostics(TestContainerLaunch.java:273) {quote} -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Created] (YARN-1570) Formatting the lines within 80 chars in YarnCommands.apt.vm
Akira AJISAKA created YARN-1570: --- Summary: Formatting the lines within 80 chars in YarnCommands.apt.vm Key: YARN-1570 URL: https://issues.apache.org/jira/browse/YARN-1570 Project: Hadoop YARN Issue Type: Improvement Components: documentation Affects Versions: 2.2.0 Reporter: Akira AJISAKA Priority: Minor Fix For: 2.4.0 In YarnCommands.apt.vm, there are some lines longer than 80 characters. For example: {code} Yarn commands are invoked by the bin/yarn script. Running the yarn script without any arguments prints the description for all commands. {code} -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1531) Update yarn command document
[ https://issues.apache.org/jira/browse/YARN-1531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13865114#comment-13865114 ] Akira AJISAKA commented on YARN-1531: - Created YARN-1570 for formatting. Update yarn command document Key: YARN-1531 URL: https://issues.apache.org/jira/browse/YARN-1531 Project: Hadoop YARN Issue Type: Bug Components: documentation Reporter: Akira AJISAKA Assignee: Akira AJISAKA Labels: documentaion Attachments: YARN-1531.2.patch, YARN-1531.patch There are some options which are not written to Yarn Command document. For example, yarn rmadmin command options are as follows: {code} Usage: yarn rmadmin -refreshQueues -refreshNodes -refreshSuperUserGroupsConfiguration -refreshUserToGroupsMappings -refreshAdminAcls -refreshServiceAcl -getGroups [username] -help [cmd] -transitionToActive serviceId -transitionToStandby serviceId -failover [--forcefence] [--forceactive] serviceId serviceId -getServiceState serviceId -checkHealth serviceId {code} But some of the new options such as -getGroups, -transitionToActive, and -transitionToStandby are not documented. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1041) Protocol changes for RM to bind and notify a restarted AM of existing containers
[ https://issues.apache.org/jira/browse/YARN-1041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13865123#comment-13865123 ] Sandy Ryza commented on YARN-1041: -- Took a look at the Fair Scheduler changes. A couple of nits: {code} +ContainerId amContainerId = +this.rmContext.getRMApps().get(applicationId).getCurrentAppAttempt() + .getMasterContainer().getId(); {code} Other references to rmContext in this file do not use the this. prefix. {code} + private SchedulerApplicationAttempt getCurrentAttemptForContainer( ContainerId containerId) { SchedulerApplication app = @@ -1361,5 +1384,4 @@ public void onReload(AllocationConfiguration queueInfo) { queue.collectSchedulerApplications(apps); return apps; } - {code} Spurious whitespace changes Protocol changes for RM to bind and notify a restarted AM of existing containers Key: YARN-1041 URL: https://issues.apache.org/jira/browse/YARN-1041 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Affects Versions: 3.0.0 Reporter: Steve Loughran Assignee: Jian He Attachments: YARN-1041.1.patch For long-lived containers we don't want the AM to be a SPOF. When the RM restarts a (failed) AM, it should be given the list of containers it had already been allocated. The AM should then be able to contact the NMs to get details on them. NMs would also need to do any binding of the containers needed to handle a moved/restarted AM. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (YARN-1166) YARN 'appsFailed' metric should be of type 'counter'
[ https://issues.apache.org/jira/browse/YARN-1166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhijie Shen updated YARN-1166: -- Attachment: YARN-1166.7.patch bq. In FairScheduler, log if application == null ? Added the log not only for FairScheduler#removeApplication, but also for CapacityScheduler#doneApplication and FifoScheduler#doneApplication bq. There are things other than queue metrics. For example, LeafQueue.activeApplications and PendingApplications. These two are actually recording the attempts. But I remember those two are exposed on scheduler UI as schedulable and non-schedulable apps. Can you check if these two collections are also needed be associated with application ? As is mentioned in my last comment, active apps and pending apps change on app-attempt triggers. The two metrics may increase and decrease during the life cycle of an application given there are multiple attempts. YARN 'appsFailed' metric should be of type 'counter' Key: YARN-1166 URL: https://issues.apache.org/jira/browse/YARN-1166 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 2.1.0-beta Reporter: Srimanth Gunturi Assignee: Zhijie Shen Priority: Blocker Attachments: YARN-1166.2.patch, YARN-1166.3.patch, YARN-1166.4.patch, YARN-1166.5.patch, YARN-1166.6.patch, YARN-1166.7.patch, YARN-1166.patch Currently in YARN's queue metrics, the cumulative metric 'appsFailed' is of type 'gauge' - which means the exact value will be reported. All other cumulative queue metrics (AppsSubmitted, AppsCompleted, AppsKilled) are of type 'counter' - meaning Ganglia will use the slope to provide deltas between time-points. To be consistent, the AppsFailed metric should also be of type 'counter'. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1166) YARN 'appsFailed' metric should be of type 'counter'
[ https://issues.apache.org/jira/browse/YARN-1166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13865136#comment-13865136 ] Jian He commented on YARN-1166: --- bq. As is mentioned in my last comment, active apps and pending apps are changed with app-attempt trigger What I meant is the activeApplications and pendingApplications inside LeafQueue; these two also end up showing metrics on the scheduler UI, and they are different from the pending/running metrics of the QueueMetrics. YARN 'appsFailed' metric should be of type 'counter' Key: YARN-1166 URL: https://issues.apache.org/jira/browse/YARN-1166 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 2.1.0-beta Reporter: Srimanth Gunturi Assignee: Zhijie Shen Priority: Blocker Attachments: YARN-1166.2.patch, YARN-1166.3.patch, YARN-1166.4.patch, YARN-1166.5.patch, YARN-1166.6.patch, YARN-1166.7.patch, YARN-1166.patch Currently in YARN's queue metrics, the cumulative metric 'appsFailed' is of type 'gauge' - which means the exact value will be reported. All other cumulative queue metrics (AppsSubmitted, AppsCompleted, AppsKilled) are of type 'counter' - meaning Ganglia will use the slope to provide deltas between time-points. To be consistent, the AppsFailed metric should also be of type 'counter'. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-321) Generic application history service
[ https://issues.apache.org/jira/browse/YARN-321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13865140#comment-13865140 ] Shinichi Yamashita commented on YARN-321: - I reviewed the attached design document, and I have two questions about FileSystemApplicationHistoryStore. 1. Does it provide a way to set the maximum number of files and the maximum retention period of ApplicationHistory to store in HDFS? 2. When there are many ApplicationHistory files in HDFS, does it limit the number of ApplicationHistory files that are read? Generic application history service --- Key: YARN-321 URL: https://issues.apache.org/jira/browse/YARN-321 Project: Hadoop YARN Issue Type: Improvement Reporter: Luke Lu Attachments: AHS Diagram.pdf, ApplicationHistoryServiceHighLevel.pdf, Generic Application History - Design-20131219.pdf, HistoryStorageDemo.java The mapreduce job history server currently needs to be deployed as a trusted server in sync with the mapreduce runtime. Every new application would need a similar application history server. Having to deploy O(T*V) (where T is the number of application types and V is the number of application versions) trusted servers is clearly not scalable. Job history storage handling itself is pretty generic: move the logs and history data into a particular directory for later serving. Job history data is already stored as json (or binary avro). I propose that we create only one trusted application history server, which can have a generic UI (display json as a tree of strings) as well. Specific application/version can deploy untrusted webapps (a la AMs) to query the application history server and interpret the json for its specific UI and/or analytics. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1166) YARN 'appsFailed' metric should be of type 'counter'
[ https://issues.apache.org/jira/browse/YARN-1166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13865145#comment-13865145 ] Hadoop QA commented on YARN-1166: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12621931/YARN-1166.7.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 3 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/2820//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2820//console This message is automatically generated. 
YARN 'appsFailed' metric should be of type 'counter' Key: YARN-1166 URL: https://issues.apache.org/jira/browse/YARN-1166 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 2.1.0-beta Reporter: Srimanth Gunturi Assignee: Zhijie Shen Priority: Blocker Attachments: YARN-1166.2.patch, YARN-1166.3.patch, YARN-1166.4.patch, YARN-1166.5.patch, YARN-1166.6.patch, YARN-1166.7.patch, YARN-1166.patch Currently in YARN's queue metrics, the cumulative metric 'appsFailed' is of type 'gauge' - which means the exact value will be reported. All other cumulative queue metrics (AppsSubmitted, AppsCompleted, AppsKilled) are of type 'counter' - meaning Ganglia will use the slope to provide deltas between time-points. To be consistent, the AppsFailed metric should also be of type 'counter'. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Created] (YARN-1571) Don't allow periods in Fair Scheduler queue names
Sandy Ryza created YARN-1571: Summary: Don't allow periods in Fair Scheduler queue names Key: YARN-1571 URL: https://issues.apache.org/jira/browse/YARN-1571 Project: Hadoop YARN Issue Type: Bug Components: scheduler Reporter: Sandy Ryza Assignee: Sandy Ryza Periods can't be used in fair scheduler queue names because they're used as delimiters between queues and their parents. Maybe we should replace them with underscores or something. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
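The delimiter ambiguity behind YARN-1571 can be shown with a small sketch. The helper names here are hypothetical, not the actual Fair Scheduler code: once a dotted queue path is split on periods, a period inside a single queue's name is indistinguishable from a parent/child boundary.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of why periods in queue names are problematic:
// queues are addressed by dotted path (e.g. "root.eng.adhoc").
public class QueueNames {
    // Split a dotted queue path into its hierarchy components.
    static List<String> parsePath(String path) {
        return Arrays.asList(path.split("\\."));
    }

    // A validation rule in the spirit of the proposed fix: reject queue
    // names that contain the delimiter outright.
    static boolean isValidQueueName(String name) {
        return !name.isEmpty() && !name.contains(".");
    }

    public static void main(String[] args) {
        // A single queue named "adhoc.v2" under root.eng...
        String path = "root.eng.adhoc.v2";
        // ...parses as four levels, not three: the ambiguity the issue describes.
        System.out.println(parsePath(path));              // [root, eng, adhoc, v2]
        System.out.println(isValidQueueName("adhoc.v2")); // false
    }
}
```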
[jira] [Commented] (YARN-1166) YARN 'appsFailed' metric should be of type 'counter'
[ https://issues.apache.org/jira/browse/YARN-1166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13865154#comment-13865154 ] Zhijie Shen commented on YARN-1166: --- bq. What I meant is the activeApplications and PendingApplications inside LeafQueue, these two also end up showing metrics on the scheduler UI and these two are different from the pending/running metrics of the QueueMetrics. Those two metrics change as application attempts are added/activated/removed, which is similar to those in QueueMetrics. IMHO, it is reasonable that the pending/active metrics (whether in LeafQueue or QueueMetrics) are bound to application attempts, given that one application can have at most one attempt at any time. YARN 'appsFailed' metric should be of type 'counter' Key: YARN-1166 URL: https://issues.apache.org/jira/browse/YARN-1166 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 2.1.0-beta Reporter: Srimanth Gunturi Assignee: Zhijie Shen Priority: Blocker Attachments: YARN-1166.2.patch, YARN-1166.3.patch, YARN-1166.4.patch, YARN-1166.5.patch, YARN-1166.6.patch, YARN-1166.7.patch, YARN-1166.patch Currently in YARN's queue metrics, the cumulative metric 'appsFailed' is of type 'gauge' - which means the exact value will be reported. All other cumulative queue metrics (AppsSubmitted, AppsCompleted, AppsKilled) are of type 'counter' - meaning Ganglia will use the slope to provide deltas between time points. For consistency, the AppsFailed metric should also be of type 'counter'. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (YARN-1506) Replace set resource change on RMNode/SchedulerNode directly with event notification.
[ https://issues.apache.org/jira/browse/YARN-1506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated YARN-1506: - Attachment: YARN-1506-v6.patch Addressed [~jianhe]'s comments in the v6 patch. Replace set resource change on RMNode/SchedulerNode directly with event notification. - Key: YARN-1506 URL: https://issues.apache.org/jira/browse/YARN-1506 Project: Hadoop YARN Issue Type: Sub-task Components: nodemanager, scheduler Reporter: Junping Du Assignee: Junping Du Priority: Blocker Attachments: YARN-1506-v1.patch, YARN-1506-v2.patch, YARN-1506-v3.patch, YARN-1506-v4.patch, YARN-1506-v5.patch, YARN-1506-v6.patch According to Vinod's comments on YARN-312 (https://issues.apache.org/jira/browse/YARN-312?focusedCommentId=13846087&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13846087), we should replace RMNode.setResourceOption() with some resource change event. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1506) Replace set resource change on RMNode/SchedulerNode directly with event notification.
[ https://issues.apache.org/jira/browse/YARN-1506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13865168#comment-13865168 ] Hadoop QA commented on YARN-1506: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12621936/YARN-1506-v6.patch against trunk revision . {color:red}-1 patch{color}. Trunk compilation may be broken. Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2822//console This message is automatically generated. Replace set resource change on RMNode/SchedulerNode directly with event notification. - Key: YARN-1506 URL: https://issues.apache.org/jira/browse/YARN-1506 Project: Hadoop YARN Issue Type: Sub-task Components: nodemanager, scheduler Reporter: Junping Du Assignee: Junping Du Priority: Blocker Attachments: YARN-1506-v1.patch, YARN-1506-v2.patch, YARN-1506-v3.patch, YARN-1506-v4.patch, YARN-1506-v5.patch, YARN-1506-v6.patch According to Vinod's comments on YARN-312 (https://issues.apache.org/jira/browse/YARN-312?focusedCommentId=13846087&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13846087), we should replace RMNode.setResourceOption() with some resource change event. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (YARN-1506) Replace set resource change on RMNode/SchedulerNode directly with event notification.
[ https://issues.apache.org/jira/browse/YARN-1506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13865185#comment-13865185 ] Hadoop QA commented on YARN-1506: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12621936/YARN-1506-v6.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 3 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-tools/hadoop-sls hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/2821//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2821//console This message is automatically generated. Replace set resource change on RMNode/SchedulerNode directly with event notification. 
- Key: YARN-1506 URL: https://issues.apache.org/jira/browse/YARN-1506 Project: Hadoop YARN Issue Type: Sub-task Components: nodemanager, scheduler Reporter: Junping Du Assignee: Junping Du Priority: Blocker Attachments: YARN-1506-v1.patch, YARN-1506-v2.patch, YARN-1506-v3.patch, YARN-1506-v4.patch, YARN-1506-v5.patch, YARN-1506-v6.patch According to Vinod's comments on YARN-312 (https://issues.apache.org/jira/browse/YARN-312?focusedCommentId=13846087&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13846087), we should replace RMNode.setResourceOption() with some resource change event. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
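The direction proposed in YARN-1506, posting a resource-change event instead of calling RMNode.setResourceOption() directly, can be sketched as follows. These are toy classes standing in for YARN's node, event, and AsyncDispatcher types, not the actual implementation:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical sketch: the admin path posts an event and the node updates
// itself when the dispatcher delivers it, rather than exposing a raw setter.
public class ResourceEventSketch {
    record ResourceUpdateEvent(String nodeId, int newMemoryMB) {}

    static final class RMNode {
        final String id;
        int memoryMB;
        RMNode(String id, int memoryMB) { this.id = id; this.memoryMB = memoryMB; }
        // The node reacts to the event instead of being mutated from outside.
        void handle(ResourceUpdateEvent e) {
            if (e.nodeId().equals(id)) memoryMB = e.newMemoryMB();
        }
    }

    // A toy synchronous dispatcher; YARN's real dispatcher is asynchronous.
    static final class Dispatcher {
        private final List<Consumer<ResourceUpdateEvent>> handlers = new ArrayList<>();
        void register(Consumer<ResourceUpdateEvent> h) { handlers.add(h); }
        void dispatch(ResourceUpdateEvent e) { handlers.forEach(h -> h.accept(e)); }
    }

    public static void main(String[] args) {
        RMNode node = new RMNode("host1:8041", 8192);
        Dispatcher dispatcher = new Dispatcher();
        dispatcher.register(node::handle);
        // Admin resizes the node: an event is posted instead of calling a setter.
        dispatcher.dispatch(new ResourceUpdateEvent("host1:8041", 16384));
        System.out.println(node.memoryMB); // 16384
    }
}
```

Routing the change through the event path keeps state transitions on the node's own dispatcher thread, which is the usual reason this refactoring is preferred over direct mutation.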