[jira] [Updated] (YARN-584) In scheduler web UIs, queues unexpand on refresh
[ https://issues.apache.org/jira/browse/YARN-584?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sandy Ryza updated YARN-584:
Summary: In scheduler web UIs, queues unexpand on refresh (was: In fair scheduler web UI, queues unexpand on refresh)

In scheduler web UIs, queues unexpand on refresh
Key: YARN-584
URL: https://issues.apache.org/jira/browse/YARN-584
Project: Hadoop YARN
Issue Type: Bug
Components: scheduler
Affects Versions: 2.0.3-alpha
Reporter: Sandy Ryza
Assignee: Harshit Daga
Labels: newbie
Attachments: YARN-584-branch-2.2.0.patch, YARN-584-branch-2.2.0.patch, YARN-584-branch-2.2.0.patch, YARN-584-branch-2.2.0.patch, YARN-584-branch-2.2.0.patch

In the fair scheduler web UI, you can expand queue information. Refreshing the page causes the expansions to go away, which is annoying for someone who wants to monitor the scheduler page and needs to reopen all the queues they care about each time.

-- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (YARN-1423) Support queue placement by secondary group in the Fair Scheduler
Sandy Ryza created YARN-1423:

Support queue placement by secondary group in the Fair Scheduler
Key: YARN-1423
URL: https://issues.apache.org/jira/browse/YARN-1423
Project: Hadoop YARN
Issue Type: Improvement
Components: scheduler
Reporter: Sandy Ryza
[jira] [Assigned] (YARN-1420) TestRMContainerAllocator#testUpdatedNodes fails
[ https://issues.apache.org/jira/browse/YARN-1420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Akira AJISAKA reassigned YARN-1420:
Assignee: Akira AJISAKA

TestRMContainerAllocator#testUpdatedNodes fails
Key: YARN-1420
URL: https://issues.apache.org/jira/browse/YARN-1420
Project: Hadoop YARN
Issue Type: Test
Reporter: Ted Yu
Assignee: Akira AJISAKA

From https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1607/console :
{code}
Running org.apache.hadoop.mapreduce.v2.app.TestRMContainerAllocator
Tests run: 14, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 65.78 sec <<< FAILURE! - in org.apache.hadoop.mapreduce.v2.app.TestRMContainerAllocator
testUpdatedNodes(org.apache.hadoop.mapreduce.v2.app.TestRMContainerAllocator)  Time elapsed: 3.125 sec  <<< FAILURE!
junit.framework.AssertionFailedError: null
	at junit.framework.Assert.fail(Assert.java:48)
	at junit.framework.Assert.assertTrue(Assert.java:20)
	at junit.framework.Assert.assertTrue(Assert.java:27)
	at org.apache.hadoop.mapreduce.v2.app.TestRMContainerAllocator.testUpdatedNodes(TestRMContainerAllocator.java:779)
{code}
This assertion fails:
{code}
Assert.assertTrue(allocator.getJobUpdatedNodeEvents().isEmpty());
{code}
The List returned by allocator.getJobUpdatedNodeEvents() is: [EventType: JOB_UPDATED_NODES]
[jira] [Commented] (YARN-1420) TestRMContainerAllocator#testUpdatedNodes fails
[ https://issues.apache.org/jira/browse/YARN-1420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13826367#comment-13826367 ]

Akira AJISAKA commented on YARN-1420:
I reproduced this issue in my environment. I'm investigating why the list is not empty.
[jira] [Commented] (YARN-1420) TestRMContainerAllocator#testUpdatedNodes fails
[ https://issues.apache.org/jira/browse/YARN-1420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13826387#comment-13826387 ]

Akira AJISAKA commented on YARN-1420:
This issue seems to be a duplicate of MAPREDUCE-5427. [~yuzhih...@gmail.com], may I close this issue?
[jira] [Updated] (YARN-1420) TestRMContainerAllocator#testUpdatedNodes fails
[ https://issues.apache.org/jira/browse/YARN-1420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Akira AJISAKA updated YARN-1420:
Assignee: (was: Akira AJISAKA)
[jira] [Commented] (YARN-1419) TestFifoScheduler.testAppAttemptMetrics fails intermittently under jdk7
[ https://issues.apache.org/jira/browse/YARN-1419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13826407#comment-13826407 ]

Hudson commented on YARN-1419:
FAILURE: Integrated in Hadoop-Hdfs-0.23-Build #795 (See [https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/795/])
svn merge -c 1543117 FIXES: YARN-1419. TestFifoScheduler.testAppAttemptMetrics fails intermittently under jdk7. Contributed by Jonathan Eagles (jlowe: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1543122)
* /hadoop/common/branches/branch-0.23/hadoop-yarn-project/CHANGES.txt
* /hadoop/common/branches/branch-0.23/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fifo/TestFifoScheduler.java

TestFifoScheduler.testAppAttemptMetrics fails intermittently under jdk7
Key: YARN-1419
URL: https://issues.apache.org/jira/browse/YARN-1419
Project: Hadoop YARN
Issue Type: Bug
Components: scheduler
Affects Versions: 3.0.0, 2.3.0, 0.23.10
Reporter: Jonathan Eagles
Assignee: Jonathan Eagles
Priority: Minor
Labels: java7
Fix For: 3.0.0, 2.3.0, 0.23.10
Attachments: YARN-1419.patch, YARN-1419.patch

QueueMetrics holds its data in a static variable, causing metrics to bleed over from test to test. clearQueueMetrics is to be called by tests that need to measure metrics correctly for a single test. jdk7 comes into play since tests are run out of order, and in this case that makes the metrics unreliable.
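The failure mode described above, static state surviving from one test to the next so that results depend on execution order, can be reproduced with a minimal sketch. This is not Hadoop's real QueueMetrics; the class and method names below are illustrative stand-ins, with `clear()` playing the role of the `clearQueueMetrics` reset the fix relies on.

```java
// Sketch of static-state bleed between tests, assuming a simplified
// metrics registry (not the real org.apache.hadoop QueueMetrics).
final class FakeQueueMetrics {
    // Static field: survives across "tests", like QueueMetrics' static registry.
    static int appsSubmitted = 0;

    static void submitApp() { appsSubmitted++; }

    // Analogue of the clearQueueMetrics() call used to isolate tests.
    static void clear() { appsSubmitted = 0; }
}

public class MetricsBleedDemo {
    // "Test" A submits one app and expects to observe exactly one.
    static int testA() {
        FakeQueueMetrics.submitApp();
        return FakeQueueMetrics.appsSubmitted;
    }

    // "Test" B does the same; without a reset it also sees A's leftover count.
    static int testB() {
        FakeQueueMetrics.submitApp();
        return FakeQueueMetrics.appsSubmitted;
    }

    public static void main(String[] args) {
        System.out.println("A sees " + testA());               // 1
        System.out.println("B without reset sees " + testB()); // 2: bled over from A
        FakeQueueMetrics.clear();
        System.out.println("B after clear sees " + testB());   // 1: isolated again
    }
}
```

Under jdk7's unordered test execution, whichever "test" runs second sees the wrong count unless the reset runs between them, which is exactly why the intermittent failure appears.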
[jira] [Commented] (YARN-1398) Deadlock in capacity scheduler leaf queue and parent queue for getQueueInfo and completedContainer call
[ https://issues.apache.org/jira/browse/YARN-1398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13826421#comment-13826421 ]

Rohith Sharma K S commented on YARN-1398:
Hi Sunil, I think this is the same as https://issues.apache.org/jira/i#browse/YARN-325.

Deadlock in capacity scheduler leaf queue and parent queue for getQueueInfo and completedContainer call
Key: YARN-1398
URL: https://issues.apache.org/jira/browse/YARN-1398
Project: Hadoop YARN
Issue Type: Bug
Components: resourcemanager
Affects Versions: 2.2.0
Reporter: Sunil G
Priority: Critical

getQueueInfo in ParentQueue calls child.getQueueInfo(), which tries to acquire the leaf queue lock while holding the parent queue lock. If, at the same time, a completedContainer call has acquired the LeafQueue lock, it will wait on the ParentQueue's completedContainer call. These locks are not acquired in a consistent order, which can lead to deadlock. With JCarder, this shows up as a potential deadlock scenario.
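The report describes a classic lock-ordering inversion: one path takes parent then child, the other child then parent. The standard remedy is a single global acquisition order. The sketch below is a simplified model, not the CapacityScheduler's actual code; it shows both paths taking the locks parent-first, which makes concurrent callers deadlock-free.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Model of the YARN-1398 fix idea: every code path acquires parentLock
// before childLock. (The buggy pattern would be completedContainer taking
// childLock first, then parentLock — the opposite order to getQueueInfo.)
public class LockOrderDemo {
    static final Object parentLock = new Object();
    static final Object childLock = new Object();

    // Parent-first, then child — as getQueueInfo already does.
    static void getQueueInfo() {
        synchronized (parentLock) {
            synchronized (childLock) { /* read child queue info */ }
        }
    }

    // Fixed completedContainer: also parent-first, never child -> parent.
    static void completedContainer() {
        synchronized (parentLock) {
            synchronized (childLock) { /* release container bookkeeping */ }
        }
    }

    // Hammer both paths from two threads; a consistent order cannot deadlock.
    static boolean runConcurrently() throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        Future<?> a = pool.submit(() -> { for (int i = 0; i < 100_000; i++) getQueueInfo(); });
        Future<?> b = pool.submit(() -> { for (int i = 0; i < 100_000; i++) completedContainer(); });
        a.get(30, TimeUnit.SECONDS);
        b.get(30, TimeUnit.SECONDS);
        pool.shutdown();
        return true;
    }

    public static void main(String[] args) throws Exception {
        System.out.println("finished without deadlock: " + runConcurrently());
    }
}
```

If `completedContainer` instead synchronized on `childLock` first, the two loops could each hold one lock while waiting for the other, which is the cycle JCarder flags.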
[jira] [Commented] (YARN-584) In scheduler web UIs, queues unexpand on refresh
[ https://issues.apache.org/jira/browse/YARN-584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13826426#comment-13826426 ]

Hudson commented on YARN-584:
SUCCESS: Integrated in Hadoop-trunk-Commit #4759 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/4759/])
YARN-584. In scheduler web UIs, queues unexpand on refresh. (Harshit Daga via Sandy Ryza) (sandy: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1543350)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/CapacitySchedulerPage.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/FairSchedulerPage.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/SchedulerPageUtil.java

In scheduler web UIs, queues unexpand on refresh
Key: YARN-584
Fix For: 2.3.0
[jira] [Commented] (YARN-1332) In TestAMRMClient, replace assertTrue with assertEquals where possible
[ https://issues.apache.org/jira/browse/YARN-1332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13826442#comment-13826442 ]

Hadoop QA commented on YARN-1332:
{color:green}+1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12614153/YARN-1332-2.patch
against trunk revision .

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages.
{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client.
{color:green}+1 contrib tests{color}. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-YARN-Build/2483//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2483//console

This message is automatically generated.

In TestAMRMClient, replace assertTrue with assertEquals where possible
Key: YARN-1332
URL: https://issues.apache.org/jira/browse/YARN-1332
Project: Hadoop YARN
Issue Type: Improvement
Affects Versions: 2.2.0
Reporter: Sandy Ryza
Assignee: Sebastian Wong
Priority: Minor
Labels: newbie
Attachments: YARN-1332-2.patch, YARN-1332.patch

TestAMRMClient uses a lot of assertTrue(amClient.ask.size() == 0) where assertEquals(0, amClient.ask.size()) would make it easier to see why it's failing at a glance.
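The payoff of this change is the failure message: a message-less assertTrue yields exactly the uninformative "junit.framework.AssertionFailedError: null" seen in the YARN-1420 trace earlier in this digest, while assertEquals reports expected versus actual. The helpers below are a sketch that mimics junit.framework.Assert's message formats; they are not the real JUnit classes.

```java
import java.util.Arrays;
import java.util.List;

// Why assertEquals beats assertTrue for count checks: the failure message.
// These helpers imitate junit.framework.Assert's output; real tests would
// call the JUnit methods directly.
public class AssertStyleDemo {
    // assertTrue(boolean) knows nothing about the values involved,
    // so a failure carries no detail.
    static String failureFromAssertTrue(boolean condition) {
        return condition ? null : "AssertionFailedError: null";
    }

    // assertEquals(expected, actual) can report what it actually saw.
    static String failureFromAssertEquals(int expected, int actual) {
        return expected == actual
                ? null
                : "AssertionFailedError: expected:<" + expected + "> but was:<" + actual + ">";
    }

    public static void main(String[] args) {
        List<String> ask = Arrays.asList("req1", "req2", "req3");
        // Old style: assertTrue(ask.size() == 0) — no clue why it failed.
        System.out.println(failureFromAssertTrue(ask.size() == 0));
        // New style: assertEquals(0, ask.size()) — shows the actual size.
        System.out.println(failureFromAssertEquals(0, ask.size()));
    }
}
```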
[jira] [Commented] (YARN-1419) TestFifoScheduler.testAppAttemptMetrics fails intermittently under jdk7
[ https://issues.apache.org/jira/browse/YARN-1419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13826460#comment-13826460 ]

Hudson commented on YARN-1419:
SUCCESS: Integrated in Hadoop-Yarn-trunk #396 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/396/])
YARN-1419. TestFifoScheduler.testAppAttemptMetrics fails intermittently under jdk7. Contributed by Jonathan Eagles (jlowe: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1543117)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fifo/TestFifoScheduler.java
[jira] [Commented] (YARN-1210) During RM restart, RM should start a new attempt only when previous attempt exits for real
[ https://issues.apache.org/jira/browse/YARN-1210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13826464#comment-13826464 ]

Hudson commented on YARN-1210:
SUCCESS: Integrated in Hadoop-Yarn-trunk #396 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/396/])
YARN-1210. Changed RM to start new app-attempts on RM restart only after ensuring that previous AM exited or after expiry time. Contributed by Omkar Vinit Joshi. (vinodkv: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1543310)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/api/protocolrecords/NodeHeartbeatRequest.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/api/protocolrecords/RegisterNodeManagerRequest.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/api/protocolrecords/impl/pb/NodeHeartbeatRequestPBImpl.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/api/protocolrecords/impl/pb/RegisterNodeManagerRequestPBImpl.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/api/records/NodeStatus.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/api/records/impl/pb/NodeStatusPBImpl.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/proto/yarn_server_common_service_protos.proto
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/NodeStatusUpdater.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/NodeStatusUpdaterImpl.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/ContainerManagerImpl.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/container/ContainerImpl.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceTrackerService.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/RMAppImpl.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/RMAppAttemptImpl.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/RMAppAttemptState.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/MockNM.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestRMRestart.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/TestRMAppAttemptTransitions.java

During RM restart, RM should start a new attempt only when previous attempt exits for real
Key: YARN-1210
URL: https://issues.apache.org/jira/browse/YARN-1210
Project: Hadoop YARN
Issue Type: Sub-task
Reporter: Vinod Kumar Vavilapalli
Assignee: Omkar Vinit Joshi
Fix For: 2.3.0
Attachments: YARN-1210.1.patch, YARN-1210.2.patch, YARN-1210.3.patch, YARN-1210.4.patch, YARN-1210.4.patch, YARN-1210.5.patch, YARN-1210.6.patch, YARN-1210.7.patch

When RM recovers, it can wait for existing AMs to contact RM back and then kill them forcefully before even starting a new AM. Worst case, RM will start a new AppAttempt after waiting for 10 mins (the
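The restart behavior described in the commit message, starting a new attempt only once the previous AM is confirmed to have exited or an expiry interval has elapsed, is essentially a bounded wait on an exit signal. The sketch below models that gate with a CountDownLatch; the class and method names are illustrative, not the ResourceManager's real API.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Model of "start a new attempt only when the previous attempt exits for
// real, or after an expiry interval". Names are hypothetical stand-ins.
public class AttemptGateDemo {
    private final CountDownLatch previousAttemptExited = new CountDownLatch(1);

    // Called when a node reports that the old AM's container has finished.
    void onPreviousAttemptExit() {
        previousAttemptExited.countDown();
    }

    // Block until the old attempt is confirmed gone or the expiry elapses;
    // true means the exit was confirmed, false means we gave up waiting.
    boolean awaitBeforeNewAttempt(long expiryMillis) throws InterruptedException {
        return previousAttemptExited.await(expiryMillis, TimeUnit.MILLISECONDS);
    }

    public static void main(String[] args) throws Exception {
        AttemptGateDemo gate = new AttemptGateDemo();
        // Simulate the old AM exiting shortly after the RM restarts.
        new Thread(() -> {
            try { Thread.sleep(50); } catch (InterruptedException ignored) {}
            gate.onPreviousAttemptExit();
        }).start();
        boolean confirmed = gate.awaitBeforeNewAttempt(5_000);
        System.out.println(confirmed
                ? "previous attempt exit confirmed; starting new attempt"
                : "expiry elapsed; starting new attempt anyway");
    }
}
```

The worst case noted in the issue corresponds to the timeout branch: no confirmation ever arrives, so the new attempt starts only after the full expiry interval.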
[jira] [Commented] (YARN-709) verify that new jobs submitted with old RM delegation tokens after RM restart are accepted
[ https://issues.apache.org/jira/browse/YARN-709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13826463#comment-13826463 ]

Hudson commented on YARN-709:
SUCCESS: Integrated in Hadoop-Yarn-trunk #396 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/396/])
YARN-709. Added tests to verify validity of delegation tokens and logging of app summary after RM restart. Contributed by Jian He. (vinodkv: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1543269)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMAppManager.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/MockRM.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestRMRestart.java

verify that new jobs submitted with old RM delegation tokens after RM restart are accepted
Key: YARN-709
URL: https://issues.apache.org/jira/browse/YARN-709
Project: Hadoop YARN
Issue Type: Sub-task
Components: resourcemanager
Reporter: Jian He
Assignee: Jian He
Fix For: 2.3.0
Attachments: YARN-709.1.patch

More elaborate test for restoring RM delegation tokens on RM restart. New jobs with old RM delegation tokens should be accepted by the new RM as long as the token is still valid.
[jira] [Commented] (YARN-674) Slow or failing DelegationToken renewals on submission itself make RM unavailable
[ https://issues.apache.org/jira/browse/YARN-674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13826471#comment-13826471 ]

Hudson commented on YARN-674:
SUCCESS: Integrated in Hadoop-Yarn-trunk #396 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/396/])
YARN-674. Fixed ResourceManager to renew DelegationTokens on submission asynchronously to work around potential slowness in state-store. Contributed by Omkar Vinit Joshi. (vinodkv: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1543312)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ClientRMService.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMAppManager.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/security/DelegationTokenRenewer.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestAppManager.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestRMRestart.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/security/TestDelegationTokenRenewer.java

Slow or failing DelegationToken renewals on submission itself make RM unavailable
Key: YARN-674
URL: https://issues.apache.org/jira/browse/YARN-674
Project: Hadoop YARN
Issue Type: Sub-task
Components: resourcemanager
Reporter: Vinod Kumar Vavilapalli
Assignee: Omkar Vinit Joshi
Fix For: 2.3.0
Attachments: YARN-674.1.patch, YARN-674.10.patch, YARN-674.2.patch, YARN-674.3.patch, YARN-674.4.patch, YARN-674.5.patch, YARN-674.5.patch, YARN-674.6.patch, YARN-674.7.patch, YARN-674.8.patch, YARN-674.9.patch

This was caused by YARN-280. A slow or down NameNode will make it look like the RM is unavailable, as the RM may run out of RPC handlers due to blocked client submissions.
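The commit message summarizes the fix: renew tokens asynchronously so that a slow renewal cannot pin down RPC handler threads during submission. A minimal sketch of that pattern, using a dedicated executor, is below; the class, method names, and simulated delay are illustrative assumptions, not DelegationTokenRenewer's actual API.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Sketch of asynchronous token renewal: the submission path hands the
// (possibly slow) renewal off to a pool and returns immediately, so RPC
// handler threads are never blocked on a slow or down NameNode.
public class AsyncRenewDemo {
    private final ExecutorService renewerPool = Executors.newFixedThreadPool(4);

    // Simulates a renewal RPC that may block on a slow NameNode.
    static String renewToken(String token, long simulatedDelayMillis) {
        try { Thread.sleep(simulatedDelayMillis); } catch (InterruptedException ignored) {}
        return token + ":renewed";
    }

    // The submission path returns a Future at once; renewal runs in the pool.
    Future<String> submitApplication(String token) {
        return renewerPool.submit(() -> renewToken(token, 200));
    }

    void shutdown() {
        renewerPool.shutdown();
    }

    public static void main(String[] args) throws Exception {
        AsyncRenewDemo rm = new AsyncRenewDemo();
        long start = System.nanoTime();
        Future<String> renewal = rm.submitApplication("hdfs-token-1");
        long submitMicros = (System.nanoTime() - start) / 1_000;
        // Submission did not wait out the 200 ms the renewal itself takes.
        System.out.println("submit returned in ~" + submitMicros + " us");
        System.out.println(renewal.get());
        rm.shutdown();
    }
}
```

In the synchronous version, each blocked submission would hold an RPC handler for the full renewal time; with a bounded handler pool, enough slow renewals make the whole RM appear unavailable, which is the failure mode the issue describes.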
[jira] [Commented] (YARN-584) In scheduler web UIs, queues unexpand on refresh
[ https://issues.apache.org/jira/browse/YARN-584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13826489#comment-13826489 ]

Hudson commented on YARN-584:
FAILURE: Integrated in Hadoop-Mapreduce-trunk #1613 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1613/])
YARN-584. In scheduler web UIs, queues unexpand on refresh. (Harshit Daga via Sandy Ryza) (sandy: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1543350)
[jira] [Commented] (YARN-709) verify that new jobs submitted with old RM delegation tokens after RM restart are accepted
[ https://issues.apache.org/jira/browse/YARN-709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13826491#comment-13826491 ]

Hudson commented on YARN-709:
FAILURE: Integrated in Hadoop-Mapreduce-trunk #1613 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1613/])
YARN-709. Added tests to verify validity of delegation tokens and logging of app summary after RM restart. Contributed by Jian He. (vinodkv: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1543269)
[jira] [Commented] (YARN-1419) TestFifoScheduler.testAppAttemptMetrics fails intermittently under jdk7
[ https://issues.apache.org/jira/browse/YARN-1419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13826488#comment-13826488 ]

Hudson commented on YARN-1419:
FAILURE: Integrated in Hadoop-Mapreduce-trunk #1613 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1613/])
YARN-1419. TestFifoScheduler.testAppAttemptMetrics fails intermittently under jdk7. Contributed by Jonathan Eagles (jlowe: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1543117)
[jira] [Commented] (YARN-1210) During RM restart, RM should start a new attempt only when previous attempt exits for real
[ https://issues.apache.org/jira/browse/YARN-1210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13826492#comment-13826492 ] Hudson commented on YARN-1210: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #1613 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1613/]) YARN-1210. Changed RM to start new app-attempts on RM restart only after ensuring that previous AM exited or after expiry time. Contributed by Omkar Vinit Joshi. (vinodkv: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1543310) * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/api/protocolrecords/NodeHeartbeatRequest.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/api/protocolrecords/RegisterNodeManagerRequest.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/api/protocolrecords/impl/pb/NodeHeartbeatRequestPBImpl.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/api/protocolrecords/impl/pb/RegisterNodeManagerRequestPBImpl.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/api/records/NodeStatus.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/api/records/impl/pb/NodeStatusPBImpl.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/proto/yarn_server_common_service_protos.proto * 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/NodeStatusUpdater.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/NodeStatusUpdaterImpl.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/ContainerManagerImpl.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/container/ContainerImpl.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceTrackerService.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/RMAppImpl.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/RMAppAttemptImpl.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/RMAppAttemptState.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/MockNM.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestRMRestart.java * 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/TestRMAppAttemptTransitions.java During RM restart, RM should start a new attempt only when previous attempt exits for real -- Key: YARN-1210 URL: https://issues.apache.org/jira/browse/YARN-1210 Project: Hadoop YARN Issue Type: Sub-task Reporter: Vinod Kumar Vavilapalli Assignee: Omkar Vinit Joshi Fix For: 2.3.0 Attachments: YARN-1210.1.patch, YARN-1210.2.patch, YARN-1210.3.patch, YARN-1210.4.patch, YARN-1210.4.patch, YARN-1210.5.patch, YARN-1210.6.patch, YARN-1210.7.patch When RM recovers, it can wait for existing AMs to contact RM back and then kill them forcefully before even starting a new AM. Worst case, RM will start a new AppAttempt after waiting for
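The recovery policy this entry describes — start a new attempt only after the previous AM's exit is confirmed, or after an expiry window passes with no word from it — can be sketched as follows. This is an illustrative reduction, not YARN's actual RMAppAttemptImpl code; the class and method names are hypothetical.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Sketch of the YARN-1210 recovery policy (names are illustrative, not
// YARN's real classes): after restart, the RM launches a new app attempt
// only once the previous AM is confirmed gone, or after an expiry interval
// elapses -- whichever comes first.
public class AttemptRecoverySketch {
    private final CountDownLatch previousAmExited = new CountDownLatch(1);

    // Called when a node heartbeat reports the old AM container has finished.
    public void onPreviousAmExit() {
        previousAmExited.countDown();
    }

    // Returns true if the exit was confirmed, false if the expiry window ran
    // out first; in both cases it is now safe to start the new attempt.
    public boolean awaitExitedOrExpired(long expiryMillis) throws InterruptedException {
        return previousAmExited.await(expiryMillis, TimeUnit.MILLISECONDS);
    }

    public static void main(String[] args) throws InterruptedException {
        AttemptRecoverySketch app = new AttemptRecoverySketch();
        app.onPreviousAmExit();                  // AM reported dead right away
        assert app.awaitExitedOrExpired(1000);   // confirmed: returns immediately

        AttemptRecoverySketch silent = new AttemptRecoverySketch();
        assert !silent.awaitExitedOrExpired(50); // no report: expiry elapsed
        System.out.println("safe to start new attempt");
    }
}
```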
[jira] [Commented] (YARN-674) Slow or failing DelegationToken renewals on submission itself make RM unavailable
[ https://issues.apache.org/jira/browse/YARN-674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13826499#comment-13826499 ] Hudson commented on YARN-674: - FAILURE: Integrated in Hadoop-Mapreduce-trunk #1613 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1613/]) YARN-674. Fixed ResourceManager to renew DelegationTokens on submission asynchronously to work around potential slowness in state-store. Contributed by Omkar Vinit Joshi. (vinodkv: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1543312) * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ClientRMService.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMAppManager.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/security/DelegationTokenRenewer.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestAppManager.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestRMRestart.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/security/TestDelegationTokenRenewer.java Slow or failing DelegationToken renewals on submission itself make RM 
unavailable - Key: YARN-674 URL: https://issues.apache.org/jira/browse/YARN-674 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Reporter: Vinod Kumar Vavilapalli Assignee: Omkar Vinit Joshi Fix For: 2.3.0 Attachments: YARN-674.1.patch, YARN-674.10.patch, YARN-674.2.patch, YARN-674.3.patch, YARN-674.4.patch, YARN-674.5.patch, YARN-674.5.patch, YARN-674.6.patch, YARN-674.7.patch, YARN-674.8.patch, YARN-674.9.patch This was caused by YARN-280. A slow or down NameNode will make it look like the RM is unavailable, as it may run out of RPC handlers due to blocked client submissions. -- This message was sent by Atlassian JIRA (v6.1#6144)
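The fix described above — renewing tokens asynchronously so RPC handlers aren't tied up by a slow NameNode — boils down to handing the renewal to a separate executor at submission time. A minimal sketch, with hypothetical names standing in for DelegationTokenRenewer's actual API:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch of the YARN-674 approach: instead of renewing the
// delegation token inline on the RPC handler thread (which blocks when the
// NameNode is slow), submission queues the renewal on a separate executor
// and returns immediately. Names are illustrative, not YARN's actual API.
public class AsyncRenewalSketch {
    private final ExecutorService renewerPool = Executors.newFixedThreadPool(2);

    // Simulates a token renewal that may stall on a slow NameNode.
    static void renewToken(long delayMillis) throws InterruptedException {
        Thread.sleep(delayMillis);
    }

    // Returns as soon as the renewal is queued; the RPC handler thread is free.
    public Future<?> submitApplication(long renewalDelayMillis) {
        return renewerPool.submit(() -> {
            try {
                renewToken(renewalDelayMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
    }

    public void shutdown() { renewerPool.shutdown(); }

    public static void main(String[] args) throws Exception {
        AsyncRenewalSketch rm = new AsyncRenewalSketch();
        long start = System.nanoTime();
        Future<?> renewal = rm.submitApplication(200);   // a "slow" renewal
        long submitMicros = (System.nanoTime() - start) / 1000;
        // Submission returned long before the 200 ms renewal finished.
        assert submitMicros < 100_000 : "submission blocked on renewal";
        renewal.get();   // the renewal still completes in the background
        rm.shutdown();
        System.out.println("submitted in " + submitMicros + " us");
    }
}
```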
[jira] [Commented] (YARN-709) verify that new jobs submitted with old RM delegation tokens after RM restart are accepted
[ https://issues.apache.org/jira/browse/YARN-709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13826504#comment-13826504 ] Hudson commented on YARN-709: - SUCCESS: Integrated in Hadoop-Hdfs-trunk #1587 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1587/]) YARN-709. Added tests to verify validity of delegation tokens and logging of appsummary after RM restart. Contributed by Jian He. (vinodkv: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1543269) * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMAppManager.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/MockRM.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestRMRestart.java verify that new jobs submitted with old RM delegation tokens after RM restart are accepted -- Key: YARN-709 URL: https://issues.apache.org/jira/browse/YARN-709 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Reporter: Jian He Assignee: Jian He Fix For: 2.3.0 Attachments: YARN-709.1.patch More elaborate test for restoring RM delegation tokens on RM restart. New jobs with old RM delegation tokens should be accepted by new RM as long as the token is still valid -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (YARN-1210) During RM restart, RM should start a new attempt only when previous attempt exits for real
[ https://issues.apache.org/jira/browse/YARN-1210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13826505#comment-13826505 ] Hudson commented on YARN-1210: -- SUCCESS: Integrated in Hadoop-Hdfs-trunk #1587 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1587/]) YARN-1210. Changed RM to start new app-attempts on RM restart only after ensuring that previous AM exited or after expiry time. Contributed by Omkar Vinit Joshi. (vinodkv: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1543310) * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/api/protocolrecords/NodeHeartbeatRequest.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/api/protocolrecords/RegisterNodeManagerRequest.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/api/protocolrecords/impl/pb/NodeHeartbeatRequestPBImpl.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/api/protocolrecords/impl/pb/RegisterNodeManagerRequestPBImpl.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/api/records/NodeStatus.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/api/records/impl/pb/NodeStatusPBImpl.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/proto/yarn_server_common_service_protos.proto * 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/NodeStatusUpdater.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/NodeStatusUpdaterImpl.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/ContainerManagerImpl.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/container/ContainerImpl.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceTrackerService.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/RMAppImpl.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/RMAppAttemptImpl.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/RMAppAttemptState.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/MockNM.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestRMRestart.java * 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/TestRMAppAttemptTransitions.java During RM restart, RM should start a new attempt only when previous attempt exits for real -- Key: YARN-1210 URL: https://issues.apache.org/jira/browse/YARN-1210 Project: Hadoop YARN Issue Type: Sub-task Reporter: Vinod Kumar Vavilapalli Assignee: Omkar Vinit Joshi Fix For: 2.3.0 Attachments: YARN-1210.1.patch, YARN-1210.2.patch, YARN-1210.3.patch, YARN-1210.4.patch, YARN-1210.4.patch, YARN-1210.5.patch, YARN-1210.6.patch, YARN-1210.7.patch When RM recovers, it can wait for existing AMs to contact RM back and then kill them forcefully before even starting a new AM. Worst case, RM will start a new AppAttempt after waiting for 10 mins (
[jira] [Commented] (YARN-1419) TestFifoScheduler.testAppAttemptMetrics fails intermittently under jdk7
[ https://issues.apache.org/jira/browse/YARN-1419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13826501#comment-13826501 ] Hudson commented on YARN-1419: -- SUCCESS: Integrated in Hadoop-Hdfs-trunk #1587 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1587/]) YARN-1419. TestFifoScheduler.testAppAttemptMetrics fails intermittently under jdk7. Contributed by Jonathan Eagles (jlowe: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1543117) * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fifo/TestFifoScheduler.java TestFifoScheduler.testAppAttemptMetrics fails intermittently under jdk7 Key: YARN-1419 URL: https://issues.apache.org/jira/browse/YARN-1419 Project: Hadoop YARN Issue Type: Bug Components: scheduler Affects Versions: 3.0.0, 2.3.0, 0.23.10 Reporter: Jonathan Eagles Assignee: Jonathan Eagles Priority: Minor Labels: java7 Fix For: 3.0.0, 2.3.0, 0.23.10 Attachments: YARN-1419.patch, YARN-1419.patch QueueMetrics holds its data in a static variable, causing metrics to bleed over from test to test. clearQueueMetrics is to be called for tests that need to measure metrics correctly for a single test. jdk7 comes into play since tests are run out of order, which in this case makes the metrics unreliable. -- This message was sent by Atlassian JIRA (v6.1#6144)
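The failure mode the description names — metrics kept in static state surviving from one test to the next, exposed once jdk7 stopped running test methods in source order — can be reproduced in miniature. The names below are stand-ins, not the real QueueMetrics API:

```java
import java.util.HashMap;
import java.util.Map;

// Illustration of the YARN-1419 failure mode: metrics kept in static state
// survive across tests in the same JVM, so a test's assertions depend on
// which tests ran before it. Names here are hypothetical stand-ins.
public class StaticMetricsSketch {
    // Static registry shared by every "test" in the JVM -- the root cause.
    static final Map<String, Integer> appsSubmitted = new HashMap<>();

    static void submitApp(String queue) {
        appsSubmitted.merge(queue, 1, Integer::sum);
    }

    // The fix's shape: tests that assert on metrics clear the static state
    // first (analogous to calling QueueMetrics.clearQueueMetrics() in setup).
    static void clearMetrics() {
        appsSubmitted.clear();
    }

    public static void main(String[] args) {
        submitApp("default");   // imagine an earlier test in the run did this
        clearMetrics();         // without this line...
        submitApp("default");
        // ...this assertion would see 2, not 1, when run after another test.
        assert appsSubmitted.get("default") == 1;
        System.out.println("default queue apps: " + appsSubmitted.get("default"));
    }
}
```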
[jira] [Commented] (YARN-674) Slow or failing DelegationToken renewals on submission itself make RM unavailable
[ https://issues.apache.org/jira/browse/YARN-674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13826512#comment-13826512 ] Hudson commented on YARN-674: - SUCCESS: Integrated in Hadoop-Hdfs-trunk #1587 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1587/]) YARN-674. Fixed ResourceManager to renew DelegationTokens on submission asynchronously to work around potential slowness in state-store. Contributed by Omkar Vinit Joshi. (vinodkv: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1543312) * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ClientRMService.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMAppManager.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/security/DelegationTokenRenewer.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestAppManager.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestRMRestart.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/security/TestDelegationTokenRenewer.java Slow or failing DelegationToken renewals on submission itself make RM unavailable 
- Key: YARN-674 URL: https://issues.apache.org/jira/browse/YARN-674 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Reporter: Vinod Kumar Vavilapalli Assignee: Omkar Vinit Joshi Fix For: 2.3.0 Attachments: YARN-674.1.patch, YARN-674.10.patch, YARN-674.2.patch, YARN-674.3.patch, YARN-674.4.patch, YARN-674.5.patch, YARN-674.5.patch, YARN-674.6.patch, YARN-674.7.patch, YARN-674.8.patch, YARN-674.9.patch This was caused by YARN-280. A slow or down NameNode will make it look like the RM is unavailable, as it may run out of RPC handlers due to blocked client submissions. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (YARN-584) In scheduler web UIs, queues unexpand on refresh
[ https://issues.apache.org/jira/browse/YARN-584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13826502#comment-13826502 ] Hudson commented on YARN-584: - SUCCESS: Integrated in Hadoop-Hdfs-trunk #1587 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1587/]) YARN-584. In scheduler web UIs, queues unexpand on refresh. (Harshit Daga via Sandy Ryza) (sandy: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1543350) * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/CapacitySchedulerPage.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/FairSchedulerPage.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/SchedulerPageUtil.java In scheduler web UIs, queues unexpand on refresh Key: YARN-584 URL: https://issues.apache.org/jira/browse/YARN-584 Project: Hadoop YARN Issue Type: Bug Components: scheduler Affects Versions: 2.0.3-alpha Reporter: Sandy Ryza Assignee: Harshit Daga Labels: newbie Fix For: 2.3.0 Attachments: YARN-584-branch-2.2.0.patch, YARN-584-branch-2.2.0.patch, YARN-584-branch-2.2.0.patch, YARN-584-branch-2.2.0.patch, YARN-584-branch-2.2.0.patch In the fair scheduler web UI, you can expand queue information. Refreshing the page causes the expansions to go away, which is annoying for someone who wants to monitor the scheduler page and needs to reopen all the queues they care about each time. -- This message was sent by Atlassian JIRA (v6.1#6144)
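One plausible shape for this fix (the committed patch does it client-side via the JavaScript that SchedulerPageUtil generates; this Java sketch only illustrates the round-trip logic, with hypothetical names): serialize the set of expanded queues into a cookie value when a queue is toggled, and restore that set when the page reloads.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

// Sketch of persisting queue-expansion state across a refresh (YARN-584).
// Not the actual SchedulerPageUtil implementation; names are illustrative.
public class ExpansionStateSketch {
    // Encode the expanded queues as a single cookie value, e.g. "root:root.a".
    static String toCookieValue(Set<String> expandedQueues) {
        return String.join(":", new TreeSet<>(expandedQueues));
    }

    // Decode the cookie back into the set of queues to re-expand on load.
    static Set<String> fromCookieValue(String cookieValue) {
        if (cookieValue == null || cookieValue.isEmpty()) {
            return new TreeSet<>();
        }
        return new TreeSet<>(Arrays.asList(cookieValue.split(":")));
    }

    public static void main(String[] args) {
        Set<String> expanded = new TreeSet<>(Arrays.asList("root", "root.a"));
        String cookie = toCookieValue(expanded);
        // The round trip preserves exactly the queues the user had open.
        assert fromCookieValue(cookie).equals(expanded);
        System.out.println("cookie value: " + cookie);
    }
}
```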
[jira] [Commented] (YARN-1307) Rethink znode structure for RM HA
[ https://issues.apache.org/jira/browse/YARN-1307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13826595#comment-13826595 ] Tsuyoshi OZAWA commented on YARN-1307: -- [~jianhe] and [~bikassaha], could you review the latest patch? Rethink znode structure for RM HA - Key: YARN-1307 URL: https://issues.apache.org/jira/browse/YARN-1307 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Reporter: Tsuyoshi OZAWA Assignee: Tsuyoshi OZAWA Attachments: YARN-1307.1.patch, YARN-1307.2.patch, YARN-1307.3.patch, YARN-1307.4-2.patch, YARN-1307.4-3.patch, YARN-1307.4.patch, YARN-1307.5.patch, YARN-1307.6.patch Rethinking the znode structure for RM HA has been proposed in several JIRAs (YARN-659, YARN-1222). The motivation of this JIRA is quoted from Bikas' comment in YARN-1222: {quote} We should move to creating a node hierarchy for apps such that all znodes for an app are stored under an app znode instead of the app root znode. This will help in removeApplication and also in scaling better on ZK. The earlier code was written this way to ensure create/delete happens under a root znode for fencing. But given that we have moved to multi-operations globally, this isn't required anymore. {quote} -- This message was sent by Atlassian JIRA (v6.1#6144)
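The layout change Bikas describes can be illustrated with plain path strings (the paths and names here are illustrative, not the state store's real znode names): with a flat layout every attempt znode is a sibling under the store root, so removing an application means enumerating and matching siblings; with a hierarchy, all of an app's znodes sit under one app znode and removeApplication is a single subtree delete.

```java
// Sketch of the znode layout change discussed in YARN-1307 (illustrative
// paths, not the real state-store names).
public class ZnodeLayoutSketch {
    static final String ROOT = "/rmstore";

    // Flat layout (old): /rmstore/appattempt_<app>_<n>
    static String flatAttemptPath(String appId, int attempt) {
        return ROOT + "/appattempt_" + appId + "_" + attempt;
    }

    // Hierarchical layout (proposed): /rmstore/application_<app>/appattempt_...
    static String appPath(String appId) {
        return ROOT + "/application_" + appId;
    }

    static String hierarchicalAttemptPath(String appId, int attempt) {
        return appPath(appId) + "/appattempt_" + appId + "_" + attempt;
    }

    public static void main(String[] args) {
        String app = "1_0001";
        // In the hierarchy, the attempt znode is inside the app's own subtree,
        // so recursively deleting appPath(app) removes everything for the app.
        assert hierarchicalAttemptPath(app, 1).startsWith(appPath(app) + "/");
        // In the flat layout it is a sibling of every other app's znodes.
        assert !flatAttemptPath(app, 1).startsWith(appPath(app) + "/");
        System.out.println(hierarchicalAttemptPath(app, 1));
    }
}
```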
[jira] [Commented] (YARN-786) Expose application resource usage in RM REST API
[ https://issues.apache.org/jira/browse/YARN-786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13826703#comment-13826703 ] Sandy Ryza commented on YARN-786: - I'll look into this today. Expose application resource usage in RM REST API Key: YARN-786 URL: https://issues.apache.org/jira/browse/YARN-786 Project: Hadoop YARN Issue Type: Improvement Affects Versions: 2.0.4-alpha Reporter: Sandy Ryza Assignee: Sandy Ryza Fix For: 2.3.0 Attachments: YARN-786-1.patch, YARN-786-2.patch, YARN-786.patch It might be good to require users to explicitly ask for this information, as it's a little more expensive to collect than the other fields in AppInfo. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (YARN-1266) inheriting Application client and History Protocol from base protocol and implement PB service and clients.
[ https://issues.apache.org/jira/browse/YARN-1266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13826759#comment-13826759 ] Vinod Kumar Vavilapalli commented on YARN-1266: --- Adding to Zhijie's comments: - I hope this is a compatible change - moving methods to a super-interface. Can you please check? - Mark application_base_protocol.proto and ApplicationBaseProtocol as some kind of internal interfaces not to be used directly? Maybe by making them only package-private and protected or something. Along with that, we'll need to put correct audience and visibility annotations. - Similarly for the client and service wrappers. inheriting Application client and History Protocol from base protocol and implement PB service and clients. --- Key: YARN-1266 URL: https://issues.apache.org/jira/browse/YARN-1266 Project: Hadoop YARN Issue Type: Sub-task Reporter: Mayank Bansal Assignee: Mayank Bansal Attachments: YARN-1266-1.patch, YARN-1266-2.patch, YARN-1266-3.patch, YARN-1266-4.patch Adding ApplicationHistoryProtocolPBService to make web apps work, and changing yarn to run AHS as a separate process -- This message was sent by Atlassian JIRA (v6.1#6144)
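On the compatibility question raised here: moving a method from a child interface up to a new super-interface keeps existing callers of the child interface working, since the method still resolves through inheritance. A minimal sketch with simplified stand-in names (not the real protocol signatures):

```java
// Compatibility sketch for YARN-1266: a method moved to a new super-interface
// is still visible through the child interface, so existing callers compile
// and run unchanged. Interface and method names are simplified stand-ins.
public class SuperInterfaceSketch {
    interface ApplicationBaseProtocol {        // new shared parent
        String getApplicationReport(String appId);
    }

    interface ApplicationClientProtocol extends ApplicationBaseProtocol { }

    static class Client implements ApplicationClientProtocol {
        public String getApplicationReport(String appId) {
            return "report:" + appId;
        }
    }

    public static void main(String[] args) {
        // Code typed against the child interface keeps working even though
        // the method now lives on the super-interface.
        ApplicationClientProtocol c = new Client();
        assert c.getApplicationReport("app_1").equals("report:app_1");
        // New code can also share logic across protocols via the parent type.
        ApplicationBaseProtocol base = c;
        assert base.getApplicationReport("x").equals("report:x");
        System.out.println(c.getApplicationReport("app_1"));
    }
}
```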
[jira] [Updated] (YARN-1242) AHS start as independent process
[ https://issues.apache.org/jira/browse/YARN-1242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mayank Bansal updated YARN-1242: Attachment: YARN-1242-3.patch Updating latest patch Thanks, Mayank AHS start as independent process Key: YARN-1242 URL: https://issues.apache.org/jira/browse/YARN-1242 Project: Hadoop YARN Issue Type: Sub-task Reporter: Zhijie Shen Assignee: Mayank Bansal Attachments: YARN-1242-1.patch, YARN-1242-2.patch, YARN-1242-3.patch Add the command in yarn and yarn.cmd to start and stop AHS -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (YARN-1242) AHS start as independent process
[ https://issues.apache.org/jira/browse/YARN-1242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13826816#comment-13826816 ] Mayank Bansal commented on YARN-1242: - Thanks [~zjshen] for the review bq. 1. Do not depend on RM's log4j Done bq. 2. YARN_HISTORYSERVER_HEAPSIZE should be commented in yarn-env as well Done bq. 3. hadoop-yarn-dist.xml needs to be updated as well. Would you please double check the complete project to see whether there's some other stuff missing for creating a correct distribution? Done bq. 4. Would you please verify starting AHS locally, in particular starting AHS on the machine that RM is not there (including start it as daemon)? Then, you can verify whether AHS is completely independent of RM. Done Thanks, Mayank AHS start as independent process Key: YARN-1242 URL: https://issues.apache.org/jira/browse/YARN-1242 Project: Hadoop YARN Issue Type: Sub-task Reporter: Zhijie Shen Assignee: Mayank Bansal Attachments: YARN-1242-1.patch, YARN-1242-2.patch, YARN-1242-3.patch Add the command in yarn and yarn.cmd to start and stop AHS -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (YARN-1242) Script changes to start AHS as an individual process
[ https://issues.apache.org/jira/browse/YARN-1242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli updated YARN-1242: -- Summary: Script changes to start AHS as an individual process (was: AHS start as independent process) Script changes to start AHS as an individual process Key: YARN-1242 URL: https://issues.apache.org/jira/browse/YARN-1242 Project: Hadoop YARN Issue Type: Sub-task Reporter: Zhijie Shen Assignee: Mayank Bansal Attachments: YARN-1242-1.patch, YARN-1242-2.patch, YARN-1242-3.patch Add the command in yarn and yarn.cmd to start and stop AHS -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (YARN-786) Expose application resource usage in RM REST API
[ https://issues.apache.org/jira/browse/YARN-786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sandy Ryza updated YARN-786: Attachment: YARN-786-addendum.patch Expose application resource usage in RM REST API Key: YARN-786 URL: https://issues.apache.org/jira/browse/YARN-786 Project: Hadoop YARN Issue Type: Improvement Affects Versions: 2.0.4-alpha Reporter: Sandy Ryza Assignee: Sandy Ryza Fix For: 2.3.0 Attachments: YARN-786-1.patch, YARN-786-2.patch, YARN-786-addendum.patch, YARN-786.patch It might be good to require users to explicitly ask for this information, as it's a little more expensive to collect than the other fields in AppInfo. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (YARN-1242) Script changes to start AHS as an individual process
[ https://issues.apache.org/jira/browse/YARN-1242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13826897#comment-13826897 ] Hadoop QA commented on YARN-1242: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12614673/YARN-1242-3.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-assemblies. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/2484//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2484//console This message is automatically generated. Script changes to start AHS as an individual process Key: YARN-1242 URL: https://issues.apache.org/jira/browse/YARN-1242 Project: Hadoop YARN Issue Type: Sub-task Reporter: Zhijie Shen Assignee: Mayank Bansal Attachments: YARN-1242-1.patch, YARN-1242-2.patch, YARN-1242-3.patch Add the command in yarn and yarn.cmd to start and stop AHS -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (YARN-786) Expose application resource usage in RM REST API
[ https://issues.apache.org/jira/browse/YARN-786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13826904#comment-13826904 ] Hadoop QA commented on YARN-786: {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12614680/YARN-786-addendum.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/2485//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2485//console This message is automatically generated. 
Expose application resource usage in RM REST API Key: YARN-786 URL: https://issues.apache.org/jira/browse/YARN-786 Project: Hadoop YARN Issue Type: Improvement Affects Versions: 2.0.4-alpha Reporter: Sandy Ryza Assignee: Sandy Ryza Fix For: 2.3.0 Attachments: YARN-786-1.patch, YARN-786-2.patch, YARN-786-addendum.patch, YARN-786.patch It might be good to require users to explicitly ask for this information, as it's a little more expensive to collect than the other fields in AppInfo. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (YARN-786) Expose application resource usage in RM REST API
[ https://issues.apache.org/jira/browse/YARN-786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13826914#comment-13826914 ] Jason Lowe commented on YARN-786: - +1 for the addendum patch, looks pretty good. Couple of nits but not must-fix before commit: - I'd expect the lack of a scheduler app report to be a relatively common case, so it would be nice to have a pre-built zero resource usage report similar to the DUMMY_APPLICATION_RESOURCE_USAGE_REPORT used by RMAppImpl when someone doesn't have access. - It would be nice to have a test case. There was a similar check-for-null-report testcase in TestRMAppTransitions#testGetAppReport, but it only tested an app in the NEW state and didn't catch this. Expose application resource usage in RM REST API Key: YARN-786 URL: https://issues.apache.org/jira/browse/YARN-786 Project: Hadoop YARN Issue Type: Improvement Affects Versions: 2.0.4-alpha Reporter: Sandy Ryza Assignee: Sandy Ryza Fix For: 2.3.0 Attachments: YARN-786-1.patch, YARN-786-2.patch, YARN-786-addendum.patch, YARN-786.patch It might be good to require users to explicitly ask for this information, as it's a little more expensive to collect than the other fields in AppInfo. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (YARN-1424) RMAppAttemptImpl should have a dummy ApplicationResourceUsageReport to return when
Sandy Ryza created YARN-1424: Summary: RMAppAttemptImpl should have a dummy ApplicationResourceUsageReport to return when Key: YARN-1424 URL: https://issues.apache.org/jira/browse/YARN-1424 Project: Hadoop YARN Issue Type: Improvement Components: resourcemanager Reporter: Sandy Ryza RMAppImpl has a DUMMY_APPLICATION_RESOURCE_USAGE_REPORT to return when the caller of createAndGetApplicationReport doesn't have access. RMAppAttemptImpl should have something similar for getApplicationResourceUsageReport. It also might make sense to put the dummy report into ApplicationResourceUsageReport and allow both to use it. A test would also be useful to verify that RMAppAttemptImpl#getApplicationResourceUsageReport doesn't return null if the scheduler doesn't have a report to return. -- This message was sent by Atlassian JIRA (v6.1#6144)
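The dummy-report idea described in this issue is the null-object pattern: return one shared zero-usage report instead of null when the scheduler has nothing for an attempt, so callers (such as REST serialization) never dereference null. A sketch with hypothetical field and method names, not the real ApplicationResourceUsageReport:

```java
// Null-object sketch for YARN-1424 (names are illustrative): mirror
// RMAppImpl's DUMMY_APPLICATION_RESOURCE_USAGE_REPORT by returning a single
// shared all-zero report when the scheduler has no report for an attempt.
public class UsageReportSketch {
    static final class UsageReport {
        final int usedMemoryMB;
        final int usedVcores;
        UsageReport(int mem, int vcores) { usedMemoryMB = mem; usedVcores = vcores; }
    }

    // One shared zero-usage instance -- the "dummy" report.
    static final UsageReport DUMMY_REPORT = new UsageReport(0, 0);

    // Stand-in for the scheduler lookup, which may have nothing for the
    // attempt (e.g. the app never got past the NEW state).
    static UsageReport schedulerReportFor(String attemptId) {
        return null;
    }

    // Callers never see null, so downstream serialization cannot NPE.
    static UsageReport getApplicationResourceUsageReport(String attemptId) {
        UsageReport r = schedulerReportFor(attemptId);
        return (r != null) ? r : DUMMY_REPORT;
    }

    public static void main(String[] args) {
        UsageReport r = getApplicationResourceUsageReport("attempt_1");
        assert r != null && r.usedMemoryMB == 0;
        System.out.println("memory MB: " + r.usedMemoryMB);
    }
}
```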
[jira] [Commented] (YARN-1307) Rethink znode structure for RM HA
[ https://issues.apache.org/jira/browse/YARN-1307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13826925#comment-13826925 ] Jian He commented on YARN-1307: --- Sorry for the late response. Patch looks good. One minor thing: since we decided to take care of the version stuff in YARN-1239, we can remove the things related to it. Rethink znode structure for RM HA - Key: YARN-1307 URL: https://issues.apache.org/jira/browse/YARN-1307 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Reporter: Tsuyoshi OZAWA Assignee: Tsuyoshi OZAWA Attachments: YARN-1307.1.patch, YARN-1307.2.patch, YARN-1307.3.patch, YARN-1307.4-2.patch, YARN-1307.4-3.patch, YARN-1307.4.patch, YARN-1307.5.patch, YARN-1307.6.patch Rethinking the znode structure for RM HA has been proposed in some JIRAs (YARN-659, YARN-1222). The motivation of this JIRA is quoted from Bikas' comment in YARN-1222: {quote} We should move to creating a node hierarchy for apps such that all znodes for an app are stored under an app znode instead of the app root znode. This will help in removeApplication and also in scaling better on ZK. The earlier code was written this way to ensure create/delete happens under a root znode for fencing. But given that we have moved to multi-operations globally, this isn't required anymore. {quote} -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (YARN-786) Expose application resource usage in RM REST API
[ https://issues.apache.org/jira/browse/YARN-786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13826927#comment-13826927 ] Sandy Ryza commented on YARN-786: - Thanks Jason. Committing this now and filed YARN-1424 for these additional changes. Expose application resource usage in RM REST API Key: YARN-786 URL: https://issues.apache.org/jira/browse/YARN-786 Project: Hadoop YARN Issue Type: Improvement Affects Versions: 2.0.4-alpha Reporter: Sandy Ryza Assignee: Sandy Ryza Fix For: 2.3.0 Attachments: YARN-786-1.patch, YARN-786-2.patch, YARN-786-addendum.patch, YARN-786.patch It might be good to require users to explicitly ask for this information, as it's a little more expensive to collect than the other fields in AppInfo. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (YARN-1424) RMAppAttemptImpl should precompute a zeroed ApplicationResourceUsageReport to return when attempt not active
[ https://issues.apache.org/jira/browse/YARN-1424?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated YARN-1424: - Priority: Minor (was: Major) Summary: RMAppAttemptImpl should precompute a zeroed ApplicationResourceUsageReport to return when attempt not active (was: RMAppAttemptImpl should have a dummy ApplicationResourceUsageReport to return when ) RMAppAttemptImpl should precompute a zeroed ApplicationResourceUsageReport to return when attempt not active Key: YARN-1424 URL: https://issues.apache.org/jira/browse/YARN-1424 Project: Hadoop YARN Issue Type: Improvement Components: resourcemanager Reporter: Sandy Ryza Priority: Minor Labels: newbie RMAppImpl has a DUMMY_APPLICATION_RESOURCE_USAGE_REPORT to return when the caller of createAndGetApplicationReport doesn't have access. RMAppAttemptImpl should have something similar for getApplicationResourceUsageReport. It also might make sense to put the dummy report into ApplicationResourceUsageReport and allow both to use it. A test would also be useful to verify that RMAppAttemptImpl#getApplicationResourceUsageReport doesn't return null if the scheduler doesn't have a report to return. -- This message was sent by Atlassian JIRA (v6.1#6144)
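The precompute-a-shared-zeroed-report pattern this ticket asks for can be sketched in isolation. The classes and fields below are illustrative stand-ins, not YARN's actual ApplicationResourceUsageReport or RMAppAttemptImpl API:

```java
public class ZeroReportDemo {
    /** Stand-in for YARN's ApplicationResourceUsageReport; fields are illustrative. */
    static final class ResourceUsageReport {
        final int numUsedContainers;
        final long usedMemoryMB;

        ResourceUsageReport(int numUsedContainers, long usedMemoryMB) {
            this.numUsedContainers = numUsedContainers;
            this.usedMemoryMB = usedMemoryMB;
        }
    }

    /** Stand-in for an app attempt that may or may not have a scheduler report. */
    static final class AppAttempt {
        // Built once and shared by every caller: no per-call allocation, and
        // callers never see null when the scheduler has nothing to report.
        static final ResourceUsageReport ZEROED_REPORT = new ResourceUsageReport(0, 0);

        private ResourceUsageReport schedulerReport; // null while the attempt is not active

        ResourceUsageReport getResourceUsageReport() {
            return schedulerReport != null ? schedulerReport : ZEROED_REPORT;
        }
    }

    public static void main(String[] args) {
        AppAttempt attempt = new AppAttempt();
        // The scheduler has no report yet, but we still get a usable zeroed report.
        System.out.println(attempt.getResourceUsageReport().numUsedContainers); // 0
    }
}
```

Sharing one immutable instance is safe because the zeroed report never changes; this mirrors the DUMMY_APPLICATION_RESOURCE_USAGE_REPORT idea cited from RMAppImpl.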
[jira] [Commented] (YARN-744) Race condition in ApplicationMasterService.allocate .. It might process same allocate request twice resulting in additional containers getting allocated.
[ https://issues.apache.org/jira/browse/YARN-744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13826974#comment-13826974 ] Omkar Vinit Joshi commented on YARN-744: Thanks [~bikassaha] addressed your comments. Attaching a new patch. Race condition in ApplicationMasterService.allocate .. It might process same allocate request twice resulting in additional containers getting allocated. - Key: YARN-744 URL: https://issues.apache.org/jira/browse/YARN-744 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Reporter: Bikas Saha Assignee: Omkar Vinit Joshi Priority: Minor Attachments: MAPREDUCE-3899-branch-0.23.patch, YARN-744-20130711.1.patch, YARN-744-20130715.1.patch, YARN-744-20130726.1.patch, YARN-744.1.patch, YARN-744.2.patch, YARN-744.patch Looks like the lock taken in this is broken. It takes a lock on lastResponse object and then puts a new lastResponse object into the map. At this point a new thread entering this function will get a new lastResponse object and will be able to take its lock and enter the critical section. Presumably we want to limit one response per app attempt. So the lock could be taken on the ApplicationAttemptId key of the response map object. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (YARN-744) Race condition in ApplicationMasterService.allocate .. It might process same allocate request twice resulting in additional containers getting allocated.
[ https://issues.apache.org/jira/browse/YARN-744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Omkar Vinit Joshi updated YARN-744: --- Attachment: YARN-744.2.patch Race condition in ApplicationMasterService.allocate .. It might process same allocate request twice resulting in additional containers getting allocated. - Key: YARN-744 URL: https://issues.apache.org/jira/browse/YARN-744 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Reporter: Bikas Saha Assignee: Omkar Vinit Joshi Priority: Minor Attachments: MAPREDUCE-3899-branch-0.23.patch, YARN-744-20130711.1.patch, YARN-744-20130715.1.patch, YARN-744-20130726.1.patch, YARN-744.1.patch, YARN-744.2.patch, YARN-744.patch Looks like the lock taken in this is broken. It takes a lock on lastResponse object and then puts a new lastResponse object into the map. At this point a new thread entering this function will get a new lastResponse object and will be able to take its lock and enter the critical section. Presumably we want to limit one response per app attempt. So the lock could be taken on the ApplicationAttemptId key of the response map object. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (YARN-1053) Diagnostic message from ContainerExitEvent is ignored in ContainerImpl
[ https://issues.apache.org/jira/browse/YARN-1053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Omkar Vinit Joshi updated YARN-1053: Affects Version/s: 2.2.1 2.2.0 Diagnostic message from ContainerExitEvent is ignored in ContainerImpl -- Key: YARN-1053 URL: https://issues.apache.org/jira/browse/YARN-1053 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.2.0, 2.2.1 Reporter: Omkar Vinit Joshi Assignee: Omkar Vinit Joshi Priority: Blocker Labels: newbie Fix For: 2.3.0, 2.2.1 Attachments: YARN-1053.20130809.patch If the container launch fails then we send ContainerExitEvent. This event contains exitCode and diagnostic message. Today we are ignoring diagnostic message while handling this event inside ContainerImpl. Fixing it as it is useful in diagnosing the failure. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (YARN-744) Race condition in ApplicationMasterService.allocate .. It might process same allocate request twice resulting in additional containers getting allocated.
[ https://issues.apache.org/jira/browse/YARN-744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13826997#comment-13826997 ] Hadoop QA commented on YARN-744: {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12614693/YARN-744.2.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/2486//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2486//console This message is automatically generated. Race condition in ApplicationMasterService.allocate .. It might process same allocate request twice resulting in additional containers getting allocated. 
- Key: YARN-744 URL: https://issues.apache.org/jira/browse/YARN-744 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Reporter: Bikas Saha Assignee: Omkar Vinit Joshi Priority: Minor Attachments: MAPREDUCE-3899-branch-0.23.patch, YARN-744-20130711.1.patch, YARN-744-20130715.1.patch, YARN-744-20130726.1.patch, YARN-744.1.patch, YARN-744.2.patch, YARN-744.patch Looks like the lock taken in this is broken. It takes a lock on lastResponse object and then puts a new lastResponse object into the map. At this point a new thread entering this function will get a new lastResponse object and will be able to take its lock and enter the critical section. Presumably we want to limit one response per app attempt. So the lock could be taken on the ApplicationAttemptId key of the response map object. -- This message was sent by Atlassian JIRA (v6.1#6144)
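The locking flaw described here — synchronizing on a map value that is then replaced, so the next caller locks a different object — can be avoided by locking on something stable per attempt, as Bikas suggests. A minimal standalone sketch of that idea (class and method names are hypothetical, not the actual YARN-744 patch):

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class AllocateLockDemo {
    // One stable lock object per attempt id. Unlike the lastResponse values,
    // these are never replaced, so concurrent callers for the same attempt
    // contend on the same monitor instead of each locking a freshly swapped-in
    // response object.
    private final ConcurrentMap<String, Object> attemptLocks = new ConcurrentHashMap<>();
    private final ConcurrentMap<String, String> lastResponse = new ConcurrentHashMap<>();

    /** Returns the previous response for this attempt, or null on the first call. */
    public String allocate(String attemptId, String response) {
        Object lock = attemptLocks.computeIfAbsent(attemptId, k -> new Object());
        synchronized (lock) {
            // Read-then-write is now atomic per attempt: no other allocate()
            // for this attempt can interleave, even though the map value keeps
            // being replaced on every call.
            return lastResponse.put(attemptId, response);
        }
    }

    public static void main(String[] args) {
        AllocateLockDemo demo = new AllocateLockDemo();
        System.out.println(demo.allocate("appattempt_1", "response-1")); // null
        System.out.println(demo.allocate("appattempt_1", "response-2")); // response-1
    }
}
```

The key property is that `attemptLocks` entries live for the attempt's lifetime, so the "one response per app attempt" invariant holds across response replacements.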
[jira] [Created] (YARN-1425) TestRMRestart is failing on trunk
Omkar Vinit Joshi created YARN-1425: --- Summary: TestRMRestart is failing on trunk Key: YARN-1425 URL: https://issues.apache.org/jira/browse/YARN-1425 Project: Hadoop YARN Issue Type: Bug Reporter: Omkar Vinit Joshi Assignee: Omkar Vinit Joshi TestRMRestart is failing on trunk. Fixing it. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (YARN-1425) TestRMRestart is failing on trunk
[ https://issues.apache.org/jira/browse/YARN-1425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Omkar Vinit Joshi updated YARN-1425: Attachment: error.log [issue was seen|https://builds.apache.org/job/PreCommit-YARN-Build/2486//testReport/org.apache.hadoop.yarn.server.resourcemanager/TestRMRestart/testRMRestartWaitForPreviousAMToFinish/] TestRMRestart is failing on trunk - Key: YARN-1425 URL: https://issues.apache.org/jira/browse/YARN-1425 Project: Hadoop YARN Issue Type: Bug Reporter: Omkar Vinit Joshi Assignee: Omkar Vinit Joshi Attachments: error.log TestRMRestart is failing on trunk. Fixing it. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (YARN-744) Race condition in ApplicationMasterService.allocate .. It might process same allocate request twice resulting in additional containers getting allocated.
[ https://issues.apache.org/jira/browse/YARN-744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13827011#comment-13827011 ] Omkar Vinit Joshi commented on YARN-744: Test failure is not related to this. Opened ticket YARN-1425 to track this. Race condition in ApplicationMasterService.allocate .. It might process same allocate request twice resulting in additional containers getting allocated. - Key: YARN-744 URL: https://issues.apache.org/jira/browse/YARN-744 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Reporter: Bikas Saha Assignee: Omkar Vinit Joshi Priority: Minor Attachments: MAPREDUCE-3899-branch-0.23.patch, YARN-744-20130711.1.patch, YARN-744-20130715.1.patch, YARN-744-20130726.1.patch, YARN-744.1.patch, YARN-744.2.patch, YARN-744.patch Looks like the lock taken in this is broken. It takes a lock on lastResponse object and then puts a new lastResponse object into the map. At this point a new thread entering this function will get a new lastResponse object and will be able to take its lock and enter the critical section. Presumably we want to limit one response per app attempt. So the lock could be taken on the ApplicationAttemptId key of the response map object. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (YARN-1332) In TestAMRMClient, replace assertTrue with assertEquals where possible
[ https://issues.apache.org/jira/browse/YARN-1332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13827032#comment-13827032 ] Sandy Ryza commented on YARN-1332: -- Thanks Sebastian. The last thing is that, for assertEquals where we're comparing a variable with a constant, we should be consistent with the order of arguments: {code} -assertTrue(amClient.ask.size() == 0); -assertTrue(amClient.release.size() == 0); +assertEquals(0, amClient.ask.size()); +assertEquals(amClient.release.size(), 0); {code} The expected value, i.e. the constant, should be the first argument. In TestAMRMClient, replace assertTrue with assertEquals where possible -- Key: YARN-1332 URL: https://issues.apache.org/jira/browse/YARN-1332 Project: Hadoop YARN Issue Type: Improvement Affects Versions: 2.2.0 Reporter: Sandy Ryza Assignee: Sebastian Wong Priority: Minor Labels: newbie Attachments: YARN-1332-2.patch, YARN-1332.patch TestAMRMClient uses a lot of assertTrue(amClient.ask.size() == 0) where assertEquals(0, amClient.ask.size()) would make it easier to see why it's failing at a glance. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (YARN-954) [YARN-321] History Service should create the webUI and wire it to HistoryStorage
[ https://issues.apache.org/jira/browse/YARN-954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhijie Shen updated YARN-954: - Attachment: YARN-954.5.patch Launched the web server locally. The new patch fixes some issues: 1. making it able to run as a whole; 2. fixing the display bugs in the web pages; 3. fixing one bug in ApplicationHistoryManagerImpl; 4. enhancing NavBlock a bit. [YARN-321] History Service should create the webUI and wire it to HistoryStorage Key: YARN-954 URL: https://issues.apache.org/jira/browse/YARN-954 Project: Hadoop YARN Issue Type: Sub-task Reporter: Vinod Kumar Vavilapalli Assignee: Zhijie Shen Attachments: YARN-954-3.patch, YARN-954-v0.patch, YARN-954-v1.patch, YARN-954-v2.patch, YARN-954.4.patch, YARN-954.5.patch -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (YARN-691) Invalid NaN values in Hadoop REST API JSON response
[ https://issues.apache.org/jira/browse/YARN-691?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chen He updated YARN-691: - Attachment: YARN-691.patch Invalid NaN values in Hadoop REST API JSON response --- Key: YARN-691 URL: https://issues.apache.org/jira/browse/YARN-691 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 0.23.6, 2.0.4-alpha Reporter: Kendall Thrapp Assignee: Chen He Attachments: YARN-691.patch I've been occasionally coming across instances where Hadoop's Cluster Applications REST API (http://hadoop.apache.org/docs/r0.23.6/hadoop-yarn/hadoop-yarn-site/ResourceManagerRest.html#Cluster_Applications_API) has returned JSON that PHP's json_decode function failed to parse. I've tracked the syntax error down to the presence of the unquoted word NaN appearing as a value in the JSON. For example: progress:NaN, NaN is not part of the JSON spec, so its presence renders the whole JSON string invalid. Hadoop needs to return something other than NaN in this case -- perhaps an empty string or the quoted string NaN. -- This message was sent by Atlassian JIRA (v6.1#6144)
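One way to avoid emitting NaN, as the reporter suggests, is to normalize the value before serialization. A minimal sketch, assuming the progress field is a float and that substituting 0.0 for NaN is acceptable (class and method names are illustrative, not the actual YARN-691 patch):

```java
import java.util.Locale;

public class ProgressJson {
    /** Replace a NaN progress value with something JSON-representable. */
    static float safeProgress(float progress) {
        // NaN typically arises from a 0/0 division, e.g. 0 completed of 0 total tasks.
        return Float.isNaN(progress) ? 0.0f : progress;
    }

    static String toJson(float progress) {
        // Emitting "progress":NaN would make the whole JSON document unparseable
        // (NaN is not in the JSON number grammar), so guard before formatting.
        return String.format(Locale.ROOT, "{\"progress\":%.1f}", safeProgress(progress));
    }

    public static void main(String[] args) {
        float raw = 0.0f / 0.0f; // NaN
        System.out.println(toJson(raw)); // {"progress":0.0}
    }
}
```

Normalizing at the serialization boundary keeps every consumer safe, rather than relying on each REST client to tolerate the out-of-spec token.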
[jira] [Commented] (YARN-1239) Save version information in the state store
[ https://issues.apache.org/jira/browse/YARN-1239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13827045#comment-13827045 ] Jian He commented on YARN-1239: --- Quote from protocol buffer guide. {code} If you want your new buffers to be backwards-compatible, and your old buffers to be forward-compatible – and you almost certainly do want this – then there are some rules you need to follow. In the new version of the protocol buffer: you must not change the tag numbers of any existing fields. you must not add or delete any required fields. you may delete optional or repeated fields. you may add new optional or repeated fields but you must use fresh tag numbers (i.e. tag numbers that were never used in this protocol buffer, not even by deleted fields). {code} Save version information in the state store --- Key: YARN-1239 URL: https://issues.apache.org/jira/browse/YARN-1239 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Reporter: Bikas Saha Assignee: Jian He Attachments: YARN-1239.1.patch, YARN-1239.2.patch, YARN-1239.3.patch, YARN-1239.patch When creating root dir for the first time we should write version 1. If root dir exists then we should check that the version in the state store matches the version from config. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (YARN-1242) Script changes to start AHS as an individual process
[ https://issues.apache.org/jira/browse/YARN-1242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13827047#comment-13827047 ] Zhijie Shen commented on YARN-1242: --- +1. The patch looks good to me. The comments are added in yarn-env.sh, but not in yarn-env.cmd. However, the RM and NM comments aren't in yarn-env.cmd either, so I guess it should be fine. bq. 4. Would you please verify starting AHS locally, in particular starting AHS on a machine where the RM is not present (including starting it as a daemon)? Then, you can verify whether AHS is completely independent of RM. bq. Done Thanks for verifying the script. So starting/stopping the AHS (as daemon or not daemon) works properly, right? Script changes to start AHS as an individual process Key: YARN-1242 URL: https://issues.apache.org/jira/browse/YARN-1242 Project: Hadoop YARN Issue Type: Sub-task Reporter: Zhijie Shen Assignee: Mayank Bansal Attachments: YARN-1242-1.patch, YARN-1242-2.patch, YARN-1242-3.patch Add the command in yarn and yarn.cmd to start and stop AHS -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (YARN-1427) yarn-env.cmd should have the analog comments that are in yarn-env.sh
Zhijie Shen created YARN-1427: - Summary: yarn-env.cmd should have the analog comments that are in yarn-env.sh Key: YARN-1427 URL: https://issues.apache.org/jira/browse/YARN-1427 Project: Hadoop YARN Issue Type: Bug Reporter: Zhijie Shen There are paragraphs about the RM/NM env vars (probably AHS as well soon) in yarn-env.sh. Should the Windows version of the script provide similar comments? -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (YARN-1426) YARN Components need to unregister their beans upon shutdown
Jonathan Eagles created YARN-1426: - Summary: YARN Components need to unregister their beans upon shutdown Key: YARN-1426 URL: https://issues.apache.org/jira/browse/YARN-1426 Project: Hadoop YARN Issue Type: Improvement Affects Versions: 3.0.0, 2.3.0 Reporter: Jonathan Eagles -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (YARN-1425) TestRMRestart is failing on trunk
[ https://issues.apache.org/jira/browse/YARN-1425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13827075#comment-13827075 ] Omkar Vinit Joshi commented on YARN-1425: - Just discovered that MockRM.waitForState(appAttempt, RMAppAttemptState)... simply ignores the passed-in application attempt and always considers the current application attempt. Fixing it. *RMAppAttempt attempt = app.getCurrentAppAttempt();* {code}
public void waitForState(ApplicationAttemptId attemptId, RMAppAttemptState finalState) throws Exception {
  RMApp app = getRMContext().getRMApps().get(attemptId.getApplicationId());
  Assert.assertNotNull("app shouldn't be null", app);
  RMAppAttempt attempt = app.getCurrentAppAttempt();
  int timeoutSecs = 0;
  while (!finalState.equals(attempt.getAppAttemptState()) && timeoutSecs++ < 40) {
    System.out.println("AppAttempt : " + attemptId + " State is : " + attempt.getAppAttemptState() + " Waiting for state : " + finalState);
    Thread.sleep(1000);
  }
  System.out.println("Attempt State is : " + attempt.getAppAttemptState());
  Assert.assertEquals("Attempt state is not correct (timedout)", finalState, attempt.getAppAttemptState());
}
{code} TestRMRestart is failing on trunk - Key: YARN-1425 URL: https://issues.apache.org/jira/browse/YARN-1425 Project: Hadoop YARN Issue Type: Bug Reporter: Omkar Vinit Joshi Assignee: Omkar Vinit Joshi Attachments: error.log TestRMRestart is failing on trunk. Fixing it. -- This message was sent by Atlassian JIRA (v6.1#6144)
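The bug shape described in the comment — a lookup that ignores the attempt id it was given and always consults the current attempt — reduces to a small standalone model. The `App` class and methods below are illustrative stand-ins, not YARN's actual MockRM API:

```java
import java.util.HashMap;
import java.util.Map;

public class WaitForStateDemo {
    /** Simplified stand-in for an RMApp: several attempts plus a "current" one. */
    static final class App {
        final Map<String, String> attemptStates = new HashMap<>();
        String currentAttemptId;
    }

    // Shape of the bug: the requested attemptId is ignored and the current
    // attempt's state is always consulted instead.
    static String stateBuggy(App app, String attemptId) {
        return app.attemptStates.get(app.currentAttemptId);
    }

    // Shape of the fix: look up the attempt the caller actually asked about.
    static String stateFixed(App app, String attemptId) {
        return app.attemptStates.get(attemptId);
    }

    public static void main(String[] args) {
        App app = new App();
        app.attemptStates.put("attempt_1", "FAILED");
        app.attemptStates.put("attempt_2", "RUNNING");
        app.currentAttemptId = "attempt_2";
        System.out.println(stateBuggy(app, "attempt_1")); // RUNNING (wrong attempt)
        System.out.println(stateFixed(app, "attempt_1")); // FAILED
    }
}
```

The buggy form only misbehaves once an app has more than one attempt, which is exactly the RM-restart scenario the failing test exercises.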
[jira] [Commented] (YARN-954) [YARN-321] History Service should create the webUI and wire it to HistoryStorage
[ https://issues.apache.org/jira/browse/YARN-954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13827085#comment-13827085 ] Hadoop QA commented on YARN-954: {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12614717/YARN-954.5.patch against trunk revision . {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2487//console This message is automatically generated. [YARN-321] History Service should create the webUI and wire it to HistoryStorage Key: YARN-954 URL: https://issues.apache.org/jira/browse/YARN-954 Project: Hadoop YARN Issue Type: Sub-task Reporter: Vinod Kumar Vavilapalli Assignee: Zhijie Shen Attachments: YARN-954-3.patch, YARN-954-v0.patch, YARN-954-v1.patch, YARN-954-v2.patch, YARN-954.4.patch, YARN-954.5.patch -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (YARN-786) Expose application resource usage in RM REST API
[ https://issues.apache.org/jira/browse/YARN-786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13827087#comment-13827087 ] Hudson commented on YARN-786: - SUCCESS: Integrated in Hadoop-trunk-Commit #4762 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/4762/]) YARN-786: Addendum so that RMAppAttemptImpl#getApplicationResourceUsageReport won't return null (sandy: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1543597) * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/RMAppAttemptImpl.java Expose application resource usage in RM REST API Key: YARN-786 URL: https://issues.apache.org/jira/browse/YARN-786 Project: Hadoop YARN Issue Type: Improvement Affects Versions: 2.0.4-alpha Reporter: Sandy Ryza Assignee: Sandy Ryza Fix For: 2.3.0 Attachments: YARN-786-1.patch, YARN-786-2.patch, YARN-786-addendum.patch, YARN-786.patch It might be good to require users to explicitly ask for this information, as it's a little more expensive to collect than the other fields in AppInfo. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (YARN-1425) TestRMRestart is failing on trunk
[ https://issues.apache.org/jira/browse/YARN-1425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Omkar Vinit Joshi updated YARN-1425: Attachment: YARN-1425.1.patch TestRMRestart is failing on trunk - Key: YARN-1425 URL: https://issues.apache.org/jira/browse/YARN-1425 Project: Hadoop YARN Issue Type: Bug Reporter: Omkar Vinit Joshi Assignee: Omkar Vinit Joshi Attachments: YARN-1425.1.patch, error.log TestRMRestart is failing on trunk. Fixing it. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (YARN-691) Invalid NaN values in Hadoop REST API JSON response
[ https://issues.apache.org/jira/browse/YARN-691?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13827124#comment-13827124 ] Hadoop QA commented on YARN-691: {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12614716/YARN-691.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:red}-1 release audit{color}. The applied patch generated 1 release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-common-project/hadoop-common. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/2488//testReport/ Release audit warnings: https://builds.apache.org/job/PreCommit-YARN-Build/2488//artifact/trunk/patchprocess/patchReleaseAuditProblems.txt Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2488//console This message is automatically generated. 
Invalid NaN values in Hadoop REST API JSON response --- Key: YARN-691 URL: https://issues.apache.org/jira/browse/YARN-691 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 0.23.6, 2.0.4-alpha Reporter: Kendall Thrapp Assignee: Chen He Fix For: 2.3.0 Attachments: YARN-691.patch I've been occasionally coming across instances where Hadoop's Cluster Applications REST API (http://hadoop.apache.org/docs/r0.23.6/hadoop-yarn/hadoop-yarn-site/ResourceManagerRest.html#Cluster_Applications_API) has returned JSON that PHP's json_decode function failed to parse. I've tracked the syntax error down to the presence of the unquoted word NaN appearing as a value in the JSON. For example: progress:NaN, NaN is not part of the JSON spec, so its presence renders the whole JSON string invalid. Hadoop needs to return something other than NaN in this case -- perhaps an empty string or the quoted string NaN. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (YARN-1407) RM Web UI and REST APIs should uniformly use YarnApplicationState
[ https://issues.apache.org/jira/browse/YARN-1407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13827164#comment-13827164 ] Alejandro Abdelnur commented on YARN-1407: -- LGTM, +1 RM Web UI and REST APIs should uniformly use YarnApplicationState - Key: YARN-1407 URL: https://issues.apache.org/jira/browse/YARN-1407 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.2.0 Reporter: Sandy Ryza Assignee: Sandy Ryza Attachments: YARN-1407-1.patch, YARN-1407-2.patch, YARN-1407.patch RMAppState isn't a public facing enum like YarnApplicationState, so we shouldn't return values or list filters that come from it. However, some Blocks and AppInfo are still using RMAppState. It is not 100% clear to me whether or not fixing this would be a backwards-incompatible change. The change would only reduce the set of possible strings that the API returns, so I think not. We have also been changing the contents of RMAppState since 2.2.0, e.g. in YARN-891. It would still be good to fix this ASAP (i.e. for 2.2.1). -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (YARN-1303) Allow multiple commands separating with ; in distributed-shell
[ https://issues.apache.org/jira/browse/YARN-1303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuan Gong updated YARN-1303: Attachment: YARN-1303.8.patch Allow multiple commands separating with ; in distributed-shell Key: YARN-1303 URL: https://issues.apache.org/jira/browse/YARN-1303 Project: Hadoop YARN Issue Type: Improvement Components: applications/distributed-shell Reporter: Tassapol Athiapinya Assignee: Xuan Gong Fix For: 2.2.1 Attachments: YARN-1303.1.patch, YARN-1303.2.patch, YARN-1303.3.patch, YARN-1303.3.patch, YARN-1303.4.patch, YARN-1303.4.patch, YARN-1303.5.patch, YARN-1303.6.patch, YARN-1303.7.patch, YARN-1303.8.patch In shell, we can do ls; ls to run 2 commands at once. In distributed shell, this is not working. We should improve to allow this to occur. There are practical use cases that I know of to run multiple commands or to set environment variables before a command. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (YARN-1303) Allow multiple commands separating with ; in distributed-shell
[ https://issues.apache.org/jira/browse/YARN-1303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13827184#comment-13827184 ] Xuan Gong commented on YARN-1303: - We create a file that saves all of the client's input commands (from --shell_command). The AM will read all the commands and add them into the CLC. The idea is to let all containers run exactly the same commands that the client gives, and let clients figure out when and where to do the correct escaping stuff. Allow multiple commands separating with ; in distributed-shell Key: YARN-1303 URL: https://issues.apache.org/jira/browse/YARN-1303 Project: Hadoop YARN Issue Type: Improvement Components: applications/distributed-shell Reporter: Tassapol Athiapinya Assignee: Xuan Gong Fix For: 2.2.1 Attachments: YARN-1303.1.patch, YARN-1303.2.patch, YARN-1303.3.patch, YARN-1303.3.patch, YARN-1303.4.patch, YARN-1303.4.patch, YARN-1303.5.patch, YARN-1303.6.patch, YARN-1303.7.patch, YARN-1303.8.patch In shell, we can do ls; ls to run 2 commands at once. In distributed shell, this is not working. We should improve to allow this to occur. There are practical use cases that I know of to run multiple commands or to set environment variables before a command. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (YARN-1407) RM Web UI and REST APIs should uniformly use YarnApplicationState
[ https://issues.apache.org/jira/browse/YARN-1407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13827190#comment-13827190 ] Hudson commented on YARN-1407: -- SUCCESS: Integrated in Hadoop-trunk-Commit #4764 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/4764/]) YARN-1407. RM Web UI and REST APIs should uniformly use YarnApplicationState (Sandy Ryza) (sandy: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1543675) * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/AppsBlock.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/FairSchedulerAppsBlock.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/NavBlock.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/RmController.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/AppInfo.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/applicationsmanager/MockAsm.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/TestRMWebApp.java * 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/ResourceManagerRest.apt.vm RM Web UI and REST APIs should uniformly use YarnApplicationState - Key: YARN-1407 URL: https://issues.apache.org/jira/browse/YARN-1407 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.2.0 Reporter: Sandy Ryza Assignee: Sandy Ryza Fix For: 2.2.1 Attachments: YARN-1407-1.patch, YARN-1407-2.patch, YARN-1407.patch RMAppState isn't a public facing enum like YarnApplicationState, so we shouldn't return values or list filters that come from it. However, some Blocks and AppInfo are still using RMAppState. It is not 100% clear to me whether or not fixing this would be a backwards-incompatible change. The change would only reduce the set of possible strings that the API returns, so I think not. We have also been changing the contents of RMAppState since 2.2.0, e.g. in YARN-891. It would still be good to fix this ASAP (i.e. for 2.2.1). -- This message was sent by Atlassian JIRA (v6.1#6144)
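The distinction YARN-1407 draws, an internal state enum that must be collapsed onto a smaller public-facing enum before anything is returned through the web UI or REST layer, can be illustrated with a small self-contained sketch. The enums and the toPublicState mapping below are hypothetical simplifications for illustration, not the actual Hadoop classes or their real state sets:

```java
public class StateMappingSketch {
    // Hypothetical simplified versions of the two enums: the internal
    // states are a superset of what the public API is meant to expose.
    enum RMAppStateLike { NEW, RUNNING, FINAL_SAVING, FINISHED }
    enum YarnApplicationStateLike { NEW, RUNNING, FINISHED }

    // Collapse internal-only states onto a public state before returning
    // anything to clients, so internal refactorings (like the YARN-891
    // changes mentioned above) never leak new strings through the API.
    static YarnApplicationStateLike toPublicState(RMAppStateLike s) {
        switch (s) {
            case NEW:          return YarnApplicationStateLike.NEW;
            case FINAL_SAVING: // internal-only; still looks RUNNING outside
            case RUNNING:      return YarnApplicationStateLike.RUNNING;
            case FINISHED:     return YarnApplicationStateLike.FINISHED;
            default:           throw new IllegalArgumentException(s.name());
        }
    }

    public static void main(String[] args) {
        System.out.println(toPublicState(RMAppStateLike.FINAL_SAVING));
    }
}
```

Because the mapping only ever shrinks the set of strings a client can observe, tightening it is arguably backward-compatible, which is the argument the issue description makes.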
[jira] [Commented] (YARN-1425) TestRMRestart is failing on trunk
[ https://issues.apache.org/jira/browse/YARN-1425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13827193#comment-13827193 ] Hadoop QA commented on YARN-1425: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12614736/YARN-1425.1.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/2489//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2489//console This message is automatically generated. TestRMRestart is failing on trunk - Key: YARN-1425 URL: https://issues.apache.org/jira/browse/YARN-1425 Project: Hadoop YARN Issue Type: Bug Reporter: Omkar Vinit Joshi Assignee: Omkar Vinit Joshi Attachments: YARN-1425.1.patch, error.log TestRMRestart is failing on trunk. Fixing it. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (YARN-1303) Allow multiple commands separating with ; in distributed-shell
[ https://issues.apache.org/jira/browse/YARN-1303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13827200#comment-13827200 ] Hadoop QA commented on YARN-1303: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12614750/YARN-1303.8.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell: org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/2490//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2490//console This message is automatically generated. 
Allow multiple commands separating with ; in distributed-shell Key: YARN-1303 URL: https://issues.apache.org/jira/browse/YARN-1303 Project: Hadoop YARN Issue Type: Improvement Components: applications/distributed-shell Reporter: Tassapol Athiapinya Assignee: Xuan Gong Fix For: 2.2.1 Attachments: YARN-1303.1.patch, YARN-1303.2.patch, YARN-1303.3.patch, YARN-1303.3.patch, YARN-1303.4.patch, YARN-1303.4.patch, YARN-1303.5.patch, YARN-1303.6.patch, YARN-1303.7.patch, YARN-1303.8.patch In shell, we can do ls; ls to run 2 commands at once. In distributed shell, this is not working. We should improve to allow this to occur. There are practical use cases that I know of to run multiple commands or to set environment variables before a command. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (YARN-1407) RM Web UI and REST APIs should uniformly use YarnApplicationState
[ https://issues.apache.org/jira/browse/YARN-1407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13827202#comment-13827202 ] Hudson commented on YARN-1407: -- SUCCESS: Integrated in Hadoop-trunk-Commit #4765 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/4765/]) Move YARN-1407 under 2.2.1 in CHANGES.txt (sandy: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1543681) * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt RM Web UI and REST APIs should uniformly use YarnApplicationState - Key: YARN-1407 URL: https://issues.apache.org/jira/browse/YARN-1407 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.2.0 Reporter: Sandy Ryza Assignee: Sandy Ryza Fix For: 2.2.1 Attachments: YARN-1407-1.patch, YARN-1407-2.patch, YARN-1407.patch RMAppState isn't a public facing enum like YarnApplicationState, so we shouldn't return values or list filters that come from it. However, some Blocks and AppInfo are still using RMAppState. It is not 100% clear to me whether or not fixing this would be a backwards-incompatible change. The change would only reduce the set of possible strings that the API returns, so I think not. We have also been changing the contents of RMAppState since 2.2.0, e.g. in YARN-891. It would still be good to fix this ASAP (i.e. for 2.2.1). -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (YARN-1303) Allow multiple commands separating with ; in distributed-shell
[ https://issues.apache.org/jira/browse/YARN-1303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuan Gong updated YARN-1303: Attachment: YARN-1303.8.1.patch Allow multiple commands separating with ; in distributed-shell Key: YARN-1303 URL: https://issues.apache.org/jira/browse/YARN-1303 Project: Hadoop YARN Issue Type: Improvement Components: applications/distributed-shell Reporter: Tassapol Athiapinya Assignee: Xuan Gong Fix For: 2.2.1 Attachments: YARN-1303.1.patch, YARN-1303.2.patch, YARN-1303.3.patch, YARN-1303.3.patch, YARN-1303.4.patch, YARN-1303.4.patch, YARN-1303.5.patch, YARN-1303.6.patch, YARN-1303.7.patch, YARN-1303.8.1.patch, YARN-1303.8.patch In shell, we can do ls; ls to run 2 commands at once. In distributed shell, this is not working. We should improve to allow this to occur. There are practical use cases that I know of to run multiple commands or to set environment variables before a command. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (YARN-1303) Allow multiple commands separating with ; in distributed-shell
[ https://issues.apache.org/jira/browse/YARN-1303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13827214#comment-13827214 ] Hadoop QA commented on YARN-1303: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12614756/YARN-1303.8.1.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/2492//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2492//console This message is automatically generated. 
Allow multiple commands separating with ; in distributed-shell Key: YARN-1303 URL: https://issues.apache.org/jira/browse/YARN-1303 Project: Hadoop YARN Issue Type: Improvement Components: applications/distributed-shell Reporter: Tassapol Athiapinya Assignee: Xuan Gong Fix For: 2.2.1 Attachments: YARN-1303.1.patch, YARN-1303.2.patch, YARN-1303.3.patch, YARN-1303.3.patch, YARN-1303.4.patch, YARN-1303.4.patch, YARN-1303.5.patch, YARN-1303.6.patch, YARN-1303.7.patch, YARN-1303.8.1.patch, YARN-1303.8.patch In shell, we can do ls; ls to run 2 commands at once. In distributed shell, this is not working. We should improve to allow this to occur. There are practical use cases that I know of to run multiple commands or to set environment variables before a command. -- This message was sent by Atlassian JIRA (v6.1#6144)
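Why a bare `;` needs a shell to interpret it can be sketched in plain Java. This is only an illustration of the underlying mechanism, not distributed shell's actual launch code; the class and method names below are made up:

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;

public class MultiCommandSketch {
    // Run a ';'-separated command string through a shell and capture stdout.
    // Handing the whole string to "sh -c" lets the shell interpret the ';'
    // separator; passed as a plain argv element, ';' would just be a
    // literal argument to the first command.
    static String runThroughShell(String commandString) throws Exception {
        Process p = new ProcessBuilder("/bin/sh", "-c", commandString)
                .redirectErrorStream(true)
                .start();
        StringBuilder out = new StringBuilder();
        try (BufferedReader r = new BufferedReader(
                new InputStreamReader(p.getInputStream()))) {
            String line;
            while ((line = r.readLine()) != null) {
                out.append(line).append('\n');
            }
        }
        p.waitFor();
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        // Both commands run in sequence because the shell sees the ';'.
        System.out.print(runThroughShell("echo first; echo second"));
    }
}
```

The same consideration applies to the "set environment variables before a command" use case from the description, e.g. `FOO=bar; echo $FOO` only works when a shell parses the string.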
[jira] [Commented] (YARN-1407) RM Web UI and REST APIs should uniformly use YarnApplicationState
[ https://issues.apache.org/jira/browse/YARN-1407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13827219#comment-13827219 ] Vinod Kumar Vavilapalli commented on YARN-1407: --- Sandy, YARN-936 existed before. Is that a duplicate? RM Web UI and REST APIs should uniformly use YarnApplicationState - Key: YARN-1407 URL: https://issues.apache.org/jira/browse/YARN-1407 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.2.0 Reporter: Sandy Ryza Assignee: Sandy Ryza Fix For: 2.2.1 Attachments: YARN-1407-1.patch, YARN-1407-2.patch, YARN-1407.patch RMAppState isn't a public facing enum like YarnApplicationState, so we shouldn't return values or list filters that come from it. However, some Blocks and AppInfo are still using RMAppState. It is not 100% clear to me whether or not fixing this would be a backwards-incompatible change. The change would only reduce the set of possible strings that the API returns, so I think not. We have also been changing the contents of RMAppState since 2.2.0, e.g. in YARN-891. It would still be good to fix this ASAP (i.e. for 2.2.1). -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (YARN-1181) Augment MiniYARNCluster to support HA mode
[ https://issues.apache.org/jira/browse/YARN-1181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13827224#comment-13827224 ] Hadoop QA commented on YARN-1181: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12610205/yarn-1181-2.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests: org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart org.apache.hadoop.yarn.server.TestMiniYARNClusterForHA {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/2491//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2491//console This message is automatically generated. 
Augment MiniYARNCluster to support HA mode -- Key: YARN-1181 URL: https://issues.apache.org/jira/browse/YARN-1181 Project: Hadoop YARN Issue Type: Sub-task Reporter: Karthik Kambatla Assignee: Karthik Kambatla Attachments: yarn-1181-1.patch, yarn-1181-2.patch MiniYARNHACluster, along the lines of MiniYARNCluster, is needed for end-to-end HA tests. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (YARN-1428) RM cannot write the final state of RMApp/RMAppAttempt in the transition to the final state
Zhijie Shen created YARN-1428: - Summary: RM cannot write the final state of RMApp/RMAppAttempt in the transition to the final state Key: YARN-1428 URL: https://issues.apache.org/jira/browse/YARN-1428 Project: Hadoop YARN Issue Type: Sub-task Reporter: Zhijie Shen Assignee: Zhijie Shen ApplicationFinishData and ApplicationAttemptFinishData are written in the final transitions of RMApp/RMAppAttempt respectively. However, in the transitions, getState() is not getting the state that RMApp/RMAppAttempt is going to enter, but the prior one. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (YARN-1428) RM cannot write the final state of RMApp/RMAppAttempt in the transition to the final state
[ https://issues.apache.org/jira/browse/YARN-1428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13827243#comment-13827243 ] Hadoop QA commented on YARN-1428: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12614767/YARN-1428.1.patch against trunk revision . {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2493//console This message is automatically generated. RM cannot write the final state of RMApp/RMAppAttempt in the transition to the final state -- Key: YARN-1428 URL: https://issues.apache.org/jira/browse/YARN-1428 Project: Hadoop YARN Issue Type: Sub-task Reporter: Zhijie Shen Assignee: Zhijie Shen Attachments: YARN-1428.1.patch ApplicationFinishData and ApplicationAttemptFinishData are written in the final transitions of RMApp/RMAppAttempt respectively. However, in the transitions, getState() is not getting the state that RMApp/RMAppAttempt is going to enter, but the prior one. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (YARN-1423) Support queue placement by secondary group in the Fair Scheduler
[ https://issues.apache.org/jira/browse/YARN-1423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13827251#comment-13827251 ] Ted Malaska commented on YARN-1423: --- I've got the changes locally and they are building. I'll do some testing tomorrow and submit the patch. Thanks for the JIRA. Support queue placement by secondary group in the Fair Scheduler Key: YARN-1423 URL: https://issues.apache.org/jira/browse/YARN-1423 Project: Hadoop YARN Issue Type: Improvement Components: scheduler Reporter: Sandy Ryza -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (YARN-1363) Get / Cancel / Renew delegation token api should be non blocking
[ https://issues.apache.org/jira/browse/YARN-1363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Omkar Vinit Joshi updated YARN-1363: Attachment: YARN-1363.1.patch Work-in-progress patch: YARN-1363.1.patch Get / Cancel / Renew delegation token api should be non blocking Key: YARN-1363 URL: https://issues.apache.org/jira/browse/YARN-1363 Project: Hadoop YARN Issue Type: Bug Reporter: Omkar Vinit Joshi Assignee: Omkar Vinit Joshi Attachments: YARN-1363.1.patch Today GetDelegationToken, CancelDelegationToken and RenewDelegationToken are all blocking APIs. * As a part of these calls we try to update RMStateStore and that may slow it down. * Since we have a limited number of client request handlers, they may fill up quickly. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (YARN-1053) Diagnostic message from ContainerExitEvent is ignored in ContainerImpl
[ https://issues.apache.org/jira/browse/YARN-1053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13827285#comment-13827285 ] Bikas Saha commented on YARN-1053: -- Let's add a null check; otherwise the literal string "null" will be printed when the diagnostics message is null. Diagnostic message from ContainerExitEvent is ignored in ContainerImpl -- Key: YARN-1053 URL: https://issues.apache.org/jira/browse/YARN-1053 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.2.0, 2.2.1 Reporter: Omkar Vinit Joshi Assignee: Omkar Vinit Joshi Priority: Blocker Labels: newbie Fix For: 2.3.0, 2.2.1 Attachments: YARN-1053.20130809.patch If the container launch fails then we send ContainerExitEvent. This event contains exitCode and diagnostic message. Today we are ignoring diagnostic message while handling this event inside ContainerImpl. Fixing it as it is useful in diagnosing the failure. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (YARN-1053) Diagnostic message from ContainerExitEvent is ignored in ContainerImpl
[ https://issues.apache.org/jira/browse/YARN-1053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13827296#comment-13827296 ] Bikas Saha commented on YARN-1053: -- If possible, please modify an existing test or add a simple one. This will ensure that in the future no one can inadvertently reintroduce a regression that took a lot of time to debug. Diagnostic message from ContainerExitEvent is ignored in ContainerImpl -- Key: YARN-1053 URL: https://issues.apache.org/jira/browse/YARN-1053 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.2.0, 2.2.1 Reporter: Omkar Vinit Joshi Assignee: Omkar Vinit Joshi Priority: Blocker Labels: newbie Fix For: 2.3.0, 2.2.1 Attachments: YARN-1053.20130809.patch If the container launch fails then we send ContainerExitEvent. This event contains exitCode and diagnostic message. Today we are ignoring diagnostic message while handling this event inside ContainerImpl. Fixing it as it is useful in diagnosing the failure. -- This message was sent by Atlassian JIRA (v6.1#6144)
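The null check suggested above could look roughly like the helper below. This is a hypothetical sketch, not the actual ContainerImpl code; the class and method names are made up:

```java
public class DiagnosticsSketch {
    // Only append the diagnostic message when the event actually carries
    // one, so the literal string "null" is never shown to users.
    static String appendDiagnostics(String existing, String fromEvent) {
        if (fromEvent == null || fromEvent.isEmpty()) {
            return existing;
        }
        return existing.isEmpty() ? fromEvent : existing + "\n" + fromEvent;
    }

    public static void main(String[] args) {
        System.out.println(appendDiagnostics("", "Container launch failed: exit code 1"));
    }
}
```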
[jira] [Commented] (YARN-1428) RM cannot write the final state of RMApp/RMAppAttempt in the transition to the final state
[ https://issues.apache.org/jira/browse/YARN-1428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13827332#comment-13827332 ] Jian He commented on YARN-1428: --- [~zjshen], RMAppImpl.targetedFinalState is the state that's going to be saved in the state store; similarly for RMAppAttemptImpl. RM cannot write the final state of RMApp/RMAppAttempt in the transition to the final state -- Key: YARN-1428 URL: https://issues.apache.org/jira/browse/YARN-1428 Project: Hadoop YARN Issue Type: Sub-task Reporter: Zhijie Shen Assignee: Zhijie Shen Attachments: YARN-1428.1.patch ApplicationFinishData and ApplicationAttemptFinishData are written in the final transitions of RMApp/RMAppAttempt respectively. However, in the transitions, getState() is not getting the state that RMApp/RMAppAttempt is going to enter, but the prior one. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (YARN-1425) TestRMRestart is failing on trunk
[ https://issues.apache.org/jira/browse/YARN-1425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13827341#comment-13827341 ] Bikas Saha commented on YARN-1425: -- Can you please do a full run of all tests to make sure this does not cause other tests to break? Thanks! TestRMRestart is failing on trunk - Key: YARN-1425 URL: https://issues.apache.org/jira/browse/YARN-1425 Project: Hadoop YARN Issue Type: Bug Reporter: Omkar Vinit Joshi Assignee: Omkar Vinit Joshi Attachments: YARN-1425.1.patch, error.log TestRMRestart is failing on trunk. Fixing it. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (YARN-1428) RM cannot write the final state of RMApp/RMAppAttempt in the transition to the final state
[ https://issues.apache.org/jira/browse/YARN-1428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13827345#comment-13827345 ] Zhijie Shen commented on YARN-1428: --- The problem described here is unrelated to RM restart (in particular, YARN-891). Let me clarify the problem further: Previously, I invoked RMApplicationHistoryWriter#applicationFinished/applicationAttemptFinished in the transition to the final state of RMApp/RMAppAttempt. However, during this transition, #getState() does not return the final state, but the one before RMApp/RMAppAttempt enters the final state. Therefore, RMApp/RMAppAttempt is going to write a non-final state into the application history store, which is not expected. I plan to fix this problem by telling RMApplicationHistoryWriter what the final state is going to be, so that RMApplicationHistoryWriter writes this state instead. By the way, before touching the RM integration, we've already merged the latest changes (including YARN-891) into the YARN-321 branch. Therefore, YARN-953's patch and this one are already on top of the newest RMApp/RMAppAttempt transitions. RM cannot write the final state of RMApp/RMAppAttempt in the transition to the final state -- Key: YARN-1428 URL: https://issues.apache.org/jira/browse/YARN-1428 Project: Hadoop YARN Issue Type: Sub-task Reporter: Zhijie Shen Assignee: Zhijie Shen Attachments: YARN-1428.1.patch ApplicationFinishData and ApplicationAttemptFinishData are written in the final transitions of RMApp/RMAppAttempt respectively. However, in the transitions, getState() is not getting the state that RMApp/RMAppAttempt is going to enter, but the prior one. -- This message was sent by Atlassian JIRA (v6.1#6144)
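The getState()-during-transition pitfall described in this issue can be reproduced with a tiny sketch. App, finishTransition, and the enum are hypothetical stand-ins for RMAppImpl and its state machine, not the real classes:

```java
public class FinalStateSketch {
    enum AppState { RUNNING, FINISHED }

    // Minimal stand-in for an RMApp-like object whose state field is
    // updated only AFTER the transition body has run, so getState()
    // inside the transition still reports the prior state.
    static class App {
        private AppState state = AppState.RUNNING;

        AppState getState() { return state; }

        // A history-writer call placed inside the transition would record
        // getState() (the prior state), whereas recording an explicitly
        // passed target state captures the intended final state.
        AppState[] finishTransition(AppState target) {
            AppState recordedViaGetState = getState(); // still RUNNING here
            AppState recordedViaTarget = target;       // FINISHED
            state = target;                            // state flips afterwards
            return new AppState[] { recordedViaGetState, recordedViaTarget };
        }
    }

    public static void main(String[] args) {
        AppState[] recorded = new App().finishTransition(AppState.FINISHED);
        System.out.println(recorded[0] + " vs " + recorded[1]);
    }
}
```

This is why the proposed fix passes the target final state to the writer explicitly instead of calling getState() from inside the transition.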
[jira] [Commented] (YARN-1420) TestRMContainerAllocator#testUpdatedNodes fails
[ https://issues.apache.org/jira/browse/YARN-1420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13827378#comment-13827378 ] Jonathan Eagles commented on YARN-1420: --- I ran git bisect on my Mac using JDK 1.6 to detect when this test failure was introduced. YARN-1343 is the likely culprit. I haven't run this test on Linux with JDK 1.6, but I suspect there are in fact two issues. TestRMContainerAllocator#testUpdatedNodes fails --- Key: YARN-1420 URL: https://issues.apache.org/jira/browse/YARN-1420 Project: Hadoop YARN Issue Type: Test Reporter: Ted Yu From https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1607/console : {code} Running org.apache.hadoop.mapreduce.v2.app.TestRMContainerAllocator Tests run: 14, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 65.78 sec FAILURE! - in org.apache.hadoop.mapreduce.v2.app.TestRMContainerAllocator testUpdatedNodes(org.apache.hadoop.mapreduce.v2.app.TestRMContainerAllocator) Time elapsed: 3.125 sec FAILURE! junit.framework.AssertionFailedError: null at junit.framework.Assert.fail(Assert.java:48) at junit.framework.Assert.assertTrue(Assert.java:20) at junit.framework.Assert.assertTrue(Assert.java:27) at org.apache.hadoop.mapreduce.v2.app.TestRMContainerAllocator.testUpdatedNodes(TestRMContainerAllocator.java:779) {code} This assertion fails: {code} Assert.assertTrue(allocator.getJobUpdatedNodeEvents().isEmpty()); {code} The List returned by allocator.getJobUpdatedNodeEvents() is: [EventType: JOB_UPDATED_NODES] -- This message was sent by Atlassian JIRA (v6.1#6144)