[jira] [Commented] (YARN-209) Capacity scheduler can leave application in pending state
[ https://issues.apache.org/jira/browse/YARN-209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13614420#comment-13614420 ]

Hadoop QA commented on YARN-209:

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12575556/YARN-209.3.patch
against trunk revision .

+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 2 new or modified test files.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 javadoc. The javadoc tool did not generate any warning messages.
-1 eclipse:eclipse. The patch failed to build with eclipse:eclipse.
+1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
+1 core tests. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.
+1 contrib tests. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-YARN-Build/605//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/605//console

This message is automatically generated.

Capacity scheduler can leave application in pending state

Key: YARN-209
URL: https://issues.apache.org/jira/browse/YARN-209
Project: Hadoop YARN
Issue Type: Bug
Reporter: Bikas Saha
Assignee: Zhijie Shen
Fix For: 3.0.0
Attachments: YARN-209.1.patch, YARN-209.2.patch, YARN-209.3.patch, YARN-209-test.patch

Say application A is submitted but at that time it does not meet the bar for activation because of resource limit settings for applications. After that, if more hardware is added to the system and the application becomes valid, it still remains in pending state, likely forever. This might be rare to hit in real life because enough NMs heartbeat to the RM before applications can get submitted. But a change in settings or heartbeat interval might make it easier to repro. In RM restart scenarios, this will likely be hit more often if restart is implemented by re-playing events and re-submitting applications to the scheduler before the RPC to NMs is activated.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-209) Capacity scheduler can leave application in pending state
[ https://issues.apache.org/jira/browse/YARN-209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13613203#comment-13613203 ]

Zhijie Shen commented on YARN-209:

@Hitesh, this is because in this ticket, there's no fix code. The failure is expected here. When YARN-474 is fixed, the failure should be gone.

Assignee: Zhijie Shen
Attachments: YARN-209.1.patch, YARN-209.2.patch, YARN-209-test.patch
[jira] [Commented] (YARN-209) Capacity scheduler can leave application in pending state
[ https://issues.apache.org/jira/browse/YARN-209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13609303#comment-13609303 ]

Zhijie Shen commented on YARN-209:

Whenever more hardware is added to the cluster, LeafQueue#updateClusterResource will be triggered. The problem can be fixed by the patch for YARN-474 as well.

Assignee: Zhijie Shen
Attachments: YARN-209.1.patch, YARN-209-test.patch
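The comment above on LeafQueue#updateClusterResource can be illustrated with a toy model. This is only a sketch of the general idea (re-check pending applications whenever the cluster resource changes), not the actual Hadoop code or the YARN-474 patch. The class ToyLeafQueue, the fields maxAmResourceFraction and amMemoryMb, and the activation rule are all hypothetical; only the method name updateClusterResource is taken from the thread.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Toy model of a queue that gates application activation on cluster size. Not Hadoop's LeafQueue. */
public class ToyLeafQueue {
    // Hypothetical setting: fraction of cluster memory usable by application masters.
    private final double maxAmResourceFraction;
    // Hypothetical simplification: every application master needs the same amount of memory.
    private final int amMemoryMb;
    private int clusterMemoryMb;

    private final Deque<String> pending = new ArrayDeque<>();
    private final List<String> active = new ArrayList<>();

    public ToyLeafQueue(int clusterMemoryMb, double maxAmResourceFraction, int amMemoryMb) {
        this.clusterMemoryMb = clusterMemoryMb;
        this.maxAmResourceFraction = maxAmResourceFraction;
        this.amMemoryMb = amMemoryMb;
    }

    /** Queues the application and activates it immediately if the current limit allows. */
    public void submit(String appId) {
        pending.addLast(appId);
        activatePending();
    }

    /**
     * Called when nodes join or leave the cluster. Assumption: the symptom described
     * in this issue resembles what happens if the re-check below is missing, because
     * an application that was pending at submission time is then never looked at again.
     */
    public void updateClusterResource(int newClusterMemoryMb) {
        clusterMemoryMb = newClusterMemoryMb;
        activatePending();
    }

    /** Moves applications from pending to active while the limit has room. */
    private void activatePending() {
        int maxActive = (int) (clusterMemoryMb * maxAmResourceFraction) / amMemoryMb;
        while (!pending.isEmpty() && active.size() < maxActive) {
            active.add(pending.removeFirst());
        }
    }

    public boolean isActive(String appId) {
        return active.contains(appId);
    }

    public int pendingCount() {
        return pending.size();
    }

    public static void main(String[] args) {
        // No node has registered yet, so the limit is zero and "A" stays pending.
        ToyLeafQueue queue = new ToyLeafQueue(0, 0.5, 1024);
        queue.submit("A");
        System.out.println("active before nodes join: " + queue.isActive("A"));

        // Nodes register; the limit is re-evaluated and "A" can be activated.
        queue.updateClusterResource(8192);
        System.out.println("active after nodes join: " + queue.isActive("A"));
    }
}
```

In this model, removing the activatePending() call from updateClusterResource leaves "A" pending indefinitely, which matches the symptom reported in the issue description.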
[jira] [Commented] (YARN-209) Capacity scheduler can leave application in pending state
[ https://issues.apache.org/jira/browse/YARN-209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13609601#comment-13609601 ]

Hadoop QA commented on YARN-209:

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12574896/YARN-209.2.patch
against trunk revision .

+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 1 new or modified test files.
+1 tests included appear to have a timeout.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 javadoc. The javadoc tool did not generate any warning messages.
-1 eclipse:eclipse. The patch failed to build with eclipse:eclipse.
+1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed these unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:
    org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart
+1 contrib tests. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-YARN-Build/564//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/564//console

This message is automatically generated.

Assignee: Zhijie Shen
Attachments: YARN-209.1.patch, YARN-209.2.patch, YARN-209-test.patch
[jira] [Commented] (YARN-209) Capacity scheduler can leave application in pending state
[ https://issues.apache.org/jira/browse/YARN-209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13575473#comment-13575473 ]

Bikas Saha commented on YARN-209:

Right. That's what the test in the attached patch does. But it's a functional test.

Assignee: Bikas Saha
Attachments: YARN-209.1.patch, YARN-209-test.patch
[jira] [Commented] (YARN-209) Capacity scheduler can leave application in pending state
[ https://issues.apache.org/jira/browse/YARN-209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13573188#comment-13573188 ]

Vinod Kumar Vavilapalli commented on YARN-209:

Haven't looked at the code yet, trying to understand the scenario. So, in other words, if an application gets submitted to the RM before any NM has registered, the application will be stuck in pending state. Right? If so, we can write a test like that.

Assignee: Bikas Saha
Attachments: YARN-209.1.patch, YARN-209-test.patch
[jira] [Commented] (YARN-209) Capacity scheduler can leave application in pending state
[ https://issues.apache.org/jira/browse/YARN-209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13505651#comment-13505651 ]

Hadoop QA commented on YARN-209:

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12552973/YARN-209.1.patch
against trunk revision .

+1 @author. The patch does not contain any @author tags.
-1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 eclipse:eclipse. The patch built with eclipse:eclipse.
+1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
+1 core tests. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.
+1 contrib tests. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-YARN-Build/176//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/176//console

This message is automatically generated.

Assignee: Bikas Saha
Attachments: YARN-209.1.patch, YARN-209-test.patch
[jira] [Commented] (YARN-209) Capacity scheduler can leave application in pending state
[ https://issues.apache.org/jira/browse/YARN-209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13494622#comment-13494622 ]

Bikas Saha commented on YARN-209:

Attaching a fix for the issue. Also attaching a test that times out without this fix and passes with it. There might be a simpler way to test this.

Assignee: Bikas Saha
Attachments: YARN-209.1.patch, YARN-209-test.patch
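The comment above describes a test that times out without the fix and passes with it. One common way to build such a test is a bounded polling wait, sketched below. This is a generic illustration and not the code from YARN-209-test.patch; the class name PollingWait, the method waitFor, and its parameters are hypothetical.

```java
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

/** Generic bounded polling helper; a sketch, not taken from the attached patches. */
public class PollingWait {

    /**
     * Polls the condition until it holds or the timeout elapses.
     * Returns true if the condition became true in time, false otherwise
     * (including when the waiting thread is interrupted).
     */
    public static boolean waitFor(BooleanSupplier condition, long timeoutMs, long pollMs) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMs);
        while (!condition.getAsBoolean()) {
            if (System.nanoTime() - deadline >= 0) {
                return false; // timed out: the awaited state was never reached
            }
            try {
                Thread.sleep(pollMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        long start = System.nanoTime();

        // Stands in for "application leaves the pending state": becomes true after about 50 ms.
        boolean reached = waitFor(
                () -> System.nanoTime() - start > TimeUnit.MILLISECONDS.toNanos(50), 2000, 10);
        System.out.println("state reached: " + reached);

        // Stands in for the unfixed behaviour: the state is never reached, so the wait gives up.
        boolean stuck = waitFor(() -> false, 100, 10);
        System.out.println("stuck case reached state: " + stuck);
    }
}
```

A test written this way would assert that waitFor returns true, with the condition checking whether the application has left the pending state. Without a fix the call returns false once the timeout elapses, so the assertion fails instead of the test hanging.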