[jira] [Commented] (YARN-1913) With Fair Scheduler, cluster can logjam when all resources are consumed by AMs
[ https://issues.apache.org/jira/browse/YARN-1913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14016812#comment-14016812 ]

Hudson commented on YARN-1913:
------------------------------

FAILURE: Integrated in Hadoop-Mapreduce-trunk #1790 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1790/])
YARN-1913. With Fair Scheduler, cluster can logjam when all resources are consumed by AMs (Wei Yan via Sandy Ryza) (sandy: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1599400)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/SchedulerApplicationAttempt.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/AllocationConfiguration.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/AllocationFileLoaderService.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/AppSchedulable.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSLeafQueue.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/SchedulingPolicy.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/policies/DominantResourceFairnessPolicy.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/policies/FairSharePolicy.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/policies/FifoPolicy.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairSchedulerTestBase.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestAllocationFileLoaderService.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestFairScheduler.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/FairScheduler.apt.vm

> With Fair Scheduler, cluster can logjam when all resources are consumed by AMs
> ------------------------------------------------------------------------------
>
> Key: YARN-1913
> URL: https://issues.apache.org/jira/browse/YARN-1913
> Project: Hadoop YARN
> Issue Type: Bug
> Components: scheduler
> Affects Versions: 2.3.0
> Reporter: bc Wong
> Assignee: Wei Yan
> Fix For: 2.5.0
>
> Attachments: YARN-1913.patch, YARN-1913.patch, YARN-1913.patch,
> YARN-1913.patch, YARN-1913.patch, YARN-1913.patch, YARN-1913.patch,
> YARN-1913.patch, YARN-1913.patch, YARN-1913.patch, YARN-1913.patch
>
>
> It's possible to deadlock a cluster by submitting many applications at once,
> and have all cluster resources taken up by AMs.
> One solution is for the scheduler to limit resources taken up by AMs, as a
> percentage of total cluster resources, via a "maxApplicationMasterShare"
> config.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
[jira] [Commented] (YARN-1913) With Fair Scheduler, cluster can logjam when all resources are consumed by AMs
[ https://issues.apache.org/jira/browse/YARN-1913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14016625#comment-14016625 ]

Hudson commented on YARN-1913:
------------------------------

SUCCESS: Integrated in Hadoop-Hdfs-trunk #1763 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1763/])
YARN-1913. With Fair Scheduler, cluster can logjam when all resources are consumed by AMs (Wei Yan via Sandy Ryza) (sandy: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1599400)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/SchedulerApplicationAttempt.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/AllocationConfiguration.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/AllocationFileLoaderService.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/AppSchedulable.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSLeafQueue.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/SchedulingPolicy.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/policies/DominantResourceFairnessPolicy.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/policies/FairSharePolicy.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/policies/FifoPolicy.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairSchedulerTestBase.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestAllocationFileLoaderService.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestFairScheduler.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/FairScheduler.apt.vm
[jira] [Commented] (YARN-1913) With Fair Scheduler, cluster can logjam when all resources are consumed by AMs
[ https://issues.apache.org/jira/browse/YARN-1913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14016407#comment-14016407 ]

Hudson commented on YARN-1913:
------------------------------

FAILURE: Integrated in Hadoop-Yarn-trunk #572 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/572/])
YARN-1913. With Fair Scheduler, cluster can logjam when all resources are consumed by AMs (Wei Yan via Sandy Ryza) (sandy: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1599400)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/SchedulerApplicationAttempt.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/AllocationConfiguration.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/AllocationFileLoaderService.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/AppSchedulable.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSLeafQueue.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/SchedulingPolicy.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/policies/DominantResourceFairnessPolicy.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/policies/FairSharePolicy.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/policies/FifoPolicy.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairSchedulerTestBase.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestAllocationFileLoaderService.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestFairScheduler.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/FairScheduler.apt.vm
[jira] [Commented] (YARN-1913) With Fair Scheduler, cluster can logjam when all resources are consumed by AMs
[ https://issues.apache.org/jira/browse/YARN-1913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14016116#comment-14016116 ]

Hudson commented on YARN-1913:
------------------------------

SUCCESS: Integrated in Hadoop-trunk-Commit #5646 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/5646/])
YARN-1913. With Fair Scheduler, cluster can logjam when all resources are consumed by AMs (Wei Yan via Sandy Ryza) (sandy: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1599400)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/SchedulerApplicationAttempt.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/AllocationConfiguration.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/AllocationFileLoaderService.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/AppSchedulable.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSLeafQueue.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/SchedulingPolicy.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/policies/DominantResourceFairnessPolicy.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/policies/FairSharePolicy.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/policies/FifoPolicy.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairSchedulerTestBase.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestAllocationFileLoaderService.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestFairScheduler.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/FairScheduler.apt.vm
[jira] [Commented] (YARN-1913) With Fair Scheduler, cluster can logjam when all resources are consumed by AMs
[ https://issues.apache.org/jira/browse/YARN-1913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14015896#comment-14015896 ]

Hadoop QA commented on YARN-1913:
---------------------------------

{color:green}+1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12647976/YARN-1913.patch
against trunk revision .

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 3 new or modified test files.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 javadoc{color}. There were no new javadoc warning messages.
{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.
{color:green}+1 contrib tests{color}. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-YARN-Build/3892//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3892//console

This message is automatically generated.
[jira] [Commented] (YARN-1913) With Fair Scheduler, cluster can logjam when all resources are consumed by AMs
[ https://issues.apache.org/jira/browse/YARN-1913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14015879#comment-14015879 ]

Hadoop QA commented on YARN-1913:
---------------------------------

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12647969/YARN-1913.patch
against trunk revision .

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 3 new or modified test files.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 javadoc{color}. There were no new javadoc warning messages.
{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:
  org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestMaxRunningAppsEnforcer
  org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFSSchedulerApp
  org.apache.hadoop.yarn.server.resourcemanager.scheduler.TestSchedulerApplicationAttempt
{color:green}+1 contrib tests{color}. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-YARN-Build/3891//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3891//console

This message is automatically generated.
[jira] [Commented] (YARN-1913) With Fair Scheduler, cluster can logjam when all resources are consumed by AMs
[ https://issues.apache.org/jira/browse/YARN-1913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14015823#comment-14015823 ]

Hadoop QA commented on YARN-1913:
---------------------------------

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12647968/YARN-1913.patch
against trunk revision .

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 3 new or modified test files.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 javadoc{color}. There were no new javadoc warning messages.
{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:
  org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestMaxRunningAppsEnforcer
  org.apache.hadoop.yarn.server.resourcemanager.scheduler.TestSchedulerApplicationAttempt
  org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFSSchedulerApp
{color:green}+1 contrib tests{color}. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-YARN-Build/3890//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3890//console

This message is automatically generated.
[jira] [Commented] (YARN-1913) With Fair Scheduler, cluster can logjam when all resources are consumed by AMs
[ https://issues.apache.org/jira/browse/YARN-1913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14015800#comment-14015800 ]

Sandy Ryza commented on YARN-1913:
----------------------------------

The "M" in Unmanaged AM shouldn't be capitalized. Otherwise, the patch LGTM.
[jira] [Commented] (YARN-1913) With Fair Scheduler, cluster can logjam when all resources are consumed by AMs
[ https://issues.apache.org/jira/browse/YARN-1913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14015536#comment-14015536 ]

Wei Yan commented on YARN-1913:
-------------------------------

Thanks, Sandy.
There may be a problem with using SchedulerApplicationAttempt.getLiveContainers().isEmpty(): if the application uses an unmanaged AM, it will not generate an AM resource request. Thus, its first request would be for an actual task, not an AM. Correct me if I'm wrong here.
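
The concern above can be illustrated with a minimal sketch. Everything here is hypothetical and simplified: `AttemptSketch`, `nextContainerIsAM` and `containerAllocated` are illustrative names, not part of the patch or of YARN. The sketch assumes that an empty live-container set identifies the AM request only when the AM is managed by the ResourceManager.

```java
// Hypothetical model of an application attempt; not the real
// SchedulerApplicationAttempt class.
final class AttemptSketch {
    private final boolean unmanagedAM;   // true if the AM runs outside the cluster
    private int liveContainers = 0;      // containers currently held by the attempt

    AttemptSketch(boolean unmanagedAM) {
        this.unmanagedAM = unmanagedAM;
    }

    // The next container is the AM only if nothing is running yet AND the
    // AM is managed. With an unmanaged AM, the first container is a task.
    boolean nextContainerIsAM() {
        return !unmanagedAM && liveContainers == 0;
    }

    // Record that the scheduler handed a container to this attempt.
    void containerAllocated() {
        liveContainers++;
    }
}
```

Without the `unmanagedAM` guard, the first task container of an unmanaged application would be counted against the AM limit, which is the case raised in the comment.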
[jira] [Commented] (YARN-1913) With Fair Scheduler, cluster can logjam when all resources are consumed by AMs
[ https://issues.apache.org/jira/browse/YARN-1913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14015518#comment-14015518 ]

Sandy Ryza commented on YARN-1913:
----------------------------------

This is looking good. A few small things.

AppSchedulingInfo is only used to track pending resources. We should hold amResource in SchedulerApplicationAttempt.

{code}
+      if (! queue.canRunAppAM(app.getAMResource())) {
{code}
Take out the space after the exclamation point.

{code}
   @Override
+  public boolean checkIfAMResourceUsageOverLimit(Resource usage, Resource maxAMResource) {
+    return Resources.greaterThan(RESOURCE_CALCULATOR, null, usage, maxAMResource);
+  }
{code}
Simpler to just use "usage.getMemory() > maxAMResource.getMemory()".

{code}
+      if (request.getPriority().equals(RMAppAttemptImpl.AM_CONTAINER_PRIORITY)) {
{code}
I'm a little nervous about using the priority here because apps could unwittingly submit all requests at that priority. Can we use SchedulerApplicationAttempt.getLiveContainers().isEmpty()?
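
The memory-only comparison suggested in this review can be sketched as follows. `ResourceSketch` is a hypothetical stand-in for YARN's `Resource` with only the two fields needed here, and `AmLimitSketch` is an illustrative holder class; neither is taken from the patch.

```java
// Hypothetical stand-in for a YARN resource vector (memory in MB, vcores).
final class ResourceSketch {
    private final int memoryMb;
    private final int vcores;

    ResourceSketch(int memoryMb, int vcores) {
        this.memoryMb = memoryMb;
        this.vcores = vcores;
    }

    int getMemory() { return memoryMb; }
    int getVirtualCores() { return vcores; }
}

final class AmLimitSketch {
    // Memory-only check: AM usage is over the limit when it holds strictly
    // more memory than the maximum allowed for AMs. Vcores are ignored.
    static boolean checkIfAMResourceUsageOverLimit(ResourceSketch usage,
                                                   ResourceSketch maxAMResource) {
        return usage.getMemory() > maxAMResource.getMemory();
    }
}
```

The trade-off in this sketch is that a policy comparing memory alone never reports the limit as exceeded on CPU, whatever the vcore usage.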
[jira] [Commented] (YARN-1913) With Fair Scheduler, cluster can logjam when all resources are consumed by AMs
[ https://issues.apache.org/jira/browse/YARN-1913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14015442#comment-14015442 ]

Hadoop QA commented on YARN-1913:
---------------------------------

{color:green}+1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12647905/YARN-1913.patch
against trunk revision .

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 javadoc{color}. There were no new javadoc warning messages.
{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.
{color:green}+1 contrib tests{color}. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-YARN-Build/3887//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3887//console

This message is automatically generated.
[jira] [Commented] (YARN-1913) With Fair Scheduler, cluster can logjam when all resources are consumed by AMs
[ https://issues.apache.org/jira/browse/YARN-1913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14015300#comment-14015300 ]

Hadoop QA commented on YARN-1913:
---------------------------------

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12647873/YARN-1913.patch
against trunk revision .

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 javadoc{color}. There were no new javadoc warning messages.
{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.
{color:red}-1 findbugs{color}. The patch appears to introduce 1 new Findbugs (version 1.3.9) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.
{color:green}+1 contrib tests{color}. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-YARN-Build/3884//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-YARN-Build/3884//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3884//console

This message is automatically generated.
[jira] [Commented] (YARN-1913) With Fair Scheduler, cluster can logjam when all resources are consumed by AMs
[ https://issues.apache.org/jira/browse/YARN-1913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14014716#comment-14014716 ]

Sandy Ryza commented on YARN-1913:

The primary benefit of the logic in MaxRunningAppsEnforcer is that it lets us enforce maxRunningApps constraints from queues higher up in the hierarchy and integrate them with user maxRunningApps constraints. Since we won't have these issues for queue maxAMShares, I think we can avoid touching MaxRunningAppsEnforcer entirely and just do the check inside AppSchedulable.assignContainer or FSLeafQueue.assignContainer.
[jira] [Commented] (YARN-1913) With Fair Scheduler, cluster can logjam when all resources are consumed by AMs
[ https://issues.apache.org/jira/browse/YARN-1913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14014447#comment-14014447 ]

Hadoop QA commented on YARN-1913:

{color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12647733/YARN-1913.patch against trunk revision .

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 4 new or modified test files.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 javadoc{color}. There were no new javadoc warning messages.
{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.
{color:green}+1 contrib tests{color}. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-YARN-Build/3872//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3872//console

This message is automatically generated.
[jira] [Commented] (YARN-1913) With Fair Scheduler, cluster can logjam when all resources are consumed by AMs
[ https://issues.apache.org/jira/browse/YARN-1913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14014422#comment-14014422 ]

Wei Yan commented on YARN-1913:

Uploaded a new patch to address Sandy's comments.

[~ashwinshankar77], if the leaf queue is not configured, the default AM resource limit is (leaf_queue_fair_share * 1.0f), so AM usage is still limited by the queue's fair share.
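The default described in Wei's comment can be illustrated with a minimal sketch. This is not code from the attached patch: the class name, method name, and the use of memory in MB as the only resource are assumptions made for illustration. Only the default factor of 1.0f and the "fraction of the leaf queue's fair share" rule come from the thread.

```java
// Hypothetical illustration, not the YARN-1913 patch itself.
final class DefaultAmLimit {
    // Default factor taken from the comment above: an unconfigured leaf
    // queue may use its whole fair share for AMs.
    static final float DEFAULT_MAX_AM_SHARE = 1.0f;

    /**
     * Returns the AM resource limit in MB for a leaf queue.
     * configuredMaxAMShare is null when the queue has no explicit setting,
     * for example a user queue created dynamically under a parent.
     */
    static long amLimitMb(long leafQueueFairShareMb, Float configuredMaxAMShare) {
        float share = (configuredMaxAMShare != null)
                ? configuredMaxAMShare
                : DEFAULT_MAX_AM_SHARE;
        return (long) (leafQueueFairShareMb * (double) share);
    }
}
```

With these assumptions, an unconfigured queue whose fair share is 8192 MB gets an AM limit of 8192 MB, while the same queue configured with 0.25f gets 2048 MB.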
[jira] [Commented] (YARN-1913) With Fair Scheduler, cluster can logjam when all resources are consumed by AMs
[ https://issues.apache.org/jira/browse/YARN-1913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14014357#comment-14014357 ]

Ashwin Shankar commented on YARN-1913:

Hey [~sandyr], quick comment.

bq. I think it might make sense to only allow the queue-level maxAMShare on leaf queues for the moment. I can't think of a strong reason somebody would want to set it on a parent queue

For the NestedUserQueue rule, user queues are created dynamically under a parent. For this use case, maxAMShare at the parent would be useful, since leaf user queues are not configured in the alloc xml. I see your point that it would complicate the logic in MaxRunningAppsEnforcer, but I wanted to bring this up in case you hadn't considered this use case.
[jira] [Commented] (YARN-1913) With Fair Scheduler, cluster can logjam when all resources are consumed by AMs
[ https://issues.apache.org/jira/browse/YARN-1913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14014308#comment-14014308 ]

Wei Yan commented on YARN-1913:

Thanks, Sandy. Will update a patch.
[jira] [Commented] (YARN-1913) With Fair Scheduler, cluster can logjam when all resources are consumed by AMs
[ https://issues.apache.org/jira/browse/YARN-1913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14014304#comment-14014304 ]

Sandy Ryza commented on YARN-1913:

Thanks for the updated patch, Wei.

For queues, maxAMShare should be defined as a fraction of the queue's fair share, not its maxShare, because the majority of queues are configured with infinite maxResources. We need to be careful with this, as fair shares can change when queues are created dynamically.

I think it might make sense to only allow the queue-level maxAMShare on leaf queues for the moment. I can't think of a strong reason somebody would want to set it on a parent queue, and restricting it to leaf queues would let us avoid the complex logic in MaxRunningAppsEnforcer and simply enforce the AM max share with a check in AppSchedulable.assignContainer. This is also what the Capacity Scheduler has at the moment.
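The check Sandy proposes, a limit derived from the leaf queue's fair share and tested at the point where an AM container would be assigned, might look roughly like the sketch below. It is a standalone illustration under stated assumptions, not the patch: the class and method names are hypothetical, resources are reduced to memory in MB, and the real scheduler works with Resource objects inside AppSchedulable.assignContainer rather than plain longs.

```java
// Hypothetical illustration, not the YARN-1913 patch itself.
final class AmShareCheck {
    /**
     * Decides whether a new AM container may be assigned in a leaf queue.
     *
     * fairShareMb  current fair share of the leaf queue (can change over time)
     * maxAMShare   fraction of the fair share that AMs may occupy
     * amUsageMb    resources already held by AM containers in this queue
     * amRequestMb  size of the AM container being considered
     */
    static boolean canRunAppAM(long fairShareMb, double maxAMShare,
                               long amUsageMb, long amRequestMb) {
        // The limit is recomputed on every call, so a fair share that
        // shrinks after dynamic queue creation is picked up automatically.
        long limitMb = (long) (fairShareMb * maxAMShare);
        return amUsageMb + amRequestMb <= limitMb;
    }

    public static void main(String[] args) {
        // Fair share 10240 MB and maxAMShare 0.5 give an AM limit of 5120 MB.
        System.out.println(canRunAppAM(10240, 0.5, 4096, 1024)); // true
        System.out.println(canRunAppAM(10240, 0.5, 4096, 2048)); // false
    }
}
```

Deriving the limit from maxShare instead would make the check meaningless for the common case of queues with infinite maxResources, which is the concern raised above.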
[jira] [Commented] (YARN-1913) With Fair Scheduler, cluster can logjam when all resources are consumed by AMs
[ https://issues.apache.org/jira/browse/YARN-1913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14013171#comment-14013171 ]

Hadoop QA commented on YARN-1913:

{color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12647487/YARN-1913.patch against trunk revision .

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 4 new or modified test files.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 javadoc{color}. There were no new javadoc warning messages.
{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.
{color:green}+1 contrib tests{color}. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-YARN-Build/3860//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3860//console

This message is automatically generated.
[jira] [Commented] (YARN-1913) With Fair Scheduler, cluster can logjam when all resources are consumed by AMs
[ https://issues.apache.org/jira/browse/YARN-1913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14011587#comment-14011587 ]

Sandy Ryza commented on YARN-1913:

I think we should avoid approximating AM resource usage through the minimum allocation. We need to handle situations where AM resources are much larger than the minimum, and situations where the minimum allocation is 0 (common on Llama-enabled clusters). Avoiding the approximation would have the added benefit of not touching the "runnability" machinery, which is already bordering on over-complicated.
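A small worked comparison shows why the approximation criticized above breaks down. The sketch assumes the approximation in question is "number of running AMs times the minimum allocation", which is an interpretation of the comment rather than something quoted from the patch. All names are hypothetical and resources are reduced to memory in MB.

```java
// Hypothetical illustration of the two ways to account for AM usage.
final class AmUsageEstimate {
    /** Approximation: assumes every AM occupies exactly the minimum allocation. */
    static long approxAmUsageMb(int runningAms, long minAllocationMb) {
        return runningAms * minAllocationMb;
    }

    /** Exact accounting: sums the sizes of the AM containers actually running. */
    static long exactAmUsageMb(long[] amContainerSizesMb) {
        long total = 0;
        for (long sizeMb : amContainerSizesMb) {
            total += sizeMb;
        }
        return total;
    }
}
```

For three AMs of 4096, 4096, and 2048 MB, the exact usage is 10240 MB. With a minimum allocation of 1024 MB the approximation reports 3072 MB, and with a minimum allocation of 0 it reports 0 MB, so a limit based on it would never trigger.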
[jira] [Commented] (YARN-1913) With Fair Scheduler, cluster can logjam when all resources are consumed by AMs
[ https://issues.apache.org/jira/browse/YARN-1913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14010659#comment-14010659 ]

Hadoop QA commented on YARN-1913:

{color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12647029/YARN-1913.patch against trunk revision .

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 3 new or modified test files.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 javadoc{color}. There were no new javadoc warning messages.
{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.
{color:green}+1 contrib tests{color}. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-YARN-Build/3842//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3842//console

This message is automatically generated.
> With Fair Scheduler, cluster can logjam when all resources are consumed by AMs > -- > > Key: YARN-1913 > URL: https://issues.apache.org/jira/browse/YARN-1913 > Project: Hadoop YARN > Issue Type: Bug > Components: scheduler >Affects Versions: 2.3.0 >Reporter: bc Wong >Assignee: Wei Yan > Labels: easyfix > Attachments: YARN-1913.patch, YARN-1913.patch, YARN-1913.patch, > YARN-1913.patch > > > It's possible to deadlock a cluster by submitting many applications at once, > and have all cluster resources taken up by AMs. > One solution is for the scheduler to limit resources taken up by AMs, as a > percentage of total cluster resources, via a "maxApplicationMasterShare" > config. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1913) With Fair Scheduler, cluster can logjam when all resources are consumed by AMs
[ https://issues.apache.org/jira/browse/YARN-1913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14007958#comment-14007958 ]

Hadoop QA commented on YARN-1913:

{color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12646641/YARN-1913.patch against trunk revision .

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 3 new or modified test files.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 javadoc{color}. There were no new javadoc warning messages.
{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairScheduler
{color:green}+1 contrib tests{color}. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-YARN-Build/3822//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3822//console

This message is automatically generated.
[jira] [Commented] (YARN-1913) With Fair Scheduler, cluster can logjam when all resources are consumed by AMs
[ https://issues.apache.org/jira/browse/YARN-1913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14007905#comment-14007905 ]

Hadoop QA commented on YARN-1913:

{color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12646404/YARN-1913.patch against trunk revision .

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 3 new or modified test files.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 javadoc{color}. There were no new javadoc warning messages.
{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairScheduler
{color:green}+1 contrib tests{color}. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-YARN-Build/3816//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3816//console

This message is automatically generated.
[jira] [Commented] (YARN-1913) With Fair Scheduler, cluster can logjam when all resources are consumed by AMs
[ https://issues.apache.org/jira/browse/YARN-1913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13963164#comment-13963164 ] Sandy Ryza commented on YARN-1913: -- This is a Fair Scheduler issue. We need to add an equivalent property to the Fair Scheduler. > With Fair Scheduler, cluster can logjam when all resources are consumed by AMs > -- > > Key: YARN-1913 > URL: https://issues.apache.org/jira/browse/YARN-1913 > Project: Hadoop YARN > Issue Type: Bug > Components: scheduler >Affects Versions: 2.3.0 >Reporter: bc Wong > > It's possible to deadlock a cluster by submitting many applications at once, > and have all cluster resources taken up by AMs. > One solution is for the scheduler to limit resources taken up by AMs, as a > percentage of total cluster resources, via a "maxApplicationMasterShare" > config. -- This message was sent by Atlassian JIRA (v6.2#6252)