[jira] [Commented] (YARN-3730) scheduler reserve more resource than required
[ https://issues.apache.org/jira/browse/YARN-3730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14566919#comment-14566919 ]

gu-chi commented on YARN-3730:
------------------------------

Thanks Naga. The improvements are not merged into the version I am currently using, so this feature is not invoked. I will set yarn.scheduler.capacity.reservations-continue-look-all-nodes to false on version 2.7.0 and check the outcome.

scheduler reserve more resource than required
---------------------------------------------

Key: YARN-3730
URL: https://issues.apache.org/jira/browse/YARN-3730
Project: Hadoop YARN
Issue Type: Bug
Components: scheduler
Reporter: gu-chi

Using the capacity scheduler in an environment of 3 NMs with 9 vcores each, I ran a Spark task with 4 executors of 5 cores each. As expected, one executor cannot start and gets reserved, but in practice more containers are reserved than that. Because of this, I cannot run other, smaller tasks.
Checking the capacity scheduler, the 'needContainers' method in LeafQueue.java has a computation of 'starvation', which causes more containers to be reserved than required. Any idea or suggestion on this?

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
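For readers following the numbers in the report, here is a minimal sketch of the packing arithmetic. It assumes only vcores matter and ignores memory and the application master container. The function name place_executors and the first-fit placement are illustrative only, not how the capacity scheduler actually assigns containers.

```python
def place_executors(node_vcores, executor_vcores, num_executors):
    """First-fit placement of identical executors onto nodes (illustrative only).

    Returns (placed, pending, free_vcores_per_node).
    """
    free = list(node_vcores)
    placed = 0
    for _ in range(num_executors):
        for i, available in enumerate(free):
            if available >= executor_vcores:
                free[i] -= executor_vcores
                placed += 1
                break
    return placed, num_executors - placed, free


# 3 NMs with 9 vcores each, 4 executors with 5 cores each (numbers from the report)
placed, pending, free = place_executors([9, 9, 9], 5, 4)
print(placed, pending, free)
```

Under these assumptions three executors start and one stays pending. Each node keeps 4 free vcores, so 12 vcores are free in total but no single node can hold a 5-core executor, which is why a single reservation is the expected outcome.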
[jira] [Commented] (YARN-3730) scheduler reserve more resource than required
[ https://issues.apache.org/jira/browse/YARN-3730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14566130#comment-14566130 ]

Naganarasimha G R commented on YARN-3730:
-----------------------------------------

Hi [~gu chi],
In which version did you find this problem? If it is below 2.6.0, please test with the latest release, as there have been some improvements with respect to reservation in YARN-1769. If it is 2.6.0 or above, please share some RM logs with debug enabled so that we can do further analysis.

scheduler reserve more resource than required
---------------------------------------------

Key: YARN-3730
URL: https://issues.apache.org/jira/browse/YARN-3730
Project: Hadoop YARN
Issue Type: Bug
Components: scheduler
Reporter: gu-chi

Using the capacity scheduler in an environment of 3 NMs with 9 vcores each, I ran a Spark task with 4 executors of 5 cores each. As expected, one executor cannot start and gets reserved, but in practice more containers are reserved than that. Because of this, I cannot run other, smaller tasks.
Checking the capacity scheduler, the 'needContainers' method in LeafQueue.java has a computation of 'starvation', which causes more containers to be reserved than required. Any idea or suggestion on this?
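To illustrate how a 'starvation' term could let reservations exceed the outstanding request, here is a rough Python paraphrase of the kind of check the report describes. It is loosely modeled on the shape of the check in Hadoop 2.x LeafQueue.java as I understand it. The parameter names are hypothetical and the real method may differ between versions, so treat it as a sketch rather than the actual implementation.

```python
def need_containers(required_containers, reserved_containers,
                    re_reservations, node_factor, min_allocation_factor):
    """Sketch of a starvation-biased 'do we still need containers?' check.

    All parameter names are assumptions made for illustration:
      node_factor           - fraction of the maximum allocation one container needs
      min_allocation_factor - cap that guards the whole-node corner case
      re_reservations       - how often an existing reservation was re-reserved
    """
    starvation = 0
    if reserved_containers > 0:
        starvation = int(
            (re_reservations / reserved_containers)
            * (1.0 - min(node_factor, min_allocation_factor))
        )
    return (starvation + required_containers) - reserved_containers > 0


# One container outstanding and one already reserved, no re-reservations yet
print(need_containers(1, 1, 0, 0.5, 0.9))
# Same request after the reservation has been re-reserved 10 times
print(need_containers(1, 1, 10, 0.5, 0.9))
```

In this sketch the first call reports no further need, while the second reports that more containers are needed even though the single outstanding request is already covered by a reservation. That is consistent with the behaviour gu-chi describes, though only RM debug logs would confirm that this is what happens in the reported cluster.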