[jira] [Updated] (YARN-10839) queueMaxAppsDefault when set blindly caps the root queue's maxRunningApps setting to this value ignoring any individually overridden maxRunningApps setting for child queue
     [ https://issues.apache.org/jira/browse/YARN-10839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Siddharth Ahuja updated YARN-10839:
-----------------------------------
    Labels: scheduler  (was: )

> queueMaxAppsDefault when set blindly caps the root queue's maxRunningApps setting to this value ignoring any individually overridden maxRunningApps setting for child queues in FairScheduler
> ------------------------------------------------------------
>
>                 Key: YARN-10839
>                 URL: https://issues.apache.org/jira/browse/YARN-10839
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: yarn
>    Affects Versions: 2.7.5, 3.3.1
>            Reporter: Siddharth Ahuja
>            Assignee: Siddharth Ahuja
>            Priority: Major
>              Labels: scheduler
>
> [queueMaxAppsDefault|https://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/FairScheduler.html#Allocation_file_format]
> sets the default running app limit for queues (including the root queue),
> which individual child queues can override through the maxRunningApps
> setting.
> Consider a simple FairScheduler XML as follows:
> {code}
> <allocations>
>     <queue name="root">
>         <weight>1.0</weight>
>         <schedulingPolicy>drf</schedulingPolicy>
>         <aclSubmitApps>*</aclSubmitApps>
>         <aclAdministerApps>*</aclAdministerApps>
>         <queue name="...">
>             <weight>1.0</weight>
>             <schedulingPolicy>drf</schedulingPolicy>
>         </queue>
>         <queue name="A">
>             <maxResources>1024000 mb, 1000 vcores</maxResources>
>             <maxRunningApps>15</maxRunningApps>
>             <weight>2.0</weight>
>             <schedulingPolicy>drf</schedulingPolicy>
>         </queue>
>         <queue name="B">
>             <maxResources>512000 mb, 500 vcores</maxResources>
>             <maxRunningApps>10</maxRunningApps>
>             <weight>1.0</weight>
>             <schedulingPolicy>drf</schedulingPolicy>
>         </queue>
>     </queue>
>     <queueMaxAppsDefault>3</queueMaxAppsDefault>
>     <defaultQueueSchedulingPolicy>drf</defaultQueueSchedulingPolicy>
>     ...
> </allocations>
> {code}
> Here:
> * {{queueMaxAppsDefault}} is set to 3, so any queue without its own {{maxRunningApps}} defaults to a limit of 3 running apps.
> * The root queue has no {{maxRunningApps}} limit of its own.
> * {{maxRunningApps}} is 15 for root.A and 10 for root.B.
> Given the above, if users submit jobs to root.B, they are (incorrectly)
> capped at 3 running apps, not 10, because the root queue (the parent) is
> itself capped at 3 by the queueMaxAppsDefault setting.
> As a result, users see their apps stuck in the ACCEPTED state.
> Either the above FairScheduler XML should have been rejected by the
> ResourceManager, or the root queue should have been capped to the maximum
> maxRunningApps setting defined for a leaf queue.
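The capping described above can be reproduced with a small model. This is only a sketch under stated assumptions, not FairScheduler's actual code: the `Queue` class, `can_run` and `start_app` are hypothetical names, and the model assumes an app may run only if its queue and every ancestor queue are below their running-app limits, which is consistent with the behaviour the report describes.

```python
from dataclasses import dataclass
from typing import Optional

QUEUE_MAX_APPS_DEFAULT = 3  # queueMaxAppsDefault from the allocation file


@dataclass
class Queue:
    """Hypothetical stand-in for a scheduler queue (not a Hadoop class)."""
    name: str
    parent: Optional["Queue"] = None
    max_running_apps: Optional[int] = None  # None means "not set in the XML"
    running_apps: int = 0

    def effective_limit(self) -> int:
        # A queue without its own maxRunningApps falls back to the default.
        if self.max_running_apps is not None:
            return self.max_running_apps
        return QUEUE_MAX_APPS_DEFAULT


def can_run(queue: Queue) -> bool:
    """Assumed rule: the queue and all of its ancestors must be below their limits."""
    node = queue
    while node is not None:
        if node.running_apps >= node.effective_limit():
            return False
        node = node.parent
    return True


def start_app(queue: Queue) -> bool:
    """Try to start one app; on success, count it in the queue and every ancestor."""
    if not can_run(queue):
        return False  # in the report, such apps stay in ACCEPTED
    node = queue
    while node is not None:
        node.running_apps += 1
        node = node.parent
    return True


root = Queue("root")  # no maxRunningApps set, so it inherits the default of 3
queue_b = Queue("root.B", parent=root, max_running_apps=10)

# Submit 10 apps to root.B, which its own limit would allow.
started = sum(start_app(queue_b) for _ in range(10))
print(started)  # 3
```

In this model root.B never gets near its own limit of 10: the root queue silently inherits the default of 3 and becomes the bottleneck for every child queue.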
> Possible solution: if the root queue has no maxRunningApps set and
> queueMaxAppsDefault is lower than the maxRunningApps of an individual
> leaf queue, then the root queue should implicitly be capped to the
> latter instead of queueMaxAppsDefault.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
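The possible solution proposed in the message above can be sketched as a single function. This is an illustration only, not a patch: `implicit_root_limit` and its parameters are hypothetical names, and it follows the issue's wording that the implicit root cap should be the maximum maxRunningApps defined for a leaf queue.

```python
from typing import Dict, Optional


def implicit_root_limit(
    root_max_running_apps: Optional[int],
    queue_max_apps_default: int,
    leaf_max_running_apps: Dict[str, int],
) -> int:
    """Return the root queue's running-app cap under the proposed rule.

    root_max_running_apps is None when the root queue sets no maxRunningApps.
    leaf_max_running_apps maps leaf queue names to their explicit limits.
    """
    if root_max_running_apps is not None:
        # An explicit limit on the root queue is left untouched.
        return root_max_running_apps
    # Largest explicit leaf limit, or the default if no leaf sets one.
    largest_leaf = max(leaf_max_running_apps.values(), default=queue_max_apps_default)
    # Only raise the cap above the default, never lower it.
    return max(queue_max_apps_default, largest_leaf)


limit = implicit_root_limit(None, 3, {"root.A": 15, "root.B": 10})
print(limit)  # 15
```

With the values from the report, the root queue would be capped at 15 instead of 3, so root.A and root.B could each reach their own limits as far as the root cap is concerned.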
[jira] [Updated] (YARN-10839) queueMaxAppsDefault when set blindly caps the root queue's maxRunningApps setting to this value ignoring any individually overridden maxRunningApps setting for child queue
     [ https://issues.apache.org/jira/browse/YARN-10839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Siddharth Ahuja updated YARN-10839:
-----------------------------------
    Affects Version/s: 2.7.5
                       3.3.1
[jira] [Updated] (YARN-10839) queueMaxAppsDefault when set blindly caps the root queue's maxRunningApps setting to this value ignoring any individually overridden maxRunningApps setting for child queue
     [ https://issues.apache.org/jira/browse/YARN-10839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Siddharth Ahuja updated YARN-10839:
-----------------------------------
    Component/s: yarn