[jira] [Updated] (YARN-10839) queueMaxAppsDefault when set blindly caps the root queue's maxRunningApps setting to this value ignoring any individually overridden maxRunningApps setting for child queue

2021-06-30 Thread Siddharth Ahuja (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Ahuja updated YARN-10839:
---
Labels: scheduler  (was: )

> queueMaxAppsDefault, when set, blindly caps the root queue's maxRunningApps 
> setting to this value, ignoring any individually overridden maxRunningApps 
> setting for child queues in FairScheduler
> 
>
> Key: YARN-10839
> URL: https://issues.apache.org/jira/browse/YARN-10839
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Affects Versions: 2.7.5, 3.3.1
>Reporter: Siddharth Ahuja
>Assignee: Siddharth Ahuja
>Priority: Major
>  Labels: scheduler
>
> [queueMaxAppsDefault|https://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/FairScheduler.html#Allocation_file_format]
>  sets the default running app limit for queues (including the root queue), 
> which individual child queues can override through the maxRunningApps 
> setting.
> Consider a simple FairScheduler XML as follows:
> {code}
> 
> 
> 
> 1.0
> drf
> *
> *
> 
> 1.0
> drf
> 
> 
> 1024000 mb, 1000 vcores
> 15
> 2.0
> drf
> 
> 
> 512000 mb, 500 vcores
> 10
> 1.0
> drf
> 
> 
> 3
> drf
> 
> 
> 
> 
> 
> {code}
> Here:
> * {{queueMaxAppsDefault}} is set to 3, so any queue without its own {{maxRunningApps}} defaults to a limit of 3.
> * The root queue does not have any {{maxRunningApps}} limit set.
> * {{maxRunningApps}} for the child queues is 15 for root.A and 10 for root.B.
> From the above, if users want to submit jobs to root.B, they are (incorrectly) 
> capped to 3 running apps, not 10, because the root queue (the parent) is itself 
> capped to 3 by the queueMaxAppsDefault setting.
> As a result, users see their apps stuck in the ACCEPTED state.
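
The capping described above can be illustrated with a small model. This is a simplified sketch, not FairScheduler code: it assumes that an app may start only when its leaf queue and every ancestor queue are below their running-app limit, and that a queue without maxRunningApps falls back to queueMaxAppsDefault. The `Queue` class and `try_run` function are hypothetical names used only for illustration.

```python
from dataclasses import dataclass
from typing import Optional

QUEUE_MAX_APPS_DEFAULT = 3  # queueMaxAppsDefault from the allocation file


@dataclass
class Queue:
    name: str
    parent: Optional["Queue"] = None
    max_running_apps: Optional[int] = None  # None means "not set in the XML"
    running: int = 0

    def limit(self) -> int:
        # Assumption: an unset maxRunningApps falls back to queueMaxAppsDefault.
        if self.max_running_apps is None:
            return QUEUE_MAX_APPS_DEFAULT
        return self.max_running_apps


def try_run(leaf: Queue) -> bool:
    """Start one app in `leaf` if the leaf and every ancestor have headroom."""
    chain = []
    node = leaf
    while node is not None:
        chain.append(node)
        node = node.parent
    if any(q.running >= q.limit() for q in chain):
        return False  # the app stays pending (ACCEPTED in the report above)
    for q in chain:
        q.running += 1
    return True


root = Queue("root")  # no maxRunningApps set, so it inherits the default of 3
queue_b = Queue("root.B", parent=root, max_running_apps=10)

# Submit ten apps to root.B, which is nominally allowed to run ten.
results = [try_run(queue_b) for _ in range(10)]
print(sum(results))  # number of apps that actually started
```

In this model the root queue reaches its inherited limit first, so the leaf queue's own setting of 10 never comes into play.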
> Either the above FairScheduler XML should have been rejected by the 
> ResourceManager, or the root queue should have been capped to the maximum 
> maxRunningApps setting defined for a leaf queue.
> Possible solution: if the root queue has no maxRunningApps set and 
> queueMaxAppsDefault is lower than the maxRunningApps of an 
> individual leaf queue, then the root queue should implicitly be capped to 
> the latter instead of queueMaxAppsDefault.
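
The proposed rule could be expressed roughly as follows. This is only a sketch of the idea in the description, not a patch: `effective_root_cap` is a hypothetical helper, and it assumes that "the latter" means the largest maxRunningApps explicitly set on any leaf queue, as suggested by the preceding paragraph.

```python
from typing import Iterable, Optional


def effective_root_cap(root_max_running_apps: Optional[int],
                       queue_max_apps_default: int,
                       leaf_limits: Iterable[int]) -> int:
    """Hypothetical helper: running-app cap for the root queue under the proposal.

    root_max_running_apps is None when the root queue sets no maxRunningApps;
    leaf_limits holds the maxRunningApps values explicitly set on leaf queues.
    """
    if root_max_running_apps is not None:
        return root_max_running_apps  # an explicit root setting is kept as-is
    largest_leaf = max(leaf_limits, default=queue_max_apps_default)
    # Only ever raise the cap above queueMaxAppsDefault, never lower it.
    return max(queue_max_apps_default, largest_leaf)


# Values from the report: default of 3, root.A at 15, root.B at 10.
cap = effective_root_cap(None, 3, [15, 10])
print(cap)
```

With the values from the report, the root queue would be capped at the largest leaf limit rather than at queueMaxAppsDefault, so root.A and root.B could reach their own limits.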



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-10839) queueMaxAppsDefault when set blindly caps the root queue's maxRunningApps setting to this value ignoring any individually overridden maxRunningApps setting for child queue

2021-06-30 Thread Siddharth Ahuja (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Ahuja updated YARN-10839:
---
Affects Version/s: 2.7.5
   3.3.1




[jira] [Updated] (YARN-10839) queueMaxAppsDefault when set blindly caps the root queue's maxRunningApps setting to this value ignoring any individually overridden maxRunningApps setting for child queue

2021-06-30 Thread Siddharth Ahuja (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Ahuja updated YARN-10839:
---
Component/s: yarn
