[jira] [Updated] (YARN-8804) resourceLimits may be wrongly calculated when leaf-queue is blocked in cluster with 3+ level queues

2018-09-21 Thread Tao Yang (JIRA)


 [ https://issues.apache.org/jira/browse/YARN-8804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tao Yang updated YARN-8804:
---
Attachment: YARN-8804.003.patch

> resourceLimits may be wrongly calculated when leaf-queue is blocked in 
> cluster with 3+ level queues
> ---
>
> Key: YARN-8804
> URL: https://issues.apache.org/jira/browse/YARN-8804
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacityscheduler
>Affects Versions: 3.2.0
>Reporter: Tao Yang
>Assignee: Tao Yang
>Priority: Critical
> Attachments: YARN-8804.001.patch, YARN-8804.002.patch, 
> YARN-8804.003.patch
>
>
> This problem was introduced by YARN-4280: a parent queue deducts a child 
> queue's headroom when that child has reached its resource limit and the 
> skipped type is QUEUE_LIMIT. The resource limits of the deepest parent queues 
> are calculated correctly, but a non-deepest parent queue's headroom may be 
> much larger than the sum of the headroom of its reached-limit child queues, 
> so the resource limit of a non-deepest parent can end up much smaller than 
> its true value and block allocation for the queues scheduled after it.
> To reproduce this problem in a unit test:
>  (1) The cluster has two nodes, each with <10GB, 10core>, and the 3-level 
> queue hierarchy below. The max-capacity of "c1" is 10 and that of every other 
> queue is 100, so the effective max-capacity of queue "c1" is <2GB, 2core>.
> {noformat}
>          Root
>         /  |  \
>        a   b   c
>       10  20  70
>               |  \
>              c1   c2
>      10(max=10)   90
> {noformat}
> (2) Submit app1 to queue "c1" and launch am1 (resource=<1GB, 1core>) on nm1.
>  (3) Submit app2 to queue "b" and launch am2 (resource=<1GB, 1core>) on nm1.
>  (4) app1 and app2 each request one <2GB, 1core> container.
>  (5) nm1 sends one heartbeat.
>  Because queue "c" now has a lower used-capacity percentage than queue "b", 
> the allocation sequence is "a" -> "c" -> "b".
>  Queue "c1" has reached its queue limit, so the requests of app1 stay pending.
>  The headroom of queue "c1" is <1GB, 1core> (= max-capacity - used).
>  The headroom of queue "c" is <18GB, 18core> (= max-capacity - used).
>  After the allocation pass for queue "c", the resource limit of queue "b" is 
> wrongly calculated as <2GB, 2core>,
>  so the headroom of queue "b" becomes <1GB, 1core> (= resource-limit - used)
>  and the scheduler will not allocate a container for app2 on nm1.
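To make the final numbers above easier to check, here is a tiny standalone sketch of the arithmetic, using only the figures quoted in the description (cluster total <20GB, 20core>; the <18GB, 18core> headroom of queue "c" is taken from the text as-is). The class name is invented; this is not YARN code.

{code:java}
// Standalone check of the walkthrough arithmetic (memory only, in GB).
// Invented class name; not YARN code.
public class Yarn8804WalkthroughCheck {
  public static void main(String[] args) {
    int rootLimit   = 2 * 10;            // two nodes with 10GB each
    int headroomOfC = 18;                // as stated in the description
    int usedByB     = 1;                 // am2 of app2

    // The limit handed to queue "b" after queue "c" is skipped: the parent's
    // limit minus the headroom deducted for the blocked subtree "c".
    int limitForB   = rootLimit - headroomOfC;   // 20 - 18 = 2GB (the wrong value)
    int headroomOfB = limitForB - usedByB;       // 2 - 1 = 1GB

    int app2Request = 2;                         // one <2GB, 1core> container
    System.out.println("limit(b)    = " + limitForB + "GB");
    System.out.println("headroom(b) = " + headroomOfB + "GB");
    System.out.println("app2 fits   = " + (app2Request <= headroomOfB));  // false
  }
}
{code}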





[jira] [Updated] (YARN-8804) resourceLimits may be wrongly calculated when leaf-queue is blocked in cluster with 3+ level queues

2018-09-21 Thread Jason Lowe (JIRA)


 [ https://issues.apache.org/jira/browse/YARN-8804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jason Lowe updated YARN-8804:
-
Target Version/s: 2.10.0, 3.2.0, 2.9.2, 3.0.4, 3.1.2, 2.8.6

Thanks for updating the patch!  This is a good performance improvement.  
However I still think having the allocation directly track the amount relevant 
to an allocation blocked by queue limits would be cleaner.  It would remove the 
need to do RTTI on child queues.

But that's a much bigger change, and I'm OK with this approach for now.

Patch does not apply to trunk and needs to be rebased.  After doing so, please 
move the JIRA to Patch Available so Jenkins can comment on it.
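For readers following along, the comparison above is between (a) deciding what to deduct by checking a child queue's runtime type, which is the "RTTI on child queues" mentioned, and (b) letting the allocation result itself record how much was blocked by queue limits. The sketch below is only a hypothetical illustration of that difference; the types and method names are stand-ins, not the actual YARN classes or the patch's API.

{code:java}
// Hypothetical stand-in types -- NOT the real YARN classes or the patch's API.
interface QueueNode { long headroomMb(); }

class LeafNode implements QueueNode {
  private final long headroomMb;
  LeafNode(long headroomMb) { this.headroomMb = headroomMb; }
  public long headroomMb() { return headroomMb; }
}

class ParentNode implements QueueNode {
  private final long headroomMb;   // the parent's own headroom (can be large)
  ParentNode(long headroomMb) { this.headroomMb = headroomMb; }
  public long headroomMb() { return headroomMb; }
}

// Result of one attempted allocation under a child queue.
class AssignmentResult {
  final QueueNode queue;
  final boolean skippedByQueueLimit;
  final long blockedAmountMb;      // option (b): recorded by the allocation itself

  AssignmentResult(QueueNode queue, boolean skippedByQueueLimit, long blockedAmountMb) {
    this.queue = queue;
    this.skippedByQueueLimit = skippedByQueueLimit;
    this.blockedAmountMb = blockedAmountMb;
  }
}

class DeductionStyles {

  // Option (a): inspect the child's runtime type to decide what to deduct.
  // A parent-type child needs extra bookkeeping so that only the headroom of
  // its blocked leaf descendants is deducted; deducting the parent's own
  // headroom over-deducts, which is the bug reported in this JIRA.
  static long deductionWithTypeCheck(AssignmentResult r) {
    if (!r.skippedByQueueLimit) {
      return 0;
    }
    if (r.queue instanceof LeafNode) {
      return r.queue.headroomMb();
    }
    return 0;  // parent-type child: recursive bookkeeping omitted in this sketch
  }

  // Option (b), the suggestion above: the allocation already carries the
  // blocked amount, so no type check is needed at this level.
  static long deductionFromAssignment(AssignmentResult r) {
    return r.skippedByQueueLimit ? r.blockedAmountMb : 0;
  }
}
{code}

Option (b) keeps the parent-queue code agnostic of what kind of child it is iterating over, which is presumably why it is described as cleaner above, at the cost of a larger change to the allocation path.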







[jira] [Updated] (YARN-8804) resourceLimits may be wrongly calculated when leaf-queue is blocked in cluster with 3+ level queues

2018-09-20 Thread Tao Yang (JIRA)


 [ https://issues.apache.org/jira/browse/YARN-8804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tao Yang updated YARN-8804:
---
Attachment: YARN-8804.002.patch






[jira] [Updated] (YARN-8804) resourceLimits may be wrongly calculated when leaf-queue is blocked in cluster with 3+ level queues

2018-09-20 Thread Tao Yang (JIRA)


 [ https://issues.apache.org/jira/browse/YARN-8804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tao Yang updated YARN-8804:
---
Attachment: YARN-8804.001.patch






[jira] [Updated] (YARN-8804) resourceLimits may be wrongly calculated when leaf-queue is blocked in cluster with 3+ level queues

2018-09-20 Thread Tao Yang (JIRA)


 [ https://issues.apache.org/jira/browse/YARN-8804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tao Yang updated YARN-8804:
---
Description: 
This problem was introduced by YARN-4280: a parent queue deducts a child 
queue's headroom when that child has reached its resource limit and the 
skipped type is QUEUE_LIMIT. The resource limits of the deepest parent queues 
are calculated correctly, but a non-deepest parent queue's headroom may be 
much larger than the sum of the headroom of its reached-limit child queues, 
so the resource limit of a non-deepest parent can end up much smaller than 
its true value and block allocation for the queues scheduled after it.
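A minimal sketch of the over-deduction, using invented names and the figures from this report (not YARN code): the value deducted for a blocked subtree should be the headroom of the leaves that are actually blocked, but for a non-deepest parent the deducted value ends up being the parent's own, much larger headroom.

{code:java}
// Minimal illustration of the over-deduction (memory only, in MB).
// Invented names and simplified logic; not YARN code.
public class OverDeductionSketch {

  // What the YARN-4280-style logic deducts for a blocked child queue: the
  // child's own headroom. Fine for a leaf, far too large when the child is
  // itself a parent whose blocked leaves have much smaller headroom.
  static long deductChildOwnHeadroom(long childHeadroomMb) {
    return childHeadroomMb;
  }

  // What should roughly be deducted for a blocked non-leaf child: only the
  // headroom of the leaf queues that are actually blocked underneath it.
  static long deductBlockedLeafHeadroom(long... blockedLeafHeadroomsMb) {
    long sum = 0;
    for (long h : blockedLeafHeadroomsMb) {
      sum += h;
    }
    return sum;
  }

  public static void main(String[] args) {
    long rootLimit    = 20 * 1024;   // two nodes with 10GB each
    long headroomOfC  = 18 * 1024;   // queue "c" itself, per the walkthrough below
    long headroomOfC1 = 1 * 1024;    // the only blocked leaf under "c"

    long limitForBWrong    = rootLimit - deductChildOwnHeadroom(headroomOfC);      // 2GB
    long limitForBExpected = rootLimit - deductBlockedLeafHeadroom(headroomOfC1);  // 19GB

    System.out.println("limit for queue b, over-deducted = " + limitForBWrong / 1024 + "GB");
    System.out.println("limit for queue b, expected      = " + limitForBExpected / 1024 + "GB");
  }
}
{code}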

To reproduce this problem in a unit test:
 (1) The cluster has two nodes, each with <10GB, 10core>, and the 3-level 
queue hierarchy below. The max-capacity of "c1" is 10 and that of every other 
queue is 100, so the effective max-capacity of queue "c1" is <2GB, 2core> (a 
configuration sketch follows the walkthrough below).
{noformat}
         Root
        /  |  \
       a   b   c
      10  20  70
              |  \
             c1   c2
     10(max=10)   90
{noformat}
(2) Submit app1 to queue "c1" and launch am1 (resource=<1GB, 1core>) on nm1.
 (3) Submit app2 to queue "b" and launch am2 (resource=<1GB, 1core>) on nm1.
 (4) app1 and app2 each request one <2GB, 1core> container.
 (5) nm1 sends one heartbeat.
 Because queue "c" now has a lower used-capacity percentage than queue "b", 
the allocation sequence is "a" -> "c" -> "b".
 Queue "c1" has reached its queue limit, so the requests of app1 stay pending.
 The headroom of queue "c1" is <1GB, 1core> (= max-capacity - used).
 The headroom of queue "c" is <18GB, 18core> (= max-capacity - used).
 After the allocation pass for queue "c", the resource limit of queue "b" is 
wrongly calculated as <2GB, 2core>,
 so the headroom of queue "b" becomes <1GB, 1core> (= resource-limit - used)
 and the scheduler will not allocate a container for app2 on nm1.
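As a rough sketch of how the queue hierarchy from step (1) could be set up for such a unit test, the snippet below uses the CapacitySchedulerConfiguration helpers commonly seen in the capacity-scheduler tests; this is an assumption for illustration, not the actual test from the attached patches. Maximum-capacity is left at its default of 100 for every queue except "c1".

{code:java}
// Sketch only: queue hierarchy from step (1), configured the way the
// capacity-scheduler unit tests usually do it. Not the test from the patch.
import org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacitySchedulerConfiguration;

public class Yarn8804QueueSetup {

  static CapacitySchedulerConfiguration newQueueConfig() {
    CapacitySchedulerConfiguration conf = new CapacitySchedulerConfiguration();
    String root = CapacitySchedulerConfiguration.ROOT;   // "root"

    conf.setQueues(root, new String[] {"a", "b", "c"});
    conf.setCapacity(root + ".a", 10);
    conf.setCapacity(root + ".b", 20);
    conf.setCapacity(root + ".c", 70);

    conf.setQueues(root + ".c", new String[] {"c1", "c2"});
    conf.setCapacity(root + ".c.c1", 10);
    conf.setMaximumCapacity(root + ".c.c1", 10);   // 10% of c's 20GB max => <2GB, 2core>
    conf.setCapacity(root + ".c.c2", 90);
    return conf;
  }
}
{code}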

  was:
This problem is due to YARN-4280, parent queue will deduct child queue's 
headroom when the child queue reached its resource limit and the skipped type 
is QUEUE_LIMIT, the resource limits of deepest parent queue will be correctly 
calculated, but for non-deepest parent queue, its headroom may be much more 
than the sum of reached-limit child queues' headroom, so that the resource 
limit of non-deepest parent may be much less than its true value and block the 
allocation for later queues.

To reproduce this problem with UT:
(1) Cluster has two nodes whose node resource both are <10GB, 10core> and 
3-level queues as below, among them max-capacity of "c1" is 10 and others are 
all 100, so that max-capacity of queue "c1" is <2GB, 2core>
  Root
 /  |\
a   bc
   10   20  70
 |   \
   c1   c2
10(max=10)  90
(2) Submit app1 to queue "c1" and launch am1(resource=<1GB, 1 core>) on nm1
(3) Submit app2 to queue "b" and launch am2(resource=<1GB, 1 core>) on nm1
(4) app1 and app2 both ask one <2GB, 1core> containers. 
(5) nm1 do 1 heartbeat
Now queue "c" has lower capacity percentage than queue "b", the allocation 
sequence will be "a" -> "c" -> "b",
queue "c1" has reached queue limit so that requests of app1 should be pending, 
headroom of queue "c1" is <1GB, 1core> (=max-capacity - used), 
headroom of queue "c" is <18GB, 18core> (=max-capacity - used), 
after allocation for queue "c", resource limit of queue "b" will be wrongly 
calculated as <2GB, 2core>,
headroom of queue "b" will be <1GB, 1core> (=resource-limit - used)
so that scheduler won't allocate one container for app2 on nm1


