[jira] [Updated] (YARN-10293) Reserved Containers not allocated from available space of other nodes in CandidateNodeSet in MultiNodePlacement (YARN-10259)

2020-07-22 Thread Prabhu Joseph (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prabhu Joseph updated YARN-10293:
-
Parent: YARN-5139
Issue Type: Sub-task  (was: Bug)

> Reserved Containers not allocated from available space of other nodes in 
> CandidateNodeSet in MultiNodePlacement (YARN-10259)
> 
>
> Key: YARN-10293
> URL: https://issues.apache.org/jira/browse/YARN-10293
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Affects Versions: 3.3.0
>Reporter: Prabhu Joseph
>Assignee: Prabhu Joseph
>Priority: Major
> Fix For: 3.4.0
>
> Attachments: YARN-10293-001.patch, YARN-10293-002.patch, 
> YARN-10293-003-WIP.patch, YARN-10293-004.patch, YARN-10293-005.patch
>
>
> Reserved Containers not allocated from available space of other nodes in 
> CandidateNodeSet in MultiNodePlacement. YARN-10259 has fixed two issues 
> related to it 
> https://issues.apache.org/jira/browse/YARN-10259?focusedCommentId=17105987=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17105987
> Have found one more bug in the CapacityScheduler.java code which causes the 
> same issue with slight difference in the repro.
> *Repro:*
> *Nodes :   Available : Used*
> Node1 -  8GB, 8vcores -  8GB. 8cores
> Node2 -  8GB, 8vcores - 8GB. 8cores
> Node3 -  8GB, 8vcores - 8GB. 8cores
> Queues -> A and B both 50% capacity, 100% max capacity
> MultiNode enabled + Preemption enabled
> 1. JobA submitted to A queue and which used full cluster 24GB and 24 vcores
> 2. JobB Submitted to B queue with AM size of 1GB
> {code}
> 2020-05-21 12:12:27,313 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=systest  
> IP=172.27.160.139   OPERATION=Submit Application Request
> TARGET=ClientRMService  RESULT=SUCCESS  APPID=application_1590046667304_0005  
>   CALLERCONTEXT=CLI   QUEUENAME=dummy
> {code}
> 3. Preemption happens and used capacity is lesser than 1.0f
> {code}
> 2020-05-21 12:12:48,222 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptMetrics:
>  Non-AM container preempted, current 
> appAttemptId=appattempt_1590046667304_0004_01, 
> containerId=container_e09_1590046667304_0004_01_24, 
> resource=
> {code}
> 4. JobB gets a Reserved Container as part of 
> CapacityScheduler#allocateOrReserveNewContainer
> {code}
> 2020-05-21 12:12:48,226 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: 
> container_e09_1590046667304_0005_01_01 Container Transitioned from NEW to 
> RESERVED
> 2020-05-21 12:12:48,226 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp:
>  Reserved container=container_e09_1590046667304_0005_01_01, on node=host: 
> tajmera-fullnodes-3.tajmera-fullnodes.root.hwx.site:8041 #containers=8 
> available= used= with 
> resource=
> {code}
> *Why RegularContainerAllocator reserved the container when the used capacity 
> is <= 1.0f ?*
> {code}
> The reason is even though the container is preempted - nodemanager has to 
> stop the container and heartbeat and update the available and unallocated 
> resources to ResourceManager.
> {code}
> 5. Now, no new allocation happens and reserved container stays at reserved.
> After reservation the used capacity becomes 1.0f, below will be in a loop and 
> no new allocate or reserve happens. The reserved container cannot be 
> allocated as reserved node does not have space. node2 has space for 1GB, 
> 1vcore but CapacityScheduler#allocateOrReserveNewContainers not getting 
> called causing the Hang.
> *[INFINITE LOOP] CapacityScheduler#allocateContainersOnMultiNodes -> 
> CapacityScheduler#allocateFromReservedContainer -> Re-reserve the container 
> on node*
> {code}
> 2020-05-21 12:13:33,242 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler:
>  Trying to fulfill reservation for application application_1590046667304_0005 
> on node: tajmera-fullnodes-3.tajmera-fullnodes.root.hwx.site:8041
> 2020-05-21 12:13:33,242 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue: 
> assignContainers: partition= #applications=1
> 2020-05-21 12:13:33,242 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp:
>  Reserved container=container_e09_1590046667304_0005_01_01, on node=host: 
> tajmera-fullnodes-3.tajmera-fullnodes.root.hwx.site:8041 #containers=8 
> available= used= with 
> resource=
> 2020-05-21 12:13:33,243 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler:
>  Allocation proposal accepted
> {code}
> CapacityScheduler#allocateOrReserveNewContainers won't be 

[jira] [Updated] (YARN-10293) Reserved Containers not allocated from available space of other nodes in CandidateNodeSet in MultiNodePlacement (YARN-10259)

2020-06-08 Thread Prabhu Joseph (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prabhu Joseph updated YARN-10293:
-
Attachment: YARN-10293-005.patch

> Reserved Containers not allocated from available space of other nodes in 
> CandidateNodeSet in MultiNodePlacement (YARN-10259)
> 
>
> Key: YARN-10293
> URL: https://issues.apache.org/jira/browse/YARN-10293
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 3.3.0
>Reporter: Prabhu Joseph
>Assignee: Prabhu Joseph
>Priority: Major
> Attachments: YARN-10293-001.patch, YARN-10293-002.patch, 
> YARN-10293-003-WIP.patch, YARN-10293-004.patch, YARN-10293-005.patch
>
>
> Reserved Containers not allocated from available space of other nodes in 
> CandidateNodeSet in MultiNodePlacement. YARN-10259 has fixed two issues 
> related to it 
> https://issues.apache.org/jira/browse/YARN-10259?focusedCommentId=17105987=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17105987
> Have found one more bug in the CapacityScheduler.java code which causes the 
> same issue with slight difference in the repro.
> *Repro:*
> *Nodes :   Available : Used*
> Node1 -  8GB, 8vcores -  8GB. 8cores
> Node2 -  8GB, 8vcores - 8GB. 8cores
> Node3 -  8GB, 8vcores - 8GB. 8cores
> Queues -> A and B both 50% capacity, 100% max capacity
> MultiNode enabled + Preemption enabled
> 1. JobA submitted to A queue and which used full cluster 24GB and 24 vcores
> 2. JobB Submitted to B queue with AM size of 1GB
> {code}
> 2020-05-21 12:12:27,313 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=systest  
> IP=172.27.160.139   OPERATION=Submit Application Request
> TARGET=ClientRMService  RESULT=SUCCESS  APPID=application_1590046667304_0005  
>   CALLERCONTEXT=CLI   QUEUENAME=dummy
> {code}
> 3. Preemption happens and used capacity is lesser than 1.0f
> {code}
> 2020-05-21 12:12:48,222 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptMetrics:
>  Non-AM container preempted, current 
> appAttemptId=appattempt_1590046667304_0004_01, 
> containerId=container_e09_1590046667304_0004_01_24, 
> resource=
> {code}
> 4. JobB gets a Reserved Container as part of 
> CapacityScheduler#allocateOrReserveNewContainer
> {code}
> 2020-05-21 12:12:48,226 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: 
> container_e09_1590046667304_0005_01_01 Container Transitioned from NEW to 
> RESERVED
> 2020-05-21 12:12:48,226 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp:
>  Reserved container=container_e09_1590046667304_0005_01_01, on node=host: 
> tajmera-fullnodes-3.tajmera-fullnodes.root.hwx.site:8041 #containers=8 
> available= used= with 
> resource=
> {code}
> *Why RegularContainerAllocator reserved the container when the used capacity 
> is <= 1.0f ?*
> {code}
> The reason is even though the container is preempted - nodemanager has to 
> stop the container and heartbeat and update the available and unallocated 
> resources to ResourceManager.
> {code}
> 5. Now, no new allocation happens and reserved container stays at reserved.
> After reservation the used capacity becomes 1.0f, below will be in a loop and 
> no new allocate or reserve happens. The reserved container cannot be 
> allocated as reserved node does not have space. node2 has space for 1GB, 
> 1vcore but CapacityScheduler#allocateOrReserveNewContainers not getting 
> called causing the Hang.
> *[INFINITE LOOP] CapacityScheduler#allocateContainersOnMultiNodes -> 
> CapacityScheduler#allocateFromReservedContainer -> Re-reserve the container 
> on node*
> {code}
> 2020-05-21 12:13:33,242 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler:
>  Trying to fulfill reservation for application application_1590046667304_0005 
> on node: tajmera-fullnodes-3.tajmera-fullnodes.root.hwx.site:8041
> 2020-05-21 12:13:33,242 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue: 
> assignContainers: partition= #applications=1
> 2020-05-21 12:13:33,242 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp:
>  Reserved container=container_e09_1590046667304_0005_01_01, on node=host: 
> tajmera-fullnodes-3.tajmera-fullnodes.root.hwx.site:8041 #containers=8 
> available= used= with 
> resource=
> 2020-05-21 12:13:33,243 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler:
>  Allocation proposal accepted
> {code}
> CapacityScheduler#allocateOrReserveNewContainers won't be called as below 
> check in allocateContainersOnMultiNodes 

[jira] [Updated] (YARN-10293) Reserved Containers not allocated from available space of other nodes in CandidateNodeSet in MultiNodePlacement (YARN-10259)

2020-06-05 Thread Prabhu Joseph (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prabhu Joseph updated YARN-10293:
-
Attachment: YARN-10293-004.patch

> Reserved Containers not allocated from available space of other nodes in 
> CandidateNodeSet in MultiNodePlacement (YARN-10259)
> 
>
> Key: YARN-10293
> URL: https://issues.apache.org/jira/browse/YARN-10293
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 3.3.0
>Reporter: Prabhu Joseph
>Assignee: Prabhu Joseph
>Priority: Major
> Attachments: YARN-10293-001.patch, YARN-10293-002.patch, 
> YARN-10293-003-WIP.patch, YARN-10293-004.patch
>
>
> Reserved Containers not allocated from available space of other nodes in 
> CandidateNodeSet in MultiNodePlacement. YARN-10259 has fixed two issues 
> related to it 
> https://issues.apache.org/jira/browse/YARN-10259?focusedCommentId=17105987=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17105987
> Have found one more bug in the CapacityScheduler.java code which causes the 
> same issue with slight difference in the repro.
> *Repro:*
> *Nodes :   Available : Used*
> Node1 -  8GB, 8vcores -  8GB. 8cores
> Node2 -  8GB, 8vcores - 8GB. 8cores
> Node3 -  8GB, 8vcores - 8GB. 8cores
> Queues -> A and B both 50% capacity, 100% max capacity
> MultiNode enabled + Preemption enabled
> 1. JobA submitted to A queue and which used full cluster 24GB and 24 vcores
> 2. JobB Submitted to B queue with AM size of 1GB
> {code}
> 2020-05-21 12:12:27,313 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=systest  
> IP=172.27.160.139   OPERATION=Submit Application Request
> TARGET=ClientRMService  RESULT=SUCCESS  APPID=application_1590046667304_0005  
>   CALLERCONTEXT=CLI   QUEUENAME=dummy
> {code}
> 3. Preemption happens and used capacity is lesser than 1.0f
> {code}
> 2020-05-21 12:12:48,222 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptMetrics:
>  Non-AM container preempted, current 
> appAttemptId=appattempt_1590046667304_0004_01, 
> containerId=container_e09_1590046667304_0004_01_24, 
> resource=
> {code}
> 4. JobB gets a Reserved Container as part of 
> CapacityScheduler#allocateOrReserveNewContainer
> {code}
> 2020-05-21 12:12:48,226 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: 
> container_e09_1590046667304_0005_01_01 Container Transitioned from NEW to 
> RESERVED
> 2020-05-21 12:12:48,226 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp:
>  Reserved container=container_e09_1590046667304_0005_01_01, on node=host: 
> tajmera-fullnodes-3.tajmera-fullnodes.root.hwx.site:8041 #containers=8 
> available= used= with 
> resource=
> {code}
> *Why RegularContainerAllocator reserved the container when the used capacity 
> is <= 1.0f ?*
> {code}
> The reason is even though the container is preempted - nodemanager has to 
> stop the container and heartbeat and update the available and unallocated 
> resources to ResourceManager.
> {code}
> 5. Now, no new allocation happens and reserved container stays at reserved.
> After reservation the used capacity becomes 1.0f, below will be in a loop and 
> no new allocate or reserve happens. The reserved container cannot be 
> allocated as reserved node does not have space. node2 has space for 1GB, 
> 1vcore but CapacityScheduler#allocateOrReserveNewContainers not getting 
> called causing the Hang.
> *[INFINITE LOOP] CapacityScheduler#allocateContainersOnMultiNodes -> 
> CapacityScheduler#allocateFromReservedContainer -> Re-reserve the container 
> on node*
> {code}
> 2020-05-21 12:13:33,242 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler:
>  Trying to fulfill reservation for application application_1590046667304_0005 
> on node: tajmera-fullnodes-3.tajmera-fullnodes.root.hwx.site:8041
> 2020-05-21 12:13:33,242 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue: 
> assignContainers: partition= #applications=1
> 2020-05-21 12:13:33,242 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp:
>  Reserved container=container_e09_1590046667304_0005_01_01, on node=host: 
> tajmera-fullnodes-3.tajmera-fullnodes.root.hwx.site:8041 #containers=8 
> available= used= with 
> resource=
> 2020-05-21 12:13:33,243 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler:
>  Allocation proposal accepted
> {code}
> CapacityScheduler#allocateOrReserveNewContainers won't be called as below 
> check in allocateContainersOnMultiNodes fails
> {code}
>  if 

[jira] [Updated] (YARN-10293) Reserved Containers not allocated from available space of other nodes in CandidateNodeSet in MultiNodePlacement (YARN-10259)

2020-06-02 Thread Prabhu Joseph (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prabhu Joseph updated YARN-10293:
-
Attachment: YARN-10293-003-WIP.patch

> Reserved Containers not allocated from available space of other nodes in 
> CandidateNodeSet in MultiNodePlacement (YARN-10259)
> 
>
> Key: YARN-10293
> URL: https://issues.apache.org/jira/browse/YARN-10293
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 3.3.0
>Reporter: Prabhu Joseph
>Assignee: Prabhu Joseph
>Priority: Major
> Attachments: YARN-10293-001.patch, YARN-10293-002.patch, 
> YARN-10293-003-WIP.patch
>
>
> Reserved Containers not allocated from available space of other nodes in 
> CandidateNodeSet in MultiNodePlacement. YARN-10259 has fixed two issues 
> related to it 
> https://issues.apache.org/jira/browse/YARN-10259?focusedCommentId=17105987=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17105987
> Have found one more bug in the CapacityScheduler.java code which causes the 
> same issue with slight difference in the repro.
> *Repro:*
> *Nodes :   Available : Used*
> Node1 -  8GB, 8vcores -  8GB. 8cores
> Node2 -  8GB, 8vcores - 8GB. 8cores
> Node3 -  8GB, 8vcores - 8GB. 8cores
> Queues -> A and B both 50% capacity, 100% max capacity
> MultiNode enabled + Preemption enabled
> 1. JobA submitted to A queue and which used full cluster 24GB and 24 vcores
> 2. JobB Submitted to B queue with AM size of 1GB
> {code}
> 2020-05-21 12:12:27,313 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=systest  
> IP=172.27.160.139   OPERATION=Submit Application Request
> TARGET=ClientRMService  RESULT=SUCCESS  APPID=application_1590046667304_0005  
>   CALLERCONTEXT=CLI   QUEUENAME=dummy
> {code}
> 3. Preemption happens and used capacity is lesser than 1.0f
> {code}
> 2020-05-21 12:12:48,222 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptMetrics:
>  Non-AM container preempted, current 
> appAttemptId=appattempt_1590046667304_0004_01, 
> containerId=container_e09_1590046667304_0004_01_24, 
> resource=
> {code}
> 4. JobB gets a Reserved Container as part of 
> CapacityScheduler#allocateOrReserveNewContainer
> {code}
> 2020-05-21 12:12:48,226 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: 
> container_e09_1590046667304_0005_01_01 Container Transitioned from NEW to 
> RESERVED
> 2020-05-21 12:12:48,226 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp:
>  Reserved container=container_e09_1590046667304_0005_01_01, on node=host: 
> tajmera-fullnodes-3.tajmera-fullnodes.root.hwx.site:8041 #containers=8 
> available= used= with 
> resource=
> {code}
> *Why RegularContainerAllocator reserved the container when the used capacity 
> is <= 1.0f ?*
> {code}
> The reason is even though the container is preempted - nodemanager has to 
> stop the container and heartbeat and update the available and unallocated 
> resources to ResourceManager.
> {code}
> 5. Now, no new allocation happens and reserved container stays at reserved.
> After reservation the used capacity becomes 1.0f, below will be in a loop and 
> no new allocate or reserve happens. The reserved container cannot be 
> allocated as reserved node does not have space. node2 has space for 1GB, 
> 1vcore but CapacityScheduler#allocateOrReserveNewContainers not getting 
> called causing the Hang.
> *[INFINITE LOOP] CapacityScheduler#allocateContainersOnMultiNodes -> 
> CapacityScheduler#allocateFromReservedContainer -> Re-reserve the container 
> on node*
> {code}
> 2020-05-21 12:13:33,242 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler:
>  Trying to fulfill reservation for application application_1590046667304_0005 
> on node: tajmera-fullnodes-3.tajmera-fullnodes.root.hwx.site:8041
> 2020-05-21 12:13:33,242 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue: 
> assignContainers: partition= #applications=1
> 2020-05-21 12:13:33,242 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp:
>  Reserved container=container_e09_1590046667304_0005_01_01, on node=host: 
> tajmera-fullnodes-3.tajmera-fullnodes.root.hwx.site:8041 #containers=8 
> available= used= with 
> resource=
> 2020-05-21 12:13:33,243 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler:
>  Allocation proposal accepted
> {code}
> CapacityScheduler#allocateOrReserveNewContainers won't be called as below 
> check in allocateContainersOnMultiNodes fails
> {code}
>  if 

[jira] [Updated] (YARN-10293) Reserved Containers not allocated from available space of other nodes in CandidateNodeSet in MultiNodePlacement (YARN-10259)

2020-05-29 Thread Prabhu Joseph (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prabhu Joseph updated YARN-10293:
-
Attachment: YARN-10293-002.patch

> Reserved Containers not allocated from available space of other nodes in 
> CandidateNodeSet in MultiNodePlacement (YARN-10259)
> 
>
> Key: YARN-10293
> URL: https://issues.apache.org/jira/browse/YARN-10293
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 3.3.0
>Reporter: Prabhu Joseph
>Assignee: Prabhu Joseph
>Priority: Major
> Attachments: YARN-10293-001.patch, YARN-10293-002.patch
>
>
> Reserved Containers not allocated from available space of other nodes in 
> CandidateNodeSet in MultiNodePlacement. YARN-10259 has fixed two issues 
> related to it 
> https://issues.apache.org/jira/browse/YARN-10259?focusedCommentId=17105987=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17105987
> Have found one more bug in the CapacityScheduler.java code which causes the 
> same issue with slight difference in the repro.
> *Repro:*
> *Nodes :   Available : Used*
> Node1 -  8GB, 8vcores -  8GB. 8cores
> Node2 -  8GB, 8vcores - 8GB. 8cores
> Node3 -  8GB, 8vcores - 8GB. 8cores
> Queues -> A and B both 50% capacity, 100% max capacity
> MultiNode enabled + Preemption enabled
> 1. JobA submitted to A queue and which used full cluster 24GB and 24 vcores
> 2. JobB Submitted to B queue with AM size of 1GB
> {code}
> 2020-05-21 12:12:27,313 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=systest  
> IP=172.27.160.139   OPERATION=Submit Application Request
> TARGET=ClientRMService  RESULT=SUCCESS  APPID=application_1590046667304_0005  
>   CALLERCONTEXT=CLI   QUEUENAME=dummy
> {code}
> 3. Preemption happens and used capacity is lesser than 1.0f
> {code}
> 2020-05-21 12:12:48,222 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptMetrics:
>  Non-AM container preempted, current 
> appAttemptId=appattempt_1590046667304_0004_01, 
> containerId=container_e09_1590046667304_0004_01_24, 
> resource=
> {code}
> 4. JobB gets a Reserved Container as part of 
> CapacityScheduler#allocateOrReserveNewContainer
> {code}
> 2020-05-21 12:12:48,226 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: 
> container_e09_1590046667304_0005_01_01 Container Transitioned from NEW to 
> RESERVED
> 2020-05-21 12:12:48,226 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp:
>  Reserved container=container_e09_1590046667304_0005_01_01, on node=host: 
> tajmera-fullnodes-3.tajmera-fullnodes.root.hwx.site:8041 #containers=8 
> available= used= with 
> resource=
> {code}
> *Why RegularContainerAllocator reserved the container when the used capacity 
> is <= 1.0f ?*
> {code}
> The reason is even though the container is preempted - nodemanager has to 
> stop the container and heartbeat and update the available and unallocated 
> resources to ResourceManager.
> {code}
> 5. Now, no new allocation happens and reserved container stays at reserved.
> After reservation the used capacity becomes 1.0f, below will be in a loop and 
> no new allocate or reserve happens. The reserved container cannot be 
> allocated as reserved node does not have space. node2 has space for 1GB, 
> 1vcore but CapacityScheduler#allocateOrReserveNewContainers not getting 
> called causing the Hang.
> *[INFINITE LOOP] CapacityScheduler#allocateContainersOnMultiNodes -> 
> CapacityScheduler#allocateFromReservedContainer -> Re-reserve the container 
> on node*
> {code}
> 2020-05-21 12:13:33,242 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler:
>  Trying to fulfill reservation for application application_1590046667304_0005 
> on node: tajmera-fullnodes-3.tajmera-fullnodes.root.hwx.site:8041
> 2020-05-21 12:13:33,242 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue: 
> assignContainers: partition= #applications=1
> 2020-05-21 12:13:33,242 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp:
>  Reserved container=container_e09_1590046667304_0005_01_01, on node=host: 
> tajmera-fullnodes-3.tajmera-fullnodes.root.hwx.site:8041 #containers=8 
> available= used= with 
> resource=
> 2020-05-21 12:13:33,243 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler:
>  Allocation proposal accepted
> {code}
> CapacityScheduler#allocateOrReserveNewContainers won't be called as below 
> check in allocateContainersOnMultiNodes fails
> {code}
>  if (getRootQueue().getQueueCapacities().getUsedCapacity(

[jira] [Updated] (YARN-10293) Reserved Containers not allocated from available space of other nodes in CandidateNodeSet in MultiNodePlacement (YARN-10259)

2020-05-28 Thread Prabhu Joseph (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prabhu Joseph updated YARN-10293:
-
Description: 
Reserved Containers not allocated from available space of other nodes in 
CandidateNodeSet in MultiNodePlacement. YARN-10259 has fixed two issues related 
to it 
https://issues.apache.org/jira/browse/YARN-10259?focusedCommentId=17105987=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17105987

Have found one more bug in the CapacityScheduler.java code which causes the 
same issue with slight difference in the repro.

*Repro:*

*Nodes :   Available : Used*
Node1 -  8GB, 8vcores -  8GB. 8cores
Node2 -  8GB, 8vcores - 8GB. 8cores
Node3 -  8GB, 8vcores - 8GB. 8cores

Queues -> A and B both 50% capacity, 100% max capacity

MultiNode enabled + Preemption enabled

1. JobA submitted to A queue and which used full cluster 24GB and 24 vcores

2. JobB Submitted to B queue with AM size of 1GB

{code}
2020-05-21 12:12:27,313 INFO 
org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=systest  
IP=172.27.160.139   OPERATION=Submit Application Request
TARGET=ClientRMService  RESULT=SUCCESS  APPID=application_1590046667304_0005
CALLERCONTEXT=CLI   QUEUENAME=dummy
{code}

3. Preemption happens and used capacity is lesser than 1.0f

{code}
2020-05-21 12:12:48,222 INFO 
org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptMetrics:
 Non-AM container preempted, current 
appAttemptId=appattempt_1590046667304_0004_01, 
containerId=container_e09_1590046667304_0004_01_24, resource=
{code}

4. JobB gets a Reserved Container as part of 
CapacityScheduler#allocateOrReserveNewContainer

{code}
2020-05-21 12:12:48,226 INFO 
org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: 
container_e09_1590046667304_0005_01_01 Container Transitioned from NEW to 
RESERVED
2020-05-21 12:12:48,226 INFO 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp:
 Reserved container=container_e09_1590046667304_0005_01_01, on node=host: 
tajmera-fullnodes-3.tajmera-fullnodes.root.hwx.site:8041 #containers=8 
available= used= with 
resource=
{code}

*Why RegularContainerAllocator reserved the container when the used capacity is 
<= 1.0f ?*

{code}
The reason is even though the container is preempted - nodemanager has to stop 
the container and heartbeat and update the available and unallocated resources 
to ResourceManager.
{code}

5. Now, no new allocation happens and reserved container stays at reserved.

After reservation the used capacity becomes 1.0f, below will be in a loop and 
no new allocate or reserve happens. The reserved container cannot be allocated 
as reserved node does not have space. node2 has space for 1GB, 1vcore but 
CapacityScheduler#allocateOrReserveNewContainers not getting called causing the 
Hang.


*[INFINITE LOOP] CapacityScheduler#allocateContainersOnMultiNodes -> 
CapacityScheduler#allocateFromReservedContainer -> Re-reserve the container on 
node*

{code}
2020-05-21 12:13:33,242 INFO 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler:
 Trying to fulfill reservation for application application_1590046667304_0005 
on node: tajmera-fullnodes-3.tajmera-fullnodes.root.hwx.site:8041
2020-05-21 12:13:33,242 INFO 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue: 
assignContainers: partition= #applications=1
2020-05-21 12:13:33,242 INFO 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp:
 Reserved container=container_e09_1590046667304_0005_01_01, on node=host: 
tajmera-fullnodes-3.tajmera-fullnodes.root.hwx.site:8041 #containers=8 
available= used= with 
resource=
2020-05-21 12:13:33,243 INFO 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler:
 Allocation proposal accepted
{code}

CapacityScheduler#allocateOrReserveNewContainers won't be called as below check 
in allocateContainersOnMultiNodes fails

{code}
 if (getRootQueue().getQueueCapacities().getUsedCapacity(
candidates.getPartition()) >= 1.0f
&& preemptionManager.getKillableResource(
{code}




  was:
Reserved Containers not allocated from available space of other nodes in 
CandidateNodeSet in MultiNodePlacement. YARN-10259 has fixed two issues related 
to it 
https://issues.apache.org/jira/browse/YARN-10259?focusedCommentId=17105987=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17105987

Have found one more bug in the CapacityScheduler.java code which causes the 
same issue with slight difference in the repro.

*Repro:*

*Nodes :   Available : Used*
Node1 -  8GB, 8vcores -  8GB. 8cores
Node2 -  8GB, 8vcores - 8GB. 8cores
Node3 -  8GB, 8vcores - 8GB. 8cores

Queues -> A and B both 50% capacity, 100% max capacity

MultiNode enabled + 

[jira] [Updated] (YARN-10293) Reserved Containers not allocated from available space of other nodes in CandidateNodeSet in MultiNodePlacement (YARN-10259)

2020-05-28 Thread Prabhu Joseph (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prabhu Joseph updated YARN-10293:
-
Attachment: YARN-10293-001.patch

> Reserved Containers not allocated from available space of other nodes in 
> CandidateNodeSet in MultiNodePlacement (YARN-10259)
> 
>
> Key: YARN-10293
> URL: https://issues.apache.org/jira/browse/YARN-10293
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 3.3.0
>Reporter: Prabhu Joseph
>Assignee: Prabhu Joseph
>Priority: Major
> Attachments: YARN-10293-001.patch
>
>
> Reserved Containers not allocated from available space of other nodes in 
> CandidateNodeSet in MultiNodePlacement. YARN-10259 has fixed two issues 
> related to it 
> https://issues.apache.org/jira/browse/YARN-10259?focusedCommentId=17105987=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17105987
> Have found one more bug in the CapacityScheduler.java code which causes the 
> same issue with slight difference in the repro.
> *Repro:*
> *Nodes :   Available : Used*
> Node1 -  8GB, 8vcores -  8GB. 8cores
> Node2 -  8GB, 8vcores - 8GB. 8cores
> Node3 -  8GB, 8vcores - 8GB. 8cores
> Queues -> A and B both 50% capacity, 100% max capacity
> MultiNode enabled + Preemption enabled
> 1. JobA submitted to A queue and which used full cluster 24GB and 24 vcores
> 2. JobB Submitted to B queue with AM size of 1GB
> {code}
> 2020-05-21 12:12:27,313 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=systest  
> IP=172.27.160.139   OPERATION=Submit Application Request
> TARGET=ClientRMService  RESULT=SUCCESS  APPID=application_1590046667304_0005  
>   CALLERCONTEXT=CLI   QUEUENAME=dummy
> {code}
> 3. Preemption happens and used capacity is lesser than 1.0f
> {code}
> 2020-05-21 12:12:48,222 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptMetrics:
>  Non-AM container preempted, current 
> appAttemptId=appattempt_1590046667304_0004_01, 
> containerId=container_e09_1590046667304_0004_01_24, 
> resource=
> {code}
> 4. JobB gets a Reserved Container as part of 
> CapacityScheduler#allocateOrReserveNewContainer
> {code}
> 2020-05-21 12:12:48,226 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: 
> container_e09_1590046667304_0005_01_01 Container Transitioned from NEW to 
> RESERVED
> 2020-05-21 12:12:48,226 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp:
>  Reserved container=container_e09_1590046667304_0005_01_01, on node=host: 
> tajmera-fullnodes-3.tajmera-fullnodes.root.hwx.site:8041 #containers=8 
> available= used= with 
> resource=
> {code}
> *Why RegularContainerAllocator reserved the container when the used capacity 
> is <= 1.0f ?*
> {code}
> The reason is even though the container is preempted - nodemanager has to 
> stop the container and heartbeat and update the available and unallocated 
> resources to ResourceManager.
> {code}
> 5. Now, no new allocation happens and reserved container stays at reserved.
> After reservation the used capacity becomes 1.0f, below will be in a loop and 
> no new allocate or reserve happens. The reserved container cannot be 
> allocated as reserved node does not have space. node2 has space for 1GB, 
> 1vcore but CapacityScheduler#allocateOrReserveNewContainers not getting 
> called causing the Hang.
> *[INFINITE LOOP] CapacityScheduler#allocateContainersOnMultiNodes -> 
> CapacityScheduler#allocateContainersOnMultiNodes#allocateFromReservedContainer
>  -> Node3 has reserved container*
> {code}
> 2020-05-21 12:13:33,242 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler:
>  Trying to fulfill reservation for application application_1590046667304_0005 
> on node: tajmera-fullnodes-3.tajmera-fullnodes.root.hwx.site:8041
> 2020-05-21 12:13:33,242 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue: 
> assignContainers: partition= #applications=1
> 2020-05-21 12:13:33,242 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp:
>  Reserved container=container_e09_1590046667304_0005_01_01, on node=host: 
> tajmera-fullnodes-3.tajmera-fullnodes.root.hwx.site:8041 #containers=8 
> available= used= with 
> resource=
> 2020-05-21 12:13:33,243 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler:
>  Allocation proposal accepted
> {code}
> CapacityScheduler#allocateOrReserveNewContainers won't be called as below 
> check in allocateContainersOnMultiNodes fails
> {code}
>  if