Himanshu Mishra created TEZ-4580:
------------------------------------

             Summary: Slow preemption of new containers when re-use is enabled
                 Key: TEZ-4580
                 URL: https://issues.apache.org/jira/browse/TEZ-4580
             Project: Apache Tez
          Issue Type: Improvement
            Reporter: Himanshu Mishra
            Assignee: Himanshu Mishra

I observed intermittently high runtimes for a TPC-DS query when running with the YARN asynchronous scheduler ({{yarn.scheduler.capacity.schedule-asynchronously.enable=true}}). In these cases, preemption of lower-priority containers took a very long time. The Tez AM log showed the warning {{Expected delayed containers to be empty.}}, followed by {{Held container expected to be not null for a non-AM-released container}}, and after that only 1 container was being released, even when {{tez.am.preemption.percentage}} was high.

Further investigation led to the following conclusions:

1. The [warning log / assertion error|https://github.com/apache/tez/blob/master/tez-dag/src/main/java/org/apache/tez/dag/app/rm/YarnTaskSchedulerService.java#L1335] is raised because in [preemptIfNeeded()|https://github.com/apache/tez/blob/master/tez-dag/src/main/java/org/apache/tez/dag/app/rm/YarnTaskSchedulerService.java#L1314], when releasing new containers, the loop counter is decremented on each {{releaseUnassignedContainers}} call, so the loop runs only half the intended number of times. With a separate counter, the assertion passes because the method returns at the check {{if (numPendingRequestsToService < 1) {}}.

2. In [releaseContainer()|https://github.com/apache/tez/blob/master/tez-dag/src/main/java/org/apache/tez/dag/app/rm/YarnTaskSchedulerService.java#L1566], the container is removed only from the {{heldContainers}} map and not from the {{delayedContainers}} queue. As a result, the same container is picked for release in every iteration until the next cycle of {{DelayedContainerManager}} finds that the container is no longer in {{heldContainers}} and skips it [with the log|https://github.com/apache/tez/blob/master/tez-dag/src/main/java/org/apache/tez/dag/app/rm/YarnTaskSchedulerService.java#L2095] {{Skipping delayed container as container is no longer running, containerId=...}}
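
To illustrate the two points above, here is a minimal, self-contained Java sketch. It is not the Tez code: the class {{PreemptionSketch}}, its method names, and the string container IDs are hypothetical, and it assumes the loop bound itself is what gets decremented inside the loop and that each iteration picks the head of the delayed queue. Only the names {{numPendingRequestsToService}}, {{heldContainers}} and {{delayedContainers}} are borrowed from the description.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of the two behaviours described above; not Tez code.
final class PreemptionSketch {

    // Point 1 (assumed shape): the loop bound is decremented inside the body,
    // so the index and the bound meet in the middle.
    static int releaseWithSharedBound(int numPendingRequestsToService) {
        int released = 0;
        for (int i = 0; i < numPendingRequestsToService; ++i) {
            released++;                     // stands in for one release call
            --numPendingRequestsToService;  // bound shrinks while i grows
        }
        return released;
    }

    // Point 1, with a separate counter: the bound is captured once up front.
    static int releaseWithSeparateCounter(int numPendingRequestsToService) {
        int released = 0;
        int toRelease = numPendingRequestsToService; // fixed loop bound
        for (int i = 0; i < toRelease; ++i) {
            released++;
            --numPendingRequestsToService;  // bookkeeping no longer affects the loop
        }
        return released;
    }

    // Point 2: release removes the container from the held map, and from the
    // delayed queue only when removeFromDelayed is true.
    static int releaseFromDelayedQueue(int containers, int iterations,
                                       boolean removeFromDelayed) {
        Map<String, Boolean> heldContainers = new HashMap<>();
        Deque<String> delayedContainers = new ArrayDeque<>();
        for (int c = 0; c < containers; ++c) {
            String id = "container_" + c;   // hypothetical container ID
            heldContainers.put(id, Boolean.TRUE);
            delayedContainers.add(id);
        }
        int released = 0;
        for (int i = 0; i < iterations; ++i) {
            String candidate = delayedContainers.peek(); // same head if never removed
            if (candidate == null) {
                break;
            }
            if (heldContainers.remove(candidate) != null) {
                released++;                 // counts only containers actually released
            }
            if (removeFromDelayed) {
                delayedContainers.remove(candidate);
            }
        }
        return released;
    }
}
```

Under these assumptions, {{releaseWithSharedBound(10)}} returns 5 while {{releaseWithSeparateCounter(10)}} returns 10. {{releaseFromDelayedQueue(4, 4, false)}} returns 1 because the same head of the queue is picked every time, and it returns 4 once the container is also removed from the queue.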