[
https://issues.apache.org/jira/browse/TEZ-3491?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15605652#comment-15605652
]
Jason Lowe commented on TEZ-3491:
---------------------------------
When the containers expire, the Tez AM emits logs like this:
{noformat}
2016-10-23 01:36:49,207 [INFO] [AMRM Callback Handler Thread] |rm.YarnTaskSchedulerService|: Ignoring unknown container: container_e08_1475789370361_492567_01_000166
2016-10-23 01:36:49,292 [INFO] [DelayedContainerManager] |rm.YarnTaskSchedulerService|: Skipping delayed container as container is no longer running, containerId=container_e08_1475789370361_492567_01_000166
{noformat}
I can see a couple of approaches to fix this:
1) Release the lower-priority container so that enough space is freed to
allocate the high-priority containers needed to satisfy the top-priority
requests. A released container needs to be re-requested if there are still
pending requests at that container's priority.
2) Allow the lower-priority container to be used by a lower-priority task. We
risk a similar priority inversion problem here if the lower-priority task ends
up waiting for a higher-priority task to complete, and the higher-priority task
cannot run until the lower-priority task frees its resources (e.g., a reducer
waiting for an upstream task while the queue is full). However, the existing
preemption logic should cover this scenario, since it can already happen anyway
(via fetch-failed task re-runs).
I'm slightly leaning towards option 2), since there are many cases where the
lower-priority task can complete on its own (i.e., it has no dependencies on
the pending higher-priority tasks), and we already have an allocation in hand
to start working on that task.
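
To make option 2) concrete, here is a minimal sketch of the matching rule. It is
not Tez code: the class PriorityMatchSketch and its methods are hypothetical,
tasks are plain strings, and it assumes that a smaller number means a higher
priority.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Optional;
import java.util.TreeMap;

/** Hypothetical sketch, not Tez code: match a container to a task at its own priority. */
class PriorityMatchSketch {
    // Pending task names keyed by priority.
    // Assumption: a smaller number means a higher priority.
    private final TreeMap<Integer, Deque<String>> pending = new TreeMap<>();

    /** Records a task that is waiting for a container at the given priority. */
    void addRequest(int priority, String task) {
        pending.computeIfAbsent(priority, p -> new ArrayDeque<>()).add(task);
    }

    /**
     * Option 2: hand the container to a pending task at the container's own
     * priority, even if higher-priority requests are still outstanding.
     * Returns empty when no task is pending at that priority.
     */
    Optional<String> assign(int containerPriority) {
        Deque<String> queue = pending.get(containerPriority);
        if (queue == null) {
            return Optional.empty();
        }
        String task = queue.poll();
        if (queue.isEmpty()) {
            pending.remove(containerPriority); // keep the map free of empty queues
        }
        return Optional.ofNullable(task);
    }

    /** True if any request is pending at a strictly higher priority. */
    boolean hasPendingAbove(int priority) {
        return !pending.headMap(priority).isEmpty();
    }
}
```

With requests pending at priorities 1 and 5, assign(5) returns the priority-5
task even though hasPendingAbove(5) is still true. Any deadlock that results is
left to the existing preemption logic, as described above.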
A related problem that should also be addressed is losing containers to
expiration. Currently, if any container allocation expires, the Tez AM drops it
without re-requesting it. The result is either reduced performance, if
container reuse lets the AM funnel the tasks through fewer containers, or an
outright hang, if it cannot reuse other containers.
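
A rough sketch of the re-request bookkeeping follows. It is not Tez code and
does not use the YARN client API: ExpiredContainerSketch, its callbacks, and
the reRequested list are hypothetical stand-ins for whatever the scheduler
would actually send to the RM.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch, not Tez code: re-request containers that expire before launch. */
class ExpiredContainerSketch {
    // Containers allocated by the RM but not yet launched: containerId -> priority.
    private final Map<String, Integer> heldUnlaunched = new HashMap<>();
    // Number of still-unassigned tasks per priority.
    private final Map<Integer, Integer> pendingTasks = new HashMap<>();
    // Priorities re-requested so far (stand-in for real requests to the RM).
    final List<Integer> reRequested = new ArrayList<>();

    void onAllocated(String containerId, int priority) {
        heldUnlaunched.put(containerId, priority);
    }

    void addPendingTask(int priority) {
        pendingTasks.merge(priority, 1, Integer::sum);
    }

    /** Called when the RM reports that an unlaunched container has expired. */
    void onExpired(String containerId) {
        Integer priority = heldUnlaunched.remove(containerId);
        if (priority == null) {
            return; // unknown container: nothing to re-request
        }
        // Only ask again if tasks are still waiting at that priority.
        if (pendingTasks.getOrDefault(priority, 0) > 0) {
            reRequested.add(priority);
        }
    }
}
```

This simplified version issues one re-request per expired container and does
not cap re-requests at the number of pending tasks; a real fix would need to
reconcile the two counts.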
> Tez job can hang due to container priority inversion
> ----------------------------------------------------
>
> Key: TEZ-3491
> URL: https://issues.apache.org/jira/browse/TEZ-3491
> Project: Apache Tez
> Issue Type: Bug
> Affects Versions: 0.7.1
> Reporter: Jason Lowe
> Priority: Critical
>
> If the Tez AM receives containers at a lower priority than the highest
> priority task being requested, then it fails to assign the container to any
> task. In addition, if the container is new, then it refuses to release it if
> there are any pending tasks. If it takes too long for the higher-priority
> requests to be fulfilled (e.g., the lower-priority containers are filling the
> queue), then eventually YARN will expire the unused lower-priority containers
> since they were never launched. The Tez AM then never re-requests these
> lower-priority containers, and the job hangs because the AM is waiting for
> containers from the RM that the RM already sent and expired.