[ https://issues.apache.org/jira/browse/TEZ-3491?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15608578#comment-15608578 ]
Jason Lowe commented on TEZ-3491:
---------------------------------
The test failure appears to be unrelated. Looks like a known issue being
tracked by TEZ-3097, and the test passes for me locally with the patch applied.
> Tez job can hang due to container priority inversion
> ----------------------------------------------------
>
> Key: TEZ-3491
> URL: https://issues.apache.org/jira/browse/TEZ-3491
> Project: Apache Tez
> Issue Type: Bug
> Affects Versions: 0.7.1
> Reporter: Jason Lowe
> Assignee: Jason Lowe
> Priority: Critical
> Attachments: TEZ-3491.001.patch
>
>
> If the Tez AM receives containers at a lower priority than the highest
> priority task being requested, then it fails to assign the container to any
> task. In addition, if the container is new, then it refuses to release it if
> there are any pending tasks. If it takes too long for the higher priority
> requests to be fulfilled (e.g., the lower priority containers are filling the
> queue), then eventually YARN will expire the unused lower priority containers
> since they were never launched. The Tez AM then never re-requests these
> lower priority containers, and the job hangs because the AM is waiting for
> containers from the RM that the RM already sent and expired.
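
For illustration only, the sequence described above can be modelled with a
toy bookkeeping class. This is a rough sketch, not Tez's scheduler code: the
names ToyScheduler, requestTask, onContainerAllocated, onContainerExpired,
isHung and simulate are all hypothetical, and the "re-request on expiry"
switch is just one assumed way to show the difference. It is not a description
of the attached patch. The only outside fact relied on is that in YARN a
smaller priority value means a higher priority.

```java
import java.util.HashMap;
import java.util.Map;

public class PriorityInversionSketch {

    /** Toy model of the AM's bookkeeping (hypothetical, not Tez's actual scheduler). */
    static final class ToyScheduler {
        // priority value -> number of tasks still waiting for a container
        final Map<Integer, Integer> pendingTasks = new HashMap<>();
        // priority value -> number of requests the RM has not yet answered
        final Map<Integer, Integer> outstandingRequests = new HashMap<>();
        // priority value -> containers received but neither launched nor released
        final Map<Integer, Integer> heldContainers = new HashMap<>();

        void requestTask(int priority) {
            pendingTasks.merge(priority, 1, Integer::sum);
            outstandingRequests.merge(priority, 1, Integer::sum);
        }

        /** Smaller value means higher priority, as in YARN. */
        int highestPendingPriority() {
            return pendingTasks.entrySet().stream()
                    .filter(e -> e.getValue() > 0)
                    .mapToInt(e -> e.getKey())
                    .min()
                    .orElse(Integer.MAX_VALUE);
        }

        /** Returns true if the container was assigned to a task. */
        boolean onContainerAllocated(int priority) {
            // From the RM's point of view this request is now satisfied.
            outstandingRequests.merge(priority, -1, Integer::sum);
            if (priority > highestPendingPriority()) {
                // Lower priority than the top pending request: not assigned,
                // and not released either because tasks are still pending.
                heldContainers.merge(priority, 1, Integer::sum);
                return false;
            }
            pendingTasks.merge(priority, -1, Integer::sum);
            return true;
        }

        /** The RM expired a container that was never launched. */
        void onContainerExpired(int priority, boolean reRequest) {
            heldContainers.merge(priority, -1, Integer::sum);
            if (reRequest) {
                outstandingRequests.merge(priority, 1, Integer::sum);
            }
        }

        /** True if some pending task has no request or held container backing it. */
        boolean isHung() {
            for (Map.Entry<Integer, Integer> e : pendingTasks.entrySet()) {
                int p = e.getKey();
                int covered = outstandingRequests.getOrDefault(p, 0)
                        + heldContainers.getOrDefault(p, 0);
                if (e.getValue() > covered) {
                    return true;
                }
            }
            return false;
        }
    }

    static boolean simulate(boolean reRequestOnExpiry) {
        ToyScheduler s = new ToyScheduler();
        s.requestTask(1);           // high priority task
        s.requestTask(5);           // low priority task
        s.onContainerAllocated(5);  // RM answers the low priority request first
        s.onContainerExpired(5, reRequestOnExpiry);
        return s.isHung();
    }

    public static void main(String[] args) {
        System.out.println("hung without re-request: " + simulate(false));
        System.out.println("hung with re-request: " + simulate(true));
    }
}
```

In this model the low priority task is left with no outstanding request and
no held container once the expiry is ignored, which is the "waiting for
containers the RM already sent and expired" state from the description.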
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)