[ 
https://issues.apache.org/jira/browse/TEZ-3491?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15608578#comment-15608578
 ] 

Jason Lowe commented on TEZ-3491:
---------------------------------

The test failure appears to be unrelated.  It looks like a known issue 
tracked by TEZ-3097, and the test passes for me locally with the patch applied.

> Tez job can hang due to container priority inversion
> ----------------------------------------------------
>
>                 Key: TEZ-3491
>                 URL: https://issues.apache.org/jira/browse/TEZ-3491
>             Project: Apache Tez
>          Issue Type: Bug
>    Affects Versions: 0.7.1
>            Reporter: Jason Lowe
>            Assignee: Jason Lowe
>            Priority: Critical
>         Attachments: TEZ-3491.001.patch
>
>
> If the Tez AM receives containers at a lower priority than the 
> highest-priority task being requested, then it fails to assign the container 
> to any task.  In addition, if the container is new, then it refuses to release 
> it while there are any pending tasks.  If it takes too long for the 
> higher-priority requests to be fulfilled (e.g., the lower-priority containers 
> are filling the queue), then eventually YARN will expire the unused 
> lower-priority containers since they were never launched.  The Tez AM then 
> never re-requests these lower-priority containers, and the job hangs because 
> the AM is waiting for containers from the RM that the RM already sent and 
> expired.
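
The sequence described above can be sketched as a small toy model. This is
only an illustration under stated assumptions, not Tez's scheduler code: the
function name, the dictionaries, and the `rerequest_on_expiry` flag are all
hypothetical, and the flag is not claimed to be what the attached patch does.
The model also assumes that a smaller number means a higher priority.

```python
def simulate(rerequest_on_expiry):
    """Toy model of the hang; returns the priorities whose tasks are stuck.

    Assumption for this sketch: a smaller number means a higher priority.
    """
    pending_tasks = {1: 1, 2: 1}  # tasks the AM still has to run, by priority
    outstanding = {1: 1, 2: 1}    # requests the RM has not yet satisfied
    held = []                     # new containers the AM holds without launching

    # The RM satisfies the priority-2 request first.
    outstanding[2] -= 1
    top = min(p for p, n in pending_tasks.items() if n > 0)
    if 2 > top:
        # Lower priority than the top pending request: the container is not
        # assigned, and because tasks are pending it is not released either.
        held.append(2)

    # The priority-1 request takes too long, so the unlaunched container
    # expires on the YARN side.
    held.remove(2)
    if rerequest_on_expiry:
        # Hypothetical remedy: ask the RM again for the expired container.
        outstanding[2] += 1

    # The priority-1 container finally arrives and its task runs.
    outstanding[1] -= 1
    pending_tasks[1] -= 1

    # A priority is stuck if it has pending tasks, no outstanding request,
    # and no held container that could still serve it.
    return [p for p, n in pending_tasks.items()
            if n > 0 and outstanding[p] == 0 and p not in held]
```

Calling `simulate(False)` leaves the priority-2 task with nothing left to wait
for, which corresponds to the hang in the description. Calling
`simulate(True)` shows that the task is no longer stuck once the expired
container is requested again.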



