[
https://issues.apache.org/jira/browse/TEZ-3935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16474437#comment-16474437
]
Jason Lowe commented on TEZ-3935:
---------------------------------
Attaching a patch that, by default, releases new containers that are not
assigned. A user can set tez.am.container.reuse.new-containers.enabled=true
to restore the old behavior if their particular job benefits from holding onto
unassigned new containers and the impact on cluster utilization is not a
concern.
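
As a rough illustration of the opt-in described above, the sketch below reads
the property with a default of false. It uses plain java.util.Properties as a
stand-in for the real Tez configuration object, and the class and method names
are hypothetical, not taken from the patch.

```java
import java.util.Properties;

public class NewContainerReuseFlag {
    // Property name as given in the comment above.
    static final String KEY = "tez.am.container.reuse.new-containers.enabled";

    // Hypothetical helper: returns false (release unassigned new containers)
    // unless the property is explicitly set to true.
    static boolean holdNewContainers(Properties conf) {
        return Boolean.parseBoolean(conf.getProperty(KEY, "false"));
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        System.out.println("default: " + holdNewContainers(conf));

        // Opt back in to the old behavior of holding unassigned new containers.
        conf.setProperty(KEY, "true");
        System.out.println("after opt-in: " + holdNewContainers(conf));
    }
}
```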
> DAG aware scheduler should release unassigned new containers rather than hold
> them
> ----------------------------------------------------------------------------------
>
> Key: TEZ-3935
> URL: https://issues.apache.org/jira/browse/TEZ-3935
> Project: Apache Tez
> Issue Type: Bug
> Reporter: Jason Lowe
> Assignee: Jason Lowe
> Priority: Major
> Attachments: TEZ-3935.001.patch
>
>
> I saw a case for a very large job with many containers where the DAG aware
> scheduler was getting behind on assigning containers. Newly assigned
> containers were not finding any matching request, so they were queued for
> reuse processing. However, it took so long to get through all of the task and
> container events that the container allocations expired before the containers
> were finally assigned and a launch was attempted.
> Newly assigned containers are assigned to their matching requests, even if
> that violates the DAG priorities, so it should be safe to simply release
> these if no tasks could be found to use them. The matching request has
> either been removed or already satisfied with a reused container. Besides,
> if we can't find any tasks to take the newly assigned container then it is
> very likely we have plenty of reusable containers already, and keeping more
> containers just makes the job a resource hog on the cluster.
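
The release-or-hold decision described in the issue can be sketched roughly as
follows. This is a toy model under stated assumptions, not the scheduler's
actual code: every class, field, and method name here is hypothetical, and
containers and tasks are reduced to plain strings.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

public class ReleaseUnassignedSketch {
    // Hypothetical stand-ins for scheduler state.
    final Deque<String> pendingTasks = new ArrayDeque<>();
    final List<String> assigned = new ArrayList<>();
    final List<String> heldForReuse = new ArrayList<>();
    final List<String> released = new ArrayList<>();

    // Mirrors the opt-in flag: true restores the old "hold" behavior.
    final boolean holdNewContainers;

    ReleaseUnassignedSketch(boolean holdNewContainers) {
        this.holdNewContainers = holdNewContainers;
    }

    // Called when a newly allocated container arrives.
    void onNewContainer(String containerId) {
        String task = pendingTasks.poll();
        if (task != null) {
            // A task can use the container, so assign it right away.
            assigned.add(containerId + "->" + task);
        } else if (holdNewContainers) {
            // Old behavior: queue the container for reuse processing.
            heldForReuse.add(containerId);
        } else {
            // New default: give the container back to the cluster.
            released.add(containerId);
        }
    }

    public static void main(String[] args) {
        ReleaseUnassignedSketch s = new ReleaseUnassignedSketch(false);
        s.pendingTasks.add("task_1");
        s.onNewContainer("container_1"); // matched to task_1
        s.onNewContainer("container_2"); // no task left, so released
        System.out.println("assigned=" + s.assigned + " released=" + s.released);
    }
}
```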