[ https://issues.apache.org/jira/browse/TEZ-1269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14056701#comment-14056701 ]
Bikas Saha commented on TEZ-1269:
---------------------------------

In addition to removing the session check, we can do the following:

1) Add a notion of a minimum number of held containers that can be used to assure some capacity. This can be the minimum of a configured value and some percentage of the headroom for this job (as a proxy for queue capacity). Setting the config to 0 gives the current behavior.
2) Use a degrade factor so that containers are released at timeout + rand(0 to degrade-time). This prevents a cliff of containers all being released at once. Setting degrade-time to 0 gives the current behavior.

> TaskScheduler prematurely releases containers
> ---------------------------------------------
>
>                 Key: TEZ-1269
>                 URL: https://issues.apache.org/jira/browse/TEZ-1269
>             Project: Apache Tez
>          Issue Type: Bug
>            Reporter: Bikas Saha
>            Assignee: Bikas Saha
>
> The TaskScheduler checks for session mode. If it is not in session mode and
> there are no outstanding requests, it releases the containers before the
> container timeout has expired. If the state machine is on its way to
> scheduling new tasks during this time, those tasks will not be able to reuse
> these containers.
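To make the two proposals in the comment concrete, here is a rough sketch of the arithmetic only. It is not Tez code: the class name `ContainerReleasePolicy`, both method names, and all parameters are hypothetical, and it assumes headroom has already been converted to a container count and that the random delay is drawn uniformly.

```java
import java.util.Random;

/** Hypothetical helper illustrating the two proposals; not part of Tez. */
final class ContainerReleasePolicy {

    private ContainerReleasePolicy() {}

    /**
     * Proposal 1: hold at least min(configured value, fraction of headroom).
     * A configured value of 0 yields 0, i.e. no containers are held back.
     */
    static int minHeldContainers(int configuredMin, int headroomContainers,
                                 double headroomFraction) {
        if (configuredMin < 0 || headroomContainers < 0 || headroomFraction < 0.0) {
            throw new IllegalArgumentException("arguments must be non-negative");
        }
        // Assumption: round the headroom share down to whole containers.
        int fromHeadroom = (int) Math.floor(headroomContainers * headroomFraction);
        return Math.min(configuredMin, fromHeadroom);
    }

    /**
     * Proposal 2: release an idle container after timeout + rand(0, degradeTime).
     * A degrade time of 0 yields exactly the timeout, i.e. no spreading.
     */
    static long releaseDelayMillis(long timeoutMillis, long degradeMillis, Random rng) {
        if (timeoutMillis < 0 || degradeMillis < 0) {
            throw new IllegalArgumentException("arguments must be non-negative");
        }
        // Assumption: uniform jitter in [0, degradeMillis).
        long jitter = degradeMillis == 0 ? 0L : (long) (rng.nextDouble() * degradeMillis);
        return timeoutMillis + jitter;
    }
}
```

With a 5000 ms timeout and a 2000 ms degrade time, each idle container would get its own release delay somewhere in [5000, 7000) ms, so a batch of containers that went idle together would no longer be released at the same instant.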