[ https://issues.apache.org/jira/browse/TEZ-3935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16474529#comment-16474529 ]
TezQA commented on TEZ-3935:
----------------------------

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12923311/TEZ-3935.001.patch
  against master revision 60645a8.

    {color:green}+1 @author{color}. The patch does not contain any @author tags.

    {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files.

    {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.

    {color:green}+1 javadoc{color}. There were no new javadoc warning messages.

    {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 3.0.1) warnings.

    {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.

    {color:red}-1 core tests{color}. The patch failed these unit tests in :
                   org.apache.tez.test.TestAMRecovery

Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/2800//testReport/
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/2800//console

This message is automatically generated.

> DAG aware scheduler should release unassigned new containers rather than hold them
> ----------------------------------------------------------------------------------
>
>                 Key: TEZ-3935
>                 URL: https://issues.apache.org/jira/browse/TEZ-3935
>             Project: Apache Tez
>          Issue Type: Bug
>            Reporter: Jason Lowe
>            Assignee: Jason Lowe
>            Priority: Major
>         Attachments: TEZ-3935.001.patch
>
>
> I saw a case for a very large job with many containers where the DAG aware
> scheduler was getting behind on assigning containers. Newly assigned
> containers were not finding any matching request, so they were queued for
> reuse processing. However it took so long to get through all of the task and
> container events that the container allocations expired before the container
> was finally assigned and attempted to be launched.
> Newly assigned containers are assigned to their matching requests, even if
> that violates the DAG priorities, so it should be safe to simply release
> these if no tasks could be found to use them. The matching request has
> either been removed or already satisfied with a reused container. Besides,
> if we can't find any tasks to take the newly assigned container then it is
> very likely we have plenty of reusable containers already, and keeping more
> containers just makes the job a resource hog on the cluster.
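
The behavior proposed in the description can be illustrated with a minimal sketch. This is not the TEZ-3935 patch and does not use Tez or YARN classes: NewContainerSketch, Outcome, addRequest and onNewContainer are hypothetical names, and matching a request by memory size is an assumption made only to keep the example small. It shows a newly allocated container being assigned when a pending request fits it and being released immediately otherwise, instead of being queued for reuse.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;

/** Hypothetical model of the proposed behavior; none of these types are Tez classes. */
final class NewContainerSketch {
    public enum Outcome { ASSIGNED, RELEASED }

    // Pending task requests, reduced here to the memory (MB) each one asks for.
    private final Deque<Integer> pendingRequestsMb = new ArrayDeque<>();
    // Containers handed straight back to the cluster.
    private final List<String> released = new ArrayList<>();

    public void addRequest(int memoryMb) {
        pendingRequestsMb.add(memoryMb);
    }

    public List<String> releasedContainers() {
        return released;
    }

    /** Called when the cluster hands over a brand-new container. */
    public Outcome onNewContainer(String containerId, int memoryMb) {
        // Look for the first pending request that fits into this container.
        for (Iterator<Integer> it = pendingRequestsMb.iterator(); it.hasNext();) {
            if (it.next() <= memoryMb) {
                it.remove();              // the request is now satisfied
                return Outcome.ASSIGNED;
            }
        }
        // No task can use the container: release it rather than hold it
        // for reuse, so the allocation cannot expire while it sits in a queue.
        released.add(containerId);
        return Outcome.RELEASED;
    }

    public static void main(String[] args) {
        NewContainerSketch scheduler = new NewContainerSketch();
        scheduler.addRequest(1024);
        System.out.println(scheduler.onNewContainer("container_1", 1024));
        System.out.println(scheduler.onNewContainer("container_2", 1024));
        System.out.println(scheduler.releasedContainers());
    }
}
```

In this sketch the second container finds no pending request, because the only one was already satisfied, so it is released at once. That mirrors the argument in the description that the matching request has either been removed or already satisfied with a reused container.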