[ https://issues.apache.org/jira/browse/TEZ-2687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14704141#comment-14704141 ]
Bikas Saha commented on TEZ-2687:
---------------------------------
Typo ("Realease"). Also, the log message should probably now be "initiating stop".
{code}
+  public synchronized void initiateStop() {
+    // release held containers
+    LOG.info("Realease held containers");
+    isStopStarted.set(true);
{code}
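This is probably going to cause a ConcurrentModificationException, presumably because removeTaskRequest modifies taskRequests while the loop is iterating over its keySet().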
{code}
+    // remove taskRequest from AMRMClient to avoid allocating new containers in the next heartbeat
+    LOG.info("Remove all the taskRequests");
+    for (Object task : taskRequests.keySet()) {
+      removeTaskRequest(task);
+    }
{code}
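One possible way around this, as a minimal sketch: iterate over a snapshot of the keys rather than the live keySet(). The class below is a hypothetical stand-in and not the actual scheduler code; it assumes taskRequests is a plain HashMap and that removeTaskRequest removes the entry from it.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the scheduler; names mirror the patch but the types are assumed.
public class RemoveAllTaskRequestsSketch {
  private final Map<Object, String> taskRequests = new HashMap<>();

  void addTaskRequest(Object task, String request) {
    taskRequests.put(task, request);
  }

  // Assumed to mutate taskRequests, which is what makes iterating the live keySet() unsafe.
  void removeTaskRequest(Object task) {
    taskRequests.remove(task);
  }

  // Copy the keys first so that removals cannot invalidate the iterator in use.
  void removeAllTaskRequests() {
    List<Object> snapshot = new ArrayList<>(taskRequests.keySet());
    for (Object task : snapshot) {
      removeTaskRequest(task);
    }
  }

  int pendingCount() {
    return taskRequests.size();
  }

  public static void main(String[] args) {
    RemoveAllTaskRequestsSketch sketch = new RemoveAllTaskRequestsSketch();
    sketch.addTaskRequest("task1", "req1");
    sketch.addTaskRequest("task2", "req2");
    sketch.addTaskRequest("task3", "req3");
    sketch.removeAllTaskRequests();
    System.out.println("pending after stop: " + sketch.pendingCount());
  }
}
```

Removing through an explicit Iterator would also work, but only if removeTaskRequest itself does not touch the map, so the snapshot is probably the smaller change here.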
The test should allocate only 2 containers so that 1 task request is still
pending when initiateStop is called. That way we can also verify that the
pending task requests are removed at the AMRMClient.
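The shape of that test could look roughly like the sketch below. Everything in it is hypothetical: FakeRmClient and FakeScheduler are toy stand-ins written for illustration, not Tez or YARN classes, and the real test would use the existing mocks instead.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class InitiateStopScenarioSketch {
  // Hypothetical stand-in for the AMRMClient: tracks outstanding and explicitly removed requests.
  static class FakeRmClient {
    final Set<String> outstanding = new LinkedHashSet<>();
    final List<String> removed = new ArrayList<>();
    void addRequest(String task) { outstanding.add(task); }
    void removeRequest(String task) {
      if (outstanding.remove(task)) {
        removed.add(task);
      }
    }
  }

  // Hypothetical stand-in for the task scheduler.
  static class FakeScheduler {
    final FakeRmClient client;
    final Set<String> pending = new LinkedHashSet<>();
    final List<String> heldContainers = new ArrayList<>();

    FakeScheduler(FakeRmClient client) { this.client = client; }

    void allocateTask(String task) {
      pending.add(task);
      client.addRequest(task);
    }

    // Satisfy the first `count` pending requests with containers.
    void onContainersAllocated(int count) {
      Iterator<String> it = pending.iterator();
      while (count-- > 0 && it.hasNext()) {
        String task = it.next();
        it.remove();
        client.outstanding.remove(task);
        heldContainers.add("container-for-" + task);
      }
    }

    // Release held containers and withdraw every still-pending request.
    void initiateStop() {
      heldContainers.clear();
      for (String task : new ArrayList<>(pending)) {
        client.removeRequest(task);
      }
      pending.clear();
    }
  }

  public static void main(String[] args) {
    FakeRmClient client = new FakeRmClient();
    FakeScheduler scheduler = new FakeScheduler(client);
    scheduler.allocateTask("task1");
    scheduler.allocateTask("task2");
    scheduler.allocateTask("task3");
    scheduler.onContainersAllocated(2); // only 2 containers, so task3 stays pending
    scheduler.initiateStop();
    System.out.println("removed at client: " + client.removed);
  }
}
```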
> ATS History shutdown happens before the min-held containers are released
> ------------------------------------------------------------------------
>
> Key: TEZ-2687
> URL: https://issues.apache.org/jira/browse/TEZ-2687
> Project: Apache Tez
> Issue Type: Bug
> Affects Versions: 0.6.2, 0.8.0, 0.7.1
> Reporter: Gopal V
> Assignee: Jeff Zhang
> Attachments: TEZ-2687-1.patch, TEZ-2687-2.patch
>
>
> When ATS goes into a GC pause under heavy loads and while it recovers, each
> Tez AM holds onto a few containers even though it is shutting down and will
> never accept any more DAGs.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)