[ https://issues.apache.org/jira/browse/TEZ-2687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14706168#comment-14706168 ]
Jeff Zhang commented on TEZ-2687:
---------------------------------
[~hitesh] I will commit the new patch without the config
tez.test.history-service.stop.sleep.secs. We can add a new
HistoryLoggingService for system tests.
bq. Lastly, the sleep should happen after all ats events are flushed to ATS.
The current sleep is being done before the flush happens which seems incorrect.
Is there any difference between sleeping before and after the ATS events are
flushed to ATS? Are you concerned about the DAGClient? I think the client
normally won't switch to TimelineClient. We only switch to TimelineClient when
the app is done and the client could not get the DAGStatus through AM RPC. And I
believe releasing containers should not depend on whether ATS events are flushed
to ATS.
> ATS History shutdown happens before the min-held containers are released
> ------------------------------------------------------------------------
>
> Key: TEZ-2687
> URL: https://issues.apache.org/jira/browse/TEZ-2687
> Project: Apache Tez
> Issue Type: Bug
> Affects Versions: 0.6.2, 0.8.0, 0.7.1
> Reporter: Gopal V
> Assignee: Jeff Zhang
> Attachments: TEZ-2687-1.patch, TEZ-2687-2.patch, TEZ-2687-3.patch,
> TEZ-2687-4.patch, TEZ-2687-6.patch, TEZ-2687-7.patch
>
>
> When ATS goes into a GC pause under heavy load, each Tez AM holds onto a few
> containers while ATS recovers, even though the AM is shutting down and will
> never accept any more DAGs.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)