[ 
https://issues.apache.org/jira/browse/TEZ-2300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14700334#comment-14700334
 ] 

Rohini Palaniswamy commented on TEZ-2300:
-----------------------------------------

bq. DAGClient::tryKillDAG()
    Sorry, I missed the DAGClient API as I was only looking at the TezClient API.

bq. Not sure if Pig is using shutdownTezAM() or just calling killApplication on 
YARN.
  We do not call killApplication on YARN. We call TezClient.stop(), which calls 
proxy.shutdownSession. TezClient.stop() tries to kill via YARN, but only if it 
was not able to connect and send the shutdown request to the Tez AM. I don't 
think I have seen cases that went into that condition. 
    The problem is that in bad cases, such as a big event queue backlog, the 
shutdown happens only after 10-15 mins. It should also kill via YARN if the 
shutdown does not happen within a reasonable amount of time, in addition to 
when it is not able to connect.

{code}
if (!sessionShutdownSuccessful) {
  LOG.info("Could not connect to AM, killing session via YARN"
      + ", sessionName=" + clientName
      + ", applicationId=" + sessionAppId);
  try {
    frameworkClient.killApplication(sessionAppId);
  } catch (ApplicationNotFoundException e) {
    LOG.info("Failed to kill nonexistent application " + sessionAppId, e);
  } catch (YarnException e) {
    throw new TezException(e);
  }
}
{code}
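
To illustrate the suggestion above (fall back to a YARN kill when the graceful shutdown does not finish in time, not only when the AM is unreachable), here is a minimal, self-contained sketch. It does not use the Tez or YARN APIs: BoundedShutdown, Session, and the timeout and poll parameters are all hypothetical names, and the three Session methods only stand in for proxy.shutdownSession, a check of the application state, and frameworkClient.killApplication. It is not the code from any of the attached patches.

```java
import java.util.concurrent.TimeUnit;

/** Sketch only: graceful shutdown with a deadline, then a forced kill. */
final class BoundedShutdown {

    /** Hypothetical stand-in for the calls made from TezClient.stop(). */
    interface Session {
        /** Ask the AM to shut down; returns false if it could not be reached. */
        boolean requestShutdown();

        /** True once the application has reached a terminal state. */
        boolean isFinished();

        /** Forced kill, standing in for killApplication on YARN. */
        void forceKill();
    }

    private BoundedShutdown() {
    }

    /**
     * Returns true if the session finished on its own within timeoutMs,
     * false if the forced kill had to be used.
     */
    static boolean stop(Session session, long timeoutMs, long pollMs) {
        if (session.requestShutdown()) {
            long deadline = System.nanoTime()
                + TimeUnit.MILLISECONDS.toNanos(timeoutMs);
            // Poll until the deadline passes (overflow-safe comparison).
            while (System.nanoTime() - deadline < 0) {
                if (session.isFinished()) {
                    return true;
                }
                try {
                    Thread.sleep(pollMs);
                } catch (InterruptedException e) {
                    // Keep the interrupt flag and stop waiting; kill below.
                    Thread.currentThread().interrupt();
                    break;
                }
            }
            // One last check so a shutdown that just completed is not killed.
            if (session.isFinished()) {
                return true;
            }
        }
        // Either the AM was unreachable or the graceful shutdown timed out.
        session.forceKill();
        return false;
    }
}
```

With this shape, a caller such as a shutdown hook waits at most roughly timeoutMs before the kill is issued, instead of waiting for the AM to drain its backlog.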

> TezClient.stop() takes a lot of time or does not work sometimes
> ---------------------------------------------------------------
>
>                 Key: TEZ-2300
>                 URL: https://issues.apache.org/jira/browse/TEZ-2300
>             Project: Apache Tez
>          Issue Type: Bug
>            Reporter: Rohini Palaniswamy
>            Assignee: Jonathan Eagles
>         Attachments: TEZ-2300.1.patch, TEZ-2300.2.patch, TEZ-2300.3.patch, 
> TEZ-2300.4.patch, syslog_dag_1428329756093_325099_1_post 
>
>
>   Noticed this with a couple of Pig scripts which were not behaving well (AM 
> close to OOM, etc.) and even with some that were running fine. Pig calls 
> TezClient.stop() in a shutdown hook. Ctrl+C to the Pig script either exits 
> immediately or hangs. In both cases it takes a long time for the YARN 
> application to go to the KILLED state. Many times I just end up calling yarn 
> application -kill separately after waiting for 5 mins or more for it to get 
> killed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
