[
https://issues.apache.org/jira/browse/TEZ-2300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14700194#comment-14700194
]
Rohini Palaniswamy commented on TEZ-2300:
-----------------------------------------
When a user aborts a Pig script, Pig kills the jobs it launched from its
shutdown hook. What I am looking for is the same behaviour as killing a
MapReduce job: the job should stop whatever it is doing and the AM should exit
in less than half a minute.
bq. Are we waiting for the DAG to be finished?
No. We are trying to kill it. It should be interrupted and processing
stopped.
bq. Are we waiting until the AM is closed as well?
Currently the call is not blocking. It should block and exit after the kill
succeeds.
bq. Or is the most important aspect to reduce the amount of time it takes to
shut down an AM with a DAG running?
That as well. As with MapReduce, the AM should be terminated after a timeout
period if the graceful kill/shutdown does not work.
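To make the requested behaviour concrete, here is a minimal sketch of a blocking stop that falls back to a forced kill after a timeout. It is not Tez code: `AppHandle`, `requestGracefulStop()`, `forceKill()` and `stopBlocking()` are hypothetical names used only for illustration, and how the real client would implement either call is left open.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

final class KillWithTimeout {

    /** Hypothetical handle on a running application; not a Tez or YARN type. */
    interface AppHandle {
        /** Asks the application to stop gracefully and returns once it has stopped. */
        void requestGracefulStop() throws Exception;

        /** Forcibly terminates the application. */
        void forceKill();
    }

    /**
     * Blocks until the application has been stopped one way or the other.
     * Returns true if the graceful stop finished within the timeout,
     * false if the forced kill had to be used.
     */
    static boolean stopBlocking(AppHandle app, long timeoutMillis) {
        // Daemon thread, so a hung graceful stop cannot keep the JVM alive.
        ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "graceful-stop");
            t.setDaemon(true);
            return t;
        });
        Future<?> stop = executor.submit(() -> {
            app.requestGracefulStop();
            return null;
        });
        try {
            stop.get(timeoutMillis, TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException | ExecutionException e) {
            // Graceful stop hung or failed: give up on it and kill.
            stop.cancel(true);
            app.forceKill();
            return false;
        } catch (InterruptedException e) {
            // Caller was interrupted while waiting: still make sure the app dies.
            stop.cancel(true);
            app.forceKill();
            Thread.currentThread().interrupt();
            return false;
        } finally {
            executor.shutdownNow();
        }
    }
}
```

A client such as Pig could call something like `stopBlocking(handle, 30_000)` from its shutdown hook, so that Ctrl+C returns only after the application is really gone, and within a bounded time.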
bq. With the pig interactive command line, will pig want to cancel a DAG and
run another in the same AM?
Currently there are no APIs to cancel a DAG and I don't see the need at this
point to cancel a DAG and reuse that AM.
> TezClient.stop() takes a lot of time or does not work sometimes
> ---------------------------------------------------------------
>
> Key: TEZ-2300
> URL: https://issues.apache.org/jira/browse/TEZ-2300
> Project: Apache Tez
> Issue Type: Bug
> Reporter: Rohini Palaniswamy
> Assignee: Jonathan Eagles
> Attachments: TEZ-2300.1.patch, TEZ-2300.2.patch, TEZ-2300.3.patch,
> TEZ-2300.4.patch, syslog_dag_1428329756093_325099_1_post
>
>
> Noticed this with a couple of Pig scripts which were not behaving well (AM
> close to OOM, etc.) and even with some that were running fine. Pig calls
> TezClient.stop() in its shutdown hook. Ctrl+C on the Pig script either exits
> immediately or hangs. In both cases it takes a long time for the YARN
> application to go to the KILLED state. Many times I just end up calling yarn
> application -kill separately after waiting for 5 minutes or more for it to
> get killed.