[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13193413#comment-13193413
 ] 

Ravi Prakash commented on MAPREDUCE-3614:
-----------------------------------------

No problem. Thanks for the reply!

{quote}Currently, if SIGKILL the application it shows up as State:FAILED & 
FinalStatus:FAILED.
on SIGTERM, its State:FINISHED & FinalStatus:UNDEFINED.
Is it the other way around ? SIGKILL -> Container may be retried. SIGTERM -> 
Marked as FAILED.{quote}
On a SIGKILL, the AM is indeed retried (if 
yarn.resourcemanager.am.max-retries > 1). It shows up as FAILED only when I 
SIGKILL every attempt, or when yarn.resourcemanager.am.max-retries == 1.

{quote}The RM may decide to kill an application - in which case the NodeManager 
kills the container. The NM sends a SIGTERM, and a delayed SIGKILL in case the 
container does not exit.
If the RM requests the kill - we should make sure the App isn't retried.{quote}
Aaah! I did not know about this use case. So a SIGTERM should not lead to 
retries. Cool.
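
To make sure I have the rule straight, here is a rough sketch of it as I 
understand it from this thread. It is only a model, not YARN code: the class, 
the enums and the decide() method are hypothetical names, and it assumes 
max-retries counts total attempts (so a value of 1 means no retry).

```java
// Hypothetical model of the retry rule discussed in this thread; NOT actual YARN code.
final class AmRetrySketch {
    /** Why the AM container went away (names are illustrative assumptions). */
    enum ExitCause { SIGKILL, SIGTERM_FROM_RM_KILL }

    /** What should happen to the application next. */
    enum Outcome { RETRY, FAILED, KILLED_NO_RETRY }

    /**
     * @param cause         why the current attempt ended
     * @param attemptsSoFar attempts started so far, including the one that just ended
     * @param maxRetries    stand-in for yarn.resourcemanager.am.max-retries,
     *                      assumed here to mean the total number of attempts allowed
     */
    static Outcome decide(ExitCause cause, int attemptsSoFar, int maxRetries) {
        // An RM-requested kill arrives as SIGTERM and must never trigger a retry.
        if (cause == ExitCause.SIGTERM_FROM_RM_KILL) {
            return Outcome.KILLED_NO_RETRY;
        }
        // A SIGKILL looks like a crash: retry while attempts remain, else fail.
        return attemptsSoFar < maxRetries ? Outcome.RETRY : Outcome.FAILED;
    }
}
```

With maxRetries == 1 the first SIGKILL already yields FAILED, and with 
maxRetries == 2 it takes two SIGKILLs, which matches what I see.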

{quote}In the case of a SIGKILL, nothing much can be done (easily). For a 
SIGTERM - like you've mentioned before, the history file can be moved over to 
the correct location - and a useful diagnostic message. (possibly a change to a 
history event).

Other than this scenario - we probably do not need to worry a lot about a 
container receiving a SIGKILL / SIGTERM.{quote}
Agreed. Awesome. 
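
For the SIGTERM side, one possible shape is a JVM shutdown hook, since the JVM 
runs registered hooks on SIGTERM but not on SIGKILL. The sketch below is just 
an illustration under assumptions: the class name, the method names and both 
paths are hypothetical, it uses plain local files rather than HDFS, and it is 
not what MRAppMaster currently does.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical sketch: move a history file and emit a diagnostic on SIGTERM.
final class SigtermCleanupSketch {

    /** Moves the in-progress history file to its final location and returns a diagnostic. */
    static String finishHistory(Path inProgress, Path done) throws IOException {
        Path parent = done.getParent();
        if (parent != null) {
            Files.createDirectories(parent); // make sure the target directory exists
        }
        Files.move(inProgress, done, StandardCopyOption.REPLACE_EXISTING);
        return "AM received SIGTERM; history moved to " + done;
    }

    /** Registers the cleanup as a shutdown hook; the JVM runs it on SIGTERM, never on SIGKILL. */
    static Thread install(Path inProgress, Path done) {
        Thread hook = new Thread(() -> {
            try {
                System.err.println(finishHistory(inProgress, done));
            } catch (IOException e) {
                // Nothing more can be done this late in shutdown; just report it.
                System.err.println("History cleanup failed: " + e);
            }
        }, "sigterm-history-cleanup");
        Runtime.getRuntime().addShutdownHook(hook);
        return hook;
    }
}
```

The hook has to finish before the NM's delayed SIGKILL arrives, so it should 
stay short.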

I've been banging my head against this problem, but as soon as 
{noformat}scheduler.finishApplicationMaster(request);{noformat} 
is called in RMCommunicator.java, the MRAppMaster and the Client exit with the 
Exception I pasted in Comment #1. I'm trying to make it not do that. Any help 
would be appreciated. 

                
> finalState UNDEFINED if AM is killed by hand
> --------------------------------------------
>
>                 Key: MAPREDUCE-3614
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3614
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: mrv2
>    Affects Versions: 0.23.0
>            Reporter: Ravi Prakash
>            Assignee: Ravi Prakash
>         Attachments: MAPREDUCE-3614.branch-0.23.patch
>
>
> Courtesy [~dcapwell]
> {quote}
> If the AM is running and you kill the process (sudo kill #pid), the State in 
> Yarn would be FINISHED and FinalStatus is UNDEFINED.  The Tracking UI would 
> say "History" and point to the proxy url (which will redirect to the history 
> server).
> The state should be more descriptive that the job failed and the tracker url 
> shouldn't point to the history server.
> {quote}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
