[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13786513#comment-13786513
 ] 

Jason Lowe commented on MAPREDUCE-5562:
---------------------------------------

bq. Since we are using RMProxy, connection exceptions are handled in RMProxy and 
retried automatically, and we can also define other types of exceptions in 
RMProxy with different retry policies if needed. For work-preserving restart, AM 
will hang when RM is down and after RM comes up, it should be able to 
unregister successfully.

OK, so if we're having connection-level issues with the RM, it sounds like we 
will get some retries at a lower level, which is good.  I don't want AMs to 
simply give up just because of an isolated, temporary network connectivity 
issue.

So it sounds like we're left with the staging directory issue.  Can't we 
clean up the staging directory before leaving if it's the last attempt?
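
A minimal sketch of one way to read that suggestion, not the actual MR AM 
code. The names UnregisterSketch, Unregistrar, shutdown and deleteRecursively 
are hypothetical, and a local java.nio.file.Path stands in for the staging 
directory, which in practice lives on the job's filesystem. It assumes the 
directory should be removed once unregistration succeeds or once no further 
attempt can reuse it, and kept otherwise.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class UnregisterSketch {

    /** Hypothetical stand-in for the AM's unregister call to the RM. */
    interface Unregistrar {
        void unregister() throws IOException;
    }

    /** Deletes dir and everything under it; does nothing if dir is absent. */
    static void deleteRecursively(Path dir) throws IOException {
        if (!Files.exists(dir)) {
            return;
        }
        List<Path> paths;
        try (Stream<Path> walk = Files.walk(dir)) {
            // Reverse order puts children before their parent directory.
            paths = walk.sorted(Comparator.reverseOrder())
                        .collect(Collectors.toList());
        }
        for (Path p : paths) {
            Files.delete(p);
        }
    }

    /**
     * Tries to unregister, then decides whether to remove the staging
     * directory. Returns true if unregistration succeeded.
     * Assumption: lower-level retries (e.g. in the RM proxy) have already
     * been exhausted by the time unregister() throws.
     */
    static boolean shutdown(Unregistrar rm, Path stagingDir,
                            boolean isLastAttempt) throws IOException {
        boolean unregistered;
        try {
            rm.unregister();
            unregistered = true;
        } catch (IOException e) {
            unregistered = false;
        }
        // Keep the directory only when a later attempt may still need it.
        if (unregistered || isLastAttempt) {
            deleteRecursively(stagingDir);
        }
        return unregistered;
    }
}
```

With this ordering, a failed unregister() on a non-final attempt leaves the 
staging directory in place for the next attempt, while the final attempt 
always cleans up before exiting.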

> MR AM should exit when unregister() throws exception
> ----------------------------------------------------
>
>                 Key: MAPREDUCE-5562
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5562
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>            Reporter: Zhijie Shen
>            Assignee: Zhijie Shen
>         Attachments: MAPREDUCE-5562.1.patch, MAPREDUCE-5562.2.patch, 
> MAPREDUCE-5562.3.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.1#6144)
