[ https://issues.apache.org/jira/browse/MAPREDUCE-5562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13786172#comment-13786172 ]

Jason Lowe commented on MAPREDUCE-5562:
---------------------------------------

How does this change interact with an RM restart scenario -- will it cause every AM trying to unregister to crash? Seems like the nature of the exception should be relevant when determining if this is a fatal error to the AM. If the error is a read timeout or connection refused then I'm not sure we want the AM to fall over immediately in those cases, especially when work-preserving restart is added to the RM. We certainly don't want clients to do so in the same scenarios. If the error is a bad token or something else that is not going to succeed on a retry then yeah, we should shut down the AM.

What if this is the last AM attempt? Do we really want to orphan the staging directory and fail to generate job history in those cases?

If we end up deciding System.exit is really the proper thing to do here then it should be using ExitUtil rather than calling System.exit directly.

> MR AM should exit when unregister() throws exception
> ----------------------------------------------------
>
>                 Key: MAPREDUCE-5562
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5562
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>            Reporter: Zhijie Shen
>            Assignee: Zhijie Shen
>         Attachments: MAPREDUCE-5562.1.patch, MAPREDUCE-5562.2.patch,
>                      MAPREDUCE-5562.3.patch
>
>

--
This message was sent by Atlassian JIRA
(v6.1#6144)
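
For illustration only, the retryable-versus-fatal distinction raised in the comment above might be sketched roughly as follows. This is a minimal sketch using only JDK exception types; the class name UnregisterFailurePolicy, the Action enum, and the classify method are hypothetical and do not come from Hadoop or from any of the attached patches. It assumes that connection-refused and read-timeout failures surface as java.net.ConnectException or java.net.SocketTimeoutException somewhere in the cause chain, and it treats everything else (for example a bad token) as not worth retrying.

```java
import java.io.IOException;
import java.net.ConnectException;
import java.net.SocketTimeoutException;

// Hypothetical helper: not part of Hadoop or of any MAPREDUCE-5562 patch.
final class UnregisterFailurePolicy {

    enum Action { RETRY, SHUT_DOWN }

    // Upper bound on cause-chain hops, so a malformed chain cannot loop forever.
    private static final int MAX_CAUSE_DEPTH = 32;

    // Decide what to do after unregister() failed with the given error.
    // RPC layers often wrap the original network error, so the whole
    // cause chain is inspected rather than only the outermost exception.
    static Action classify(Throwable error) {
        Throwable t = error;
        for (int depth = 0; t != null && depth < MAX_CAUSE_DEPTH; depth++) {
            // Assumption: these two types stand for "connection refused"
            // and "read timeout", which may succeed once the RM is back.
            if (t instanceof ConnectException || t instanceof SocketTimeoutException) {
                return Action.RETRY;
            }
            t = t.getCause();
        }
        // Anything else is assumed not to succeed on a retry.
        // In the AM, this path would go through ExitUtil rather than
        // calling System.exit directly, as the comment suggests.
        return Action.SHUT_DOWN;
    }

    public static void main(String[] args) {
        Throwable refused = new IOException("unregister failed",
                new ConnectException("Connection refused"));
        Throwable badToken = new IOException("token is invalid");
        System.out.println(classify(refused));
        System.out.println(classify(badToken));
    }
}
```

Whether SHUT_DOWN should also skip staging-directory cleanup and job history generation on the last AM attempt is a separate question that this sketch does not address.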