[ https://issues.apache.org/jira/browse/HBASE-20867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16555056#comment-16555056 ]
Allan Yang commented on HBASE-20867:
------------------------------------

{quote}
The Master is aborting? Why then retry? Or is it that the Master is aborting and the retry may or may not happen... better retry than have the RS do an abort?
{quote}
Yes, the master is aborting, but that is not what matters. What matters is that when the RPC layer throws an exception to RSProcedureDispatcher (in this case because the master is aborting and has closed the connection, but there may be other connection exceptions too), we shouldn't conclude that there is a problem with the RS and abort a healthy RS.

{quote}
Should extend HBaseIOException:
public class ConnectionClosedException extends IOException {
{quote}
Done!

{quote}
else if (exception instanceof ConnectionClosedException) {
  return (ConnectionClosedException) new ConnectionClosedException(
    "Call to " + addr + " failed because " + exception).initCause(exception);
{quote}
Done!

> RS may get killed while master restarts
> ---------------------------------------
>
>                 Key: HBASE-20867
>                 URL: https://issues.apache.org/jira/browse/HBASE-20867
>             Project: HBase
>          Issue Type: Sub-task
>   Affects Versions: 3.0.0, 2.1.0, 2.0.1
>            Reporter: Allan Yang
>            Assignee: Allan Yang
>            Priority: Major
>             Fix For: 3.0.0, 2.0.2, 2.1.1
>
>         Attachments: HBASE-20867.branch-2.0.001.patch,
> HBASE-20867.branch-2.0.002.patch, HBASE-20867.branch-2.0.003.patch,
> HBASE-20867.branch-2.0.004.patch, HBASE-20867.branch-2.0.005.patch
>
>
> If the master is dispatching an RPC call to an RS while it is aborting, a
> connection exception may be thrown by the RPC layer (an IOException with a
> "Connection closed" message in this case). The RSProcedureDispatcher will
> regard it as an un-retryable exception and pass it to
> UnassignProcedure.remoteCallFailed, which will expire the RS.
> Actually, the RS is very healthy; only the master is restarting.
> I think we should handle these kinds of connection exceptions in
> RSProcedureDispatcher and retry the RPC call.
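
The retry idea described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from the attached patches: DispatcherRetrySketch, isRetryable and callWithRetry are hypothetical names, the address is passed as a plain String, and ConnectionClosedException is a local stand-in that extends IOException so the example compiles without HBase on the classpath (the patch makes it extend HBaseIOException).

```java
import java.io.IOException;
import java.util.concurrent.Callable;

// Hypothetical sketch; none of these names come from the HBase code base
// except ConnectionClosedException, which is re-declared here as a stand-in.
final class DispatcherRetrySketch {

  /** Stand-in for the patch's exception, which extends HBaseIOException. */
  static class ConnectionClosedException extends IOException {
    ConnectionClosedException(String message) {
      super(message);
    }
  }

  /** Mirrors the wrapping quoted in the comment: the original stays the cause. */
  static IOException wrap(String addr, Throwable exception) {
    if (exception instanceof ConnectionClosedException) {
      return (ConnectionClosedException) new ConnectionClosedException(
          "Call to " + addr + " failed because " + exception).initCause(exception);
    }
    return new IOException("Call to " + addr + " failed", exception);
  }

  /** Assumption: only connection-level failures count as retryable. */
  static boolean isRetryable(Throwable t) {
    return t instanceof ConnectionClosedException;
  }

  /** Retries the call on retryable failures, up to maxAttempts attempts. */
  static <T> T callWithRetry(Callable<T> call, int maxAttempts) throws Exception {
    if (maxAttempts < 1) {
      throw new IllegalArgumentException("maxAttempts must be at least 1");
    }
    Exception last = null;
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
      try {
        return call.call();
      } catch (Exception e) {
        if (!isRetryable(e)) {
          throw e; // anything else is reported to the caller immediately
        }
        last = e; // connection problem: the remote side may be healthy, try again
      }
    }
    throw last; // all attempts failed with a retryable exception
  }
}
```

The point of the split is that a closed connection says nothing about the health of the remote side, so it is retried, while any other failure is still surfaced at once. A real dispatcher would presumably also back off between attempts; that is omitted here.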