[ 
https://issues.apache.org/jira/browse/HBASE-20867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16555056#comment-16555056
 ] 

Allan Yang commented on HBASE-20867:
------------------------------------

{quote}
The Master is aborting? Why then retry? Or is it that the Master is aborting 
and the retry may or may not happen... better retry than have the RS do an 
abort?
{quote}
Yes, the master is aborting, but that is not the point. What matters is that 
when the RPC layer throws an exception to RSProcedureDispatcher (here because 
the aborting master closed the connection, though other connection exceptions 
are possible too), we should not assume the RS has a problem and abort a 
healthy RS.
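
To illustrate the idea (a minimal sketch, not the code in the patch): the 
dispatcher can treat connection-level failures as retryable and leave every 
other failure on the existing path. The class name RetryDecisionSketch, the 
Action enum, and the decide method are hypothetical names used only here; the 
nested ConnectionClosedException is a local stand-in, and including 
java.net.ConnectException as a second retryable type is an assumption.

```java
import java.io.IOException;
import java.net.ConnectException;

final class RetryDecisionSketch {
  /** Local stand-in for a "connection closed" error raised by the RPC layer. */
  static class ConnectionClosedException extends IOException {
    ConnectionClosedException(String message) {
      super(message);
    }
  }

  /** Hypothetical outcomes; the real dispatcher has its own control flow. */
  enum Action { RETRY, FAIL_REMOTE_CALL }

  /**
   * A connection-level failure says nothing about the health of the RS,
   * so it is retried. Anything else stays on the failure path.
   */
  static Action decide(IOException e) {
    if (e instanceof ConnectionClosedException || e instanceof ConnectException) {
      return Action.RETRY;
    }
    return Action.FAIL_REMOTE_CALL;
  }

  public static void main(String[] args) {
    System.out.println(decide(new ConnectionClosedException("Connection closed")));
    System.out.println(decide(new IOException("some other failure")));
  }
}
```

In this sketch only FAIL_REMOTE_CALL would lead to remoteCallFailed and thus to 
expiring the RS; a closed connection on the master side never gets that far.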

{quote}
Should extend HBaseIOException: public class ConnectionClosedException 
extends IOException
{quote}
Done!

{quote}
else if (exception instanceof ConnectionClosedException) {
    return (ConnectionClosedException) new ConnectionClosedException(
        "Call to " + addr + " failed because " + exception).initCause(exception);
{quote}
Done!
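
For reference, a self-contained sketch of the two review points taken together: 
the exception extends HBaseIOException, and the wrapping code keeps the 
exception type while adding the remote address and preserving the cause. The 
nested HBaseIOException is a local stand-in for 
org.apache.hadoop.hbase.HBaseIOException so the example compiles on its own; 
WrapExceptionSketch and wrap are hypothetical names, not taken from the patch.

```java
import java.io.IOException;

final class WrapExceptionSketch {
  /** Local stand-in for org.apache.hadoop.hbase.HBaseIOException. */
  static class HBaseIOException extends IOException {
    HBaseIOException(String message) {
      super(message);
    }
  }

  /** Thrown when the connection is closed underneath an in-flight call. */
  static class ConnectionClosedException extends HBaseIOException {
    ConnectionClosedException(String message) {
      super(message);
    }
  }

  /**
   * Re-wraps a ConnectionClosedException so the message names the remote
   * address. initCause returns Throwable, hence the cast back to the
   * concrete type. Other exceptions are returned unchanged in this sketch.
   */
  static IOException wrap(String addr, IOException exception) {
    if (exception instanceof ConnectionClosedException) {
      return (ConnectionClosedException) new ConnectionClosedException(
          "Call to " + addr + " failed because " + exception).initCause(exception);
    }
    return exception;
  }

  public static void main(String[] args) {
    IOException original = new ConnectionClosedException("Connection closed");
    // "rs1.example.org:16020" is a made-up address for illustration.
    IOException wrapped = wrap("rs1.example.org:16020", original);
    System.out.println(wrapped.getMessage());
    System.out.println(wrapped.getCause() == original);
  }
}
```

Because the wrapped exception keeps its type, a caller can still test it with 
instanceof after wrapping, and getCause() still leads back to the original.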



> RS may get killed while master restarts
> ---------------------------------------
>
>                 Key: HBASE-20867
>                 URL: https://issues.apache.org/jira/browse/HBASE-20867
>             Project: HBase
>          Issue Type: Sub-task
>    Affects Versions: 3.0.0, 2.1.0, 2.0.1
>            Reporter: Allan Yang
>            Assignee: Allan Yang
>            Priority: Major
>             Fix For: 3.0.0, 2.0.2, 2.1.1
>
>         Attachments: HBASE-20867.branch-2.0.001.patch, 
> HBASE-20867.branch-2.0.002.patch, HBASE-20867.branch-2.0.003.patch, 
> HBASE-20867.branch-2.0.004.patch, HBASE-20867.branch-2.0.005.patch
>
>
> If the master dispatches an RPC call to an RS while it is aborting, the RPC 
> layer may throw a connection exception (in this case an IOException with a 
> "Connection closed" message). The RSProcedureDispatcher regards it as an 
> un-retryable exception and passes it to UnassignProcedure.remoteCallFailed, 
> which expires the RS.
> Actually, the RS is perfectly healthy; only the master is restarting.
> I think we should handle these kinds of connection exceptions in 
> RSProcedureDispatcher and retry the RPC call.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
