[ https://issues.apache.org/jira/browse/HBASE-20634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-20634:
--------------------------
    Status: Patch Available  (was: Open)

.001 See below.

{code}
    HBASE-20634 Reopen region while server crash can cause the procedure to be 
stuck

    A reattempt at fixing HBASE-20173 "[AMv2] DisableTableProcedure
    concurrent to ServerCrashProcedure can deadlock"

    The scenario is an SCP, after processing WALs, going to assign regions
    that were on the crashed server, but a concurrent Procedure gets in
    there first and tries to unassign a region that was on the crashed
    server (could be part of a move procedure, a disable table, etc.). The
    unassign happens to run AFTER the SCP has released all RPCs that were
    going against the crashed server. The unassign fails because the
    server is crashed. The unassign used to suspend itself, but it would
    never be woken up because the server it was going against had already
    been processed. Worse, the SCP could make no progress because the
    suspended unassign held the lock on a region the SCP wanted to assign.

    Here we add to the unassign recognition of the state where it is
    running post-SCP cleanup of RPCs. If that state is detected, the
    unassign moves to finish instead of suspending itself.

    Includes a nice unit test by Duo Zhang that reproduces the hung
    scenario.

    M 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/FailedRemoteDispatchException.java
     Moved this class back to hbase-procedure where it belongs.

    M 
hbase-procedure/src/main/java/org/apache/hadoop/hbase/procedure2/NoNodeDispatchException.java
    M 
hbase-procedure/src/main/java/org/apache/hadoop/hbase/procedure2/NoServerDispatchException.java
    M 
hbase-procedure/src/main/java/org/apache/hadoop/hbase/procedure2/NullTargetServerDispatchException.java
     Specializations on FRDE so we can be more particular when we say
     there was a problem.

    M 
hbase-procedure/src/main/java/org/apache/hadoop/hbase/procedure2/RemoteProcedureDispatcher.java
     Change addOperationToNode so we throw exceptions that give more
     detail on the issue rather than a mysterious true/false.

    M hbase-protocol-shaded/src/main/protobuf/MasterProcedure.proto
     Undo SERVER_CRASH_HANDLE_RIT2. Bad idea (from HBASE-20173)

    M 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/ServerManager.java
     Have expireServer return true if it actually queued an expiration. Used
     later in this patch.

    M 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/AssignmentManager.java
     Hide methods that shouldn't be public. Add a particular check used out
     in unassign procedure failure processing.

    M 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/MoveRegionProcedure.java
     Check that server we're to move from is actually online (might
     catch a few silly move requests early).

    M 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/RegionStates.java
     Add doc on ServerState; it wasn't really being used. Now we actually
     stamp a server OFFLINE after its WALs have been split, which means it
     is safe to assign since all WALs have been processed. Add methods to
     update SPLITTING and to set OFFLINE after splitting is done.

    M 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/RegionTransitionProcedure.java
     Change logging to be new-style and less repetitive of info.
     Cater to new way in which .addOperationToNode returns info (exceptions
     rather than true/false).

    M 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/UnassignProcedure.java
     Add looking for the case where the unassign failed AND we should not
     suspend, because we will never be woken up: the SCP is already past
     doing this for all stuck RPCs.

     Some cleanup of the failure processing, grouping the cases where we
     can proceed.

     TODOs have been handled in this refactor, including the TODO that
     wonders whether it is possible that there are concurrent failures
     coming in (Yes).

    M 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/ServerCrashProcedure.java
     Doc and removing the old HBASE-20173 'fix'.
     Also updating ServerStateNode post WAL splitting so it gets marked
     OFFLINE.

    A 
hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestServerCrashProcedureStuck.java
     Nice test by Duo Zhang.
{code}
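The dispatcher change above (typed exceptions from addOperationToNode instead of a bare true/false) can be sketched roughly as below. This is a simplified illustration, not the actual HBase code: the exception class names mirror the files listed in the patch, but the String-based signatures and the nodeMap shape here are assumptions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class FailedRemoteDispatchException extends Exception {
  FailedRemoteDispatchException(String msg) { super(msg); }
}
// Specializations on FRDE so callers can be more particular about
// what went wrong.
class NullTargetServerDispatchException extends FailedRemoteDispatchException {
  NullTargetServerDispatchException(String msg) { super(msg); }
}
class NoServerDispatchException extends FailedRemoteDispatchException {
  NoServerDispatchException(String msg) { super(msg); }
}
class NoNodeDispatchException extends FailedRemoteDispatchException {
  NoNodeDispatchException(String msg) { super(msg); }
}

class RemoteProcedureDispatcher {
  // nodeMap is a ConcurrentMap, so its presence/absence of a server
  // entry can double as the "is this server still dispatchable" guard
  // mentioned in the issue.
  private final Map<String, List<String>> nodeMap = new ConcurrentHashMap<>();

  void addNode(String server) { nodeMap.put(server, new ArrayList<>()); }

  void removeNode(String server) { nodeMap.remove(server); }

  // Previously returned boolean; now the failure mode is explicit in
  // the exception type rather than a mysterious false.
  void addOperationToNode(String server, String op)
      throws FailedRemoteDispatchException {
    if (server == null) {
      throw new NullTargetServerDispatchException("null server for " + op);
    }
    List<String> ops = nodeMap.get(server);
    if (ops == null) {
      // The server's node is gone, e.g. an SCP already expired it.
      throw new NoServerDispatchException(server + " not online; op=" + op);
    }
    ops.add(op);
  }
}
```

With typed failures, a caller such as the unassign path can distinguish "the server's node is already gone, so an SCP handled it" from other dispatch problems, instead of guessing from a boolean.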

> Reopen region while server crash can cause the procedure to be stuck
> --------------------------------------------------------------------
>
>                 Key: HBASE-20634
>                 URL: https://issues.apache.org/jira/browse/HBASE-20634
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Duo Zhang
>            Assignee: stack
>            Priority: Critical
>             Fix For: 3.0.0, 2.1.0, 2.0.1
>
>         Attachments: HBASE-20634-UT.patch, HBASE-20634.branch-2.0.001.patch
>
>
> Found this when implementing HBASE-20424, where we will transition the peer 
> sync replication state while there is a server crash.
> The problem is that, in ServerCrashAssign, we do not have the region lock, so 
> it is possible that after we call handleRIT to clear the existing 
> assign/unassign procedures related to this rs, and before we schedule the 
> assign procedures, we schedule an unassign procedure for a region on the 
> crashed rs. This procedure will not receive the ServerCrashException; 
> instead, in addToRemoteDispatcher, it will find that it cannot dispatch the 
> remote call, and a FailedRemoteDispatchException will be raised. But we do 
> not treat this exception the same as ServerCrashException; instead, we will 
> try to expire the rs. Obviously the rs has already been marked as expired, 
> so this is almost a no-op. Then the procedure will be stuck there forever.
> A possible way to fix it is to treat FailedRemoteDispatchException the same 
> as ServerCrashException, as it is created in addToRemoteDispatcher only, and 
> the only reason we cannot dispatch a remote call is that the rs is already 
> dead. The nodeMap is a ConcurrentMap, so I think we could use it as a guard.
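The core decision the patch adds (finish instead of suspending when the crashed server has already been fully processed) can be sketched as below. The ServerState values mirror the patch notes above, but shouldUnassignSuspend and the map-based lookup are hypothetical simplifications for illustration, not HBase APIs.

```java
import java.util.Map;

// Per the patch notes: RegionStates now stamps a server OFFLINE once
// its WALs have been split, i.e. once the SCP is past waking stuck RPCs.
enum ServerState { ONLINE, SPLITTING, OFFLINE }

class UnassignDecision {
  /**
   * Returns true if a failed unassign should suspend and wait for the
   * SCP to wake it, false if it can simply move to finish. Hypothetical
   * helper; the real check lives in UnassignProcedure/AssignmentManager.
   */
  static boolean shouldUnassignSuspend(Map<String, ServerState> states,
                                       String server) {
    ServerState s = states.getOrDefault(server, ServerState.ONLINE);
    if (s == ServerState.OFFLINE) {
      // SCP has finished its RPC cleanup for this server; suspending
      // now means waiting for a wake-up that will never come.
      return false;
    }
    // SCP still splitting WALs (or server presumed live): safe to
    // suspend, the SCP's cleanup step will wake us.
    return true;
  }
}
```

Under this sketch, the forever-stuck case in the report (unassign suspends after the crashed rs was already expired and processed) becomes a plain finish instead.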



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
