[ 
https://issues.apache.org/jira/browse/HBASE-21508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16698076#comment-16698076
 ] 

Duo Zhang commented on HBASE-21508:
-----------------------------------

The problem is that we hold a lock while handling reportRegionStateTransition, 
so if we are closing meta and also closing another region on the same server, 
there will be a deadlock: the reportRegionStateTransition for meta is blocked 
by the one for the other region, but the reportRegionStateTransition for that 
region cannot finish because meta is not online.

So I changed it to use a ReadWriteLock instead. In reportRegionStateTransition 
we take the read lock, and in submitServerCrash we take the write lock.
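
A minimal sketch of the read/write lock idea, not the code in the patch. The 
ServerFence class and its dead flag are hypothetical names used only for 
illustration; only the two method names come from the discussion above.

```java
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical per-server fence; an illustration, not an HBase class.
class ServerFence {
    private final ReadWriteLock lock = new ReentrantReadWriteLock();
    // Guarded by lock: written under the write lock, read under the read lock.
    private boolean dead = false;

    // Reports share the read lock, so a report for one region does not
    // block a concurrent report for another region (for example meta).
    // Returns false when the report is ignored because the server is dead.
    boolean reportRegionStateTransition(Runnable applyTransition) {
        lock.readLock().lock();
        try {
            if (dead) {
                return false; // fence: ignore reports from a dead server
            }
            applyTransition.run();
            return true;
        } finally {
            lock.readLock().unlock();
        }
    }

    // Crash handling takes the write lock, so it waits for in-flight
    // reports to finish and excludes new ones while marking the server dead.
    void submitServerCrash() {
        lock.writeLock().lock();
        try {
            dead = true;
        } finally {
            lock.writeLock().unlock();
        }
    }
}
```

In this sketch, a report that arrives after submitServerCrash has returned is 
rejected, while reports from a live server never wait on each other.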

> Ignore the reportRegionStateTransition call from a dead server
> --------------------------------------------------------------
>
>                 Key: HBASE-21508
>                 URL: https://issues.apache.org/jira/browse/HBASE-21508
>             Project: HBase
>          Issue Type: Sub-task
>          Components: amv2
>            Reporter: Duo Zhang
>            Assignee: Duo Zhang
>            Priority: Major
>             Fix For: 3.0.0, 2.2.0
>
>         Attachments: HBASE-21508-v1.patch, HBASE-21508-v2.patch, 
> HBASE-21508.patch
>
>
> In our ITBLL test we observed a race between the SCP and TRSP which caused a 
> region to be assigned to two region servers.
> I do not fully understand the scenario, but in any case the old code does not 
> consider the situation where, after SCP gets the region list of a dead 
> server, there could still be a reportRegionStateTransition call from the dead 
> server that messes things up.
> In general, I think we should have a fence in the AssignmentManager to 
> prevent reportRegionStateTransition calls from dead servers from messing up 
> the states.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
