[ 
https://issues.apache.org/jira/browse/HBASE-21384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang updated HBASE-21384:
------------------------------
    Fix Version/s: 2.0.3
                   2.1.1
                   2.2.0
                   3.0.0

> Procedure with holdlock=false should not be restored lock when restarts 
> ------------------------------------------------------------------------
>
>                 Key: HBASE-21384
>                 URL: https://issues.apache.org/jira/browse/HBASE-21384
>             Project: HBase
>          Issue Type: Sub-task
>            Reporter: Allan Yang
>            Assignee: Allan Yang
>            Priority: Major
>             Fix For: 3.0.0, 2.2.0, 2.1.1, 2.0.3
>
>         Attachments: HBASE-21384.branch-2.0.001.patch, 
> HBASE-21384.branch-2.0.002.patch
>
>
> Yet another stuck-procedure case, similar to HBASE-21364.
> The scenario is:
> 1. A ModifyProcedure spawned a ReopenTableProcedure and, since its 
> holdLock=false, released the lock.
> 2. The ReopenTableProcedure spawned several MoveRegionProcedures. It also has 
> holdLock=false, but the master was killed just after it stored the child 
> procedures to the WAL and began to release the lock.
> 3. On restart, the ReopenTableProcedure's lock was restored because it had 
> held the lock before. This is not right, since it is now in WAITING state 
> and its holdLock=false.
> 4. After the restart, a MoveRegionProcedure can execute since its parent has 
> the lock, but the AssignProcedure it spawns can never execute: its parent 
> does not have the lock, only its 'grandpa', the ReopenTableProcedure, does.
> 5. Restarting the master again does not help; the procedures stay stuck 
> because the lock is restored for the ReopenTableProcedure again.
> Two fixes:
> 1. We should not restore the lock if the procedure has holdLock=false and is 
> in WAITING state.
> 2. A procedure that does not have the lock itself but whose parent does 
> should also be put at the front of the queue, as an addendum to HBASE-21364.
> Discussion:
>  Should we check the locks of all ancestors, not only the parent? As addressed 
> in the comments of the patch, once the issue above is fixed, checking the 
> parent is currently enough.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
