Allan Yang created HBASE-21384:
----------------------------------

             Summary: Procedure with holdlock=false should not be restored lock when restarts
                 Key: HBASE-21384
                 URL: https://issues.apache.org/jira/browse/HBASE-21384
             Project: HBase
          Issue Type: Sub-task
            Reporter: Allan Yang
            Assignee: Allan Yang


Yet another case of a stuck procedure, similar to HBASE-21364.
The scenario is:
1. A ModifyProcedure spawned a ReopenTableProcedure and, since its 
holdLock=false, released the lock.
2. The ReopenTableProcedure spawned several MoveRegionProcedures. It also has 
holdLock=false, but just after it stored the child procedures to the WAL and 
began to release the lock, the master was killed.
3. On restart, the ReopenTableProcedure's lock was restored because it was 
holding the lock before the crash. This is wrong, since it is now in the 
WAITING state and its holdLock=false.
4. After the restart, a MoveRegionProcedure can execute since its parent has 
the lock. But when it spawns an AssignProcedure, the AssignProcedure can never 
execute: its parent does not have the lock, while its 'grandpa', the 
ReopenTableProcedure, does.
5. Restarting the master again does not help. The procedure stays stuck, 
because the lock is restored for the ReopenTableProcedure every time.
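
The steps above can be modelled with a small Java sketch. It is a simplified 
illustration, not HBase's scheduler code: the Proc and TableLock classes and 
the canRun rule (a procedure may run if the lock is free, or held by itself or 
by its direct parent) are assumptions made only to show why the grandchild 
gets stuck.

```java
class StuckScenarioSketch {
    // Hypothetical, minimal procedure record: a name and a parent link.
    static final class Proc {
        final String name;
        final Proc parent; // null for a root procedure

        Proc(String name, Proc parent) {
            this.name = name;
            this.parent = parent;
        }
    }

    // Hypothetical exclusive lock with a parent-only ownership check.
    static final class TableLock {
        Proc owner; // null when the lock is free

        boolean canRun(Proc p) {
            // Allowed if the lock is free, held by p, or held by p's direct parent.
            return owner == null || owner == p || owner == p.parent;
        }
    }

    public static void main(String[] args) {
        Proc reopen = new Proc("ReopenTableProcedure", null);
        Proc move = new Proc("MoveRegionProcedure", reopen);
        Proc assign = new Proc("AssignProcedure", move);

        TableLock lock = new TableLock();
        lock.owner = reopen; // step 3: lock wrongly restored after the restart

        // Step 4: the child passes the check, the grandchild does not.
        System.out.println(move.name + " can run: " + lock.canRun(move));
        System.out.println(assign.name + " can run: " + lock.canRun(assign));
    }
}
```

In this model the AssignProcedure can only proceed once the lock's owner is 
cleared, which never happens while the lock is restored on each restart.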

Two fixes:
1. We should not restore the lock if the procedure has holdLock=false and is 
in the WAITING state.
2. A procedure that does not hold the lock itself, but whose parent does, 
should also be put at the front of the queue, as an addendum to HBASE-21364.
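
A minimal Java sketch of the two fixes follows. It is not the patch itself: 
the Proc class, its fields (lockedWhenPersisted, hasLock), and the 
shouldRestoreLock and enqueue helpers are hypothetical names used only to 
illustrate the two rules.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class LockRestoreSketch {
    enum State { RUNNABLE, WAITING }

    // Hypothetical, simplified procedure record; not HBase's Procedure class.
    static final class Proc {
        final String name;
        final Proc parent;           // null for a root procedure
        final boolean holdLock;      // true: keeps the lock for its whole lifetime
        State state = State.RUNNABLE;
        boolean lockedWhenPersisted; // lock flag as read back after a restart
        boolean hasLock;             // lock actually held in memory

        Proc(String name, Proc parent, boolean holdLock) {
            this.name = name;
            this.parent = parent;
            this.holdLock = holdLock;
        }
    }

    // Fix 1: skip the restore for a holdLock=false procedure that is WAITING.
    static boolean shouldRestoreLock(Proc p) {
        if (!p.lockedWhenPersisted) {
            return false;
        }
        return p.holdLock || p.state != State.WAITING;
    }

    // Fix 2: a procedure that holds the lock, or whose parent holds it,
    // goes to the front of the queue; everything else goes to the back.
    static void enqueue(Deque<Proc> queue, Proc p) {
        boolean parentHasLock = p.parent != null && p.parent.hasLock;
        if (p.hasLock || parentHasLock) {
            queue.addFirst(p);
        } else {
            queue.addLast(p);
        }
    }

    public static void main(String[] args) {
        Proc reopen = new Proc("ReopenTableProcedure", null, false);
        reopen.state = State.WAITING;
        reopen.lockedWhenPersisted = true;
        System.out.println("restore lock: " + shouldRestoreLock(reopen));

        Proc parent = new Proc("parent", null, true);
        parent.hasLock = true;
        Deque<Proc> queue = new ArrayDeque<>();
        enqueue(queue, new Proc("unrelated", null, false));
        enqueue(queue, new Proc("child", parent, false));
        System.out.println("head of queue: " + queue.peekFirst().name);
    }
}
```

With the first rule in place, the ReopenTableProcedure from the scenario above 
comes back from a restart without the lock, so its descendants are no longer 
blocked behind it.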

Discussion:
 Should we check the lock of all ancestors, not only the parent? As addressed 
in the comments of the patch, once the issue above is fixed, checking the 
parent is enough.




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
