Allan Yang created HBASE-21384:
----------------------------------
Summary: Procedure with holdlock=false should not be restored lock
when restarts
Key: HBASE-21384
URL: https://issues.apache.org/jira/browse/HBASE-21384
Project: HBase
Issue Type: Sub-task
Reporter: Allan Yang
Assignee: Allan Yang
Yet another stuck-procedure case, similar to HBASE-21364.
The case is as follows:
1. A ModifyProcedure spawned a ReopenTableProcedure and, since its
holdLock=false, it released the lock.
2. The ReopenTableProcedure, which also has holdLock=false, spawned several
MoveRegionProcedures. Just after it stored the child procedures to the WAL and
began to release the lock, the master was killed.
3. On restart, the ReopenTableProcedure's lock was restored because it had been
holding the lock before. This is not right, since it is now in WAITING state
and its holdLock=false.
4. After the restart, the MoveRegionProcedure could execute since its parent
had the lock. But when it spawned an AssignProcedure, the AssignProcedure could
not execute, since its parent did not have the lock; its 'grandpa', the
ReopenTableProcedure, did.
5. Restarting the master does not help; the procedure stays stuck, because we
will restore the lock for the ReopenTableProcedure again.
Two fixes:
1. We should not restore the lock if the procedure does not hold the lock
(holdLock=false) and is in WAITING state.
2. A procedure that does not have the lock, but whose parent has it, should
also be put at the front of the queue, as an addendum to HBASE-21364.
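The first fix can be pictured with a minimal sketch. This is not the patch
itself: LockRestoreSketch, ProcRecord, and lockedBeforeRestart are hypothetical
names invented for illustration, and the real HBase procedure classes are not
used. It only assumes that, on restart, each procedure is known by its state,
its holdLock flag, and whether it was recorded as holding the lock.

```java
// Hypothetical model of the restore decision; not HBase's actual classes.
final class LockRestoreSketch {
    enum State { RUNNABLE, WAITING }

    // Stand-in for what is known about a procedure when it is reloaded.
    static final class ProcRecord {
        final String name;
        final State state;
        final boolean holdLock;            // the procedure's holdLock setting
        final boolean lockedBeforeRestart; // assumed flag: lock was recorded as held

        ProcRecord(String name, State state, boolean holdLock,
                   boolean lockedBeforeRestart) {
            this.name = name;
            this.state = state;
            this.holdLock = holdLock;
            this.lockedBeforeRestart = lockedBeforeRestart;
        }
    }

    // Fix 1: skip the restore when holdLock=false and the state is WAITING.
    static boolean shouldRestoreLock(ProcRecord p) {
        if (!p.lockedBeforeRestart) {
            return false; // nothing to restore
        }
        boolean waitingWithoutHoldLock = !p.holdLock && p.state == State.WAITING;
        return !waitingWithoutHoldLock;
    }

    public static void main(String[] args) {
        // The ReopenTableProcedure from step 3: WAITING, holdLock=false.
        ProcRecord reopen =
            new ProcRecord("ReopenTableProcedure", State.WAITING, false, true);
        // A made-up procedure with holdLock=true, for contrast.
        ProcRecord holder =
            new ProcRecord("HoldLockProcedure", State.WAITING, true, true);
        System.out.println(reopen.name + " restore=" + shouldRestoreLock(reopen));
        System.out.println(holder.name + " restore=" + shouldRestoreLock(holder));
    }
}
```

Under this rule the ReopenTableProcedure from step 3 would come back without
the lock, so the situation in step 4 would not arise.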
Discussion:
Should we check the locks of all ancestors, not only the parent? As addressed
in the comments of the patch, currently, once the issue above is fixed,
checking the parent is enough.
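To make the question concrete, here is a simplified model of the two checks.
It assumes a single exclusive lock identified by the id of its owner, which is
a deliberate simplification; ParentLockCheckSketch, Proc, and both method names
are hypothetical and do not correspond to HBase APIs.

```java
// Simplified model: one exclusive lock, owned by a procedure id or free (null).
final class ParentLockCheckSketch {
    static final class Proc {
        final long id;
        final Proc parent; // null for a root procedure

        Proc(long id, Proc parent) {
            this.id = id;
            this.parent = parent;
        }
    }

    // Parent-only check: runnable if the lock is free, or is owned by the
    // procedure itself or by its direct parent.
    static boolean canRunParentOnly(Proc p, Long lockOwner) {
        if (lockOwner == null) {
            return true;
        }
        long owner = lockOwner.longValue();
        return owner == p.id || (p.parent != null && owner == p.parent.id);
    }

    // Ancestor check: also accepts a lock owned by any ancestor.
    static boolean canRunAnyAncestor(Proc p, Long lockOwner) {
        if (lockOwner == null) {
            return true;
        }
        long owner = lockOwner.longValue();
        for (Proc cur = p; cur != null; cur = cur.parent) {
            if (cur.id == owner) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Proc reopen = new Proc(2, null);   // ReopenTableProcedure
        Proc move = new Proc(3, reopen);   // MoveRegionProcedure
        Proc assign = new Proc(4, move);   // AssignProcedure

        Long staleOwner = 2L; // lock wrongly restored to the 'grandpa'
        System.out.println("move, parent-only:   " + canRunParentOnly(move, staleOwner));
        System.out.println("assign, parent-only: " + canRunParentOnly(assign, staleOwner));
        System.out.println("assign, ancestors:   " + canRunAnyAncestor(assign, staleOwner));

        Long noOwner = null; // after fix 1 the stale lock is not restored
        System.out.println("assign, parent-only, lock free: "
            + canRunParentOnly(assign, noOwner));
    }
}
```

In this model the ancestor check only matters while the lock is wrongly held
by the 'grandpa'. Once that lock is no longer restored, the parent-only check
lets the AssignProcedure proceed.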
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)