[
https://issues.apache.org/jira/browse/HBASE-21364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Allan Yang resolved HBASE-21364.
--------------------------------
Resolution: Fixed
Fix Version/s: 2.2.0, 3.0.0
> Procedure holds the lock should put to front of the queue after restart
> -----------------------------------------------------------------------
>
> Key: HBASE-21364
> URL: https://issues.apache.org/jira/browse/HBASE-21364
> Project: HBase
> Issue Type: Sub-task
> Affects Versions: 2.1.0, 2.0.2
> Reporter: Allan Yang
> Assignee: Allan Yang
> Priority: Blocker
> Fix For: 3.0.0, 2.2.0, 2.1.1, 2.0.3
>
> Attachments: HBASE-21364.branch-2.0.001.patch,
> HBASE-21364.branch-2.0.002.patch
>
>
> After restoring the procedures from the Procedure WALs, we put the runnable
> procedures back into the queue to execute. The order was not a problem before
> HBASE-20846, since the first procedure to execute would acquire the lock
> itself. But since HBASE-20846 the locks are restored as well. If we execute a
> procedure without the lock before a procedure that holds the lock in the same
> queue, there is a race condition in which none of the procedures in that
> queue may ever be executed.
> The race condition is:
> 1. A procedure that needs the table's exclusive lock is put into the
> table's queue, but the table's shared lock is held by a Region Procedure.
> Since no one holds the exclusive lock, the queue is added to the run queue to
> execute. But soon the worker thread sees that the procedure can't execute
> because it doesn't hold the lock, so it stops executing and removes the queue
> from the run queue.
> 2. At the same time, the Region Procedure, which holds the table's shared lock
> and the region's exclusive lock, is put into the table's queue. But since the
> queue has already been added to the run queue, it is not added again.
> 3. As a result of step 1, the table's queue is removed from the run queue.
> 4. After that, no one puts the table's queue back, so no worker will execute
> the procedures inside it.
> A test case in the patch shows how.
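
To illustrate the ordering named in the issue title, here is a minimal sketch of rebuilding a table queue after restart so that procedures whose lock was restored are polled first. It is not the code from the attached patch: RestoreOrderSketch, Proc, holdsLock, rebuildQueue and pollOrder are hypothetical names made up for this example, and a plain ArrayDeque stands in for the scheduler's real queue.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

/** Hypothetical sketch; these are not HBase's scheduler classes. */
class RestoreOrderSketch {

    /** Minimal stand-in for a procedure restored from the Procedure WALs. */
    static final class Proc {
        final String name;
        final boolean holdsLock; // assumption: true if its lock was restored with it

        Proc(String name, boolean holdsLock) {
            this.name = name;
            this.holdsLock = holdsLock;
        }
    }

    /**
     * Rebuilds one queue after restart. Lock holders are enqueued first, so a
     * worker polls them before any procedure that still has to acquire a lock.
     * Relative order within each group is preserved.
     */
    static Deque<Proc> rebuildQueue(List<Proc> restored) {
        Deque<Proc> queue = new ArrayDeque<>();
        for (Proc p : restored) {
            if (p.holdsLock) {
                queue.addLast(p);
            }
        }
        for (Proc p : restored) {
            if (!p.holdsLock) {
                queue.addLast(p);
            }
        }
        return queue;
    }

    /** Returns the procedure names in the order a worker would poll them. */
    static List<String> pollOrder(Deque<Proc> queue) {
        List<String> names = new ArrayList<>();
        for (Proc p : queue) { // ArrayDeque iterates from first to last
            names.add(p.name);
        }
        return names;
    }

    public static void main(String[] args) {
        // Restored in the problematic order: the waiter comes before the holder.
        List<Proc> restored = Arrays.asList(
            new Proc("NeedsExclusiveTableLock", false),
            new Proc("RegionProcHoldingSharedLock", true));
        System.out.println(pollOrder(rebuildQueue(restored)));
    }
}
```

With this ordering the worker reaches the Region Procedure first, so the queue is not dropped from the run queue merely because its head cannot take the lock. The sketch covers only the ordering, not the run-queue bookkeeping described in steps 1 to 4.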