[
https://issues.apache.org/jira/browse/HBASE-16233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15378745#comment-15378745
]
Stephen Yuan Jiang commented on HBASE-16233:
--------------------------------------------
Several approaches were discussed with [~mbertozzi]:
- Solution 1: as long as isSingleSharedLock() is false, we don't call zk to
acquire the shared lock, because one shared lock in zk is good enough; the same
applies to releasing the lock: if isSingleSharedLock() is false, there is no
need to call zk to release the lock.
Solution 1 looks like a hack, but it should work and the change is simple. We
do need to move the expensive zk lock acquire/release inside the
synchronization block. Note: we already use isSingleSharedLock() to decide the
'reset' parameter of release lock. I think there is a bug in that the zk lock
acquire/release is not inside the synchronization block, because
isSingleSharedLock() could change.
- Solution 2: make a 'private HashMap<procedureId, TableLock> tableLock' to
replace 'private TableLock tableLock', so that we track all locks. The drawback
is that when an exclusive lock is used, there is a little more overhead from
the hash table lookup.
Solution 2 is more robust, as we create multiple shared lock znodes to track
each procedure. But it is a little complicated, and there is really no need to
over-complicate a part of the code that might not exist in the long term.
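To make the bookkeeping in Solution 2 concrete, here is a minimal sketch, not the actual HBase code. The LockHandle and PerProcedureLocks classes are hypothetical stand-ins (LockHandle plays the role of a TableLock backed by its own znode), and the map is keyed by procedure id as proposed above.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for a TableLock handle; one per procedure. */
class LockHandle {
  final long procId;
  boolean released = false;

  LockHandle(long procId) { this.procId = procId; }

  void release() { released = true; }
}

/** Hypothetical tracker: one lock handle per procedure id. */
class PerProcedureLocks {
  private final Map<Long, LockHandle> tableLocks = new HashMap<>();

  /** Records a new handle for the procedure; a later acquire never overwrites it. */
  synchronized LockHandle acquire(long procId) {
    if (tableLocks.containsKey(procId)) {
      throw new IllegalStateException("procedure " + procId + " already holds a lock");
    }
    LockHandle handle = new LockHandle(procId);
    tableLocks.put(procId, handle);
    return handle;
  }

  /** Releases exactly the handle owned by the procedure; false if none is tracked. */
  synchronized boolean release(long procId) {
    LockHandle handle = tableLocks.remove(procId);
    if (handle == null) {
      return false;
    }
    handle.release();
    return true;
  }

  synchronized int size() { return tableLocks.size(); }
}
```

Each release finds the handle created by the matching acquire, at the cost of one map lookup per operation, which is the overhead mentioned above.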
According to [~mbertozzi], [~Abby] is looking into removing the zklock in Apache
HBase 2.0. He would open a JIRA for the work.
In the meantime, the V1 patch uses Solution 1.
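The idea behind Solution 1 can be sketched roughly as follows. This is an illustration under assumptions, not the V1 patch itself: FakeZkLock and SharedLockQueue are hypothetical names, FakeZkLock only counts calls instead of talking to ZooKeeper, and a holder counter stands in for the isSingleSharedLock() check.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for the zk-backed table lock; it only counts calls. */
class FakeZkLock {
  final AtomicInteger acquireCalls = new AtomicInteger();
  final AtomicInteger releaseCalls = new AtomicInteger();

  void acquire() { acquireCalls.incrementAndGet(); }

  void release() { releaseCalls.incrementAndGet(); }
}

/** Hypothetical queue: only the first holder acquires in zk, only the last releases. */
class SharedLockQueue {
  private final FakeZkLock zkLock;
  private int sharedHolders = 0;

  SharedLockQueue(FakeZkLock zkLock) { this.zkLock = zkLock; }

  /** The zk call sits inside the synchronized method, so the count cannot change under it. */
  synchronized boolean tryAcquireShared() {
    if (sharedHolders == 0) {
      zkLock.acquire(); // first shared holder: take the single zk lock
    }
    sharedHolders++;
    return true;
  }

  synchronized void releaseShared() {
    if (sharedHolders == 0) {
      throw new IllegalStateException("no shared lock is held");
    }
    sharedHolders--;
    if (sharedHolders == 0) {
      zkLock.release(); // last shared holder: drop the single zk lock
    }
  }

  synchronized int holders() { return sharedHolders; }
}
```

Because the zk call and the counter update happen under the same monitor, two concurrent shared requests cannot both decide that they are the first or the last holder.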
> Procedure V2: Support acquire/release shared table lock concurrently
> --------------------------------------------------------------------
>
> Key: HBASE-16233
> URL: https://issues.apache.org/jira/browse/HBASE-16233
> Project: HBase
> Issue Type: Sub-task
> Components: proc-v2
> Reporter: Stephen Yuan Jiang
> Assignee: Stephen Yuan Jiang
> Fix For: 2.0.0
>
> Attachments: HBASE-16233.v1-master.patch
>
>
> The {{MasterProcedureScheduler.TableQueue}} class has only a single instance of
> TableLock ({{private TableLock tableLock = null;}}) to track the
> exclusive/shared table lock from TableLockManager.
> When multiple shared lock requests come in, a later request overwrites the
> lock acquired by an earlier request, and hence we get an unexpected error
> when the second or later release lock request comes, because we have lost
> track of the lock.
> The issue can be reproduced in the unit test of HBASE-14552. [~mbertozzi]
> also came up with a UT that reproduces the problem without using any real
> procedure:
> {code}
> @Test
> public void testSchedWithZkLock() throws Exception {
>   // Start a mini ZooKeeper cluster and point the configuration at it.
>   MiniZooKeeperCluster zkCluster = new MiniZooKeeperCluster(conf);
>   int zkPort = zkCluster.startup(new File("/tmp/test-zk"));
>   Thread.sleep(10000);
>   conf.set("hbase.zookeeper.quorum", "localhost:" + zkPort);
>   ZooKeeperWatcher zkw = new ZooKeeperWatcher(conf, "testSchedWithZkLock",
>       null, false);
>   queue = new MasterProcedureScheduler(conf,
>       TableLockManager.createTableLockManager(
>           conf, zkw, ServerName.valueOf("localhost", 12345, 1)));
>   final TableName tableName = TableName.valueOf("testtb");
>   // Two READ procedures on the same table, so both request a shared lock.
>   TestTableProcedure procA = new TestTableProcedure(1, tableName,
>       TableProcedureInterface.TableOperationType.READ);
>   TestTableProcedure procB = new TestTableProcedure(2, tableName,
>       TableProcedureInterface.TableOperationType.READ);
>   assertTrue(queue.tryAcquireTableSharedLock(procA, tableName));
>   assertTrue(queue.tryAcquireTableSharedLock(procB, tableName));
>   // The second acquire overwrote the tracked lock, so the releases go wrong.
>   queue.releaseTableSharedLock(procA, tableName);
>   queue.releaseTableSharedLock(procB, tableName);
> }
> {code}