[ 
https://issues.apache.org/jira/browse/HBASE-16233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15378745#comment-15378745
 ] 

Stephen Yuan Jiang commented on HBASE-16233:
--------------------------------------------

Several approaches were discussed with [~mbertozzi]:

- Solution 1: as long as isSingleSharedLock() is false, we don't call zk to 
acquire the shared lock, because one shared lock in zk is good enough; the same 
applies to release: if isSingleSharedLock() is false, there is no need to call 
zk to release the lock.
Solution 1 looks like a hack, but it should work and the change is simple (we 
do need to move the expensive zk lock acquire/release inside the 
synchronization block).  Note: we already use isSingleSharedLock() to decide 
the 'reset' parameter of release lock.  I think it is a bug that the zk lock 
acquire/release is not inside the synchronization block, because 
isSingleSharedLock() could change.

- Solution 2: replace 'private TableLock tableLock' with 
'private HashMap<procedureId, TableLock> tableLock', so that we track all 
locks.  The drawback is that the exclusive lock path pays a little more 
overhead for the hash table lookup.
Solution 2 is more robust, as we create multiple shared lock znodes to track 
each procedure.  But it is a little complicated, and there is really no need 
to over-complicate a part of the code that might not exist in the long term.  
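
For comparison, here is a rough sketch of what the per-procedure tracking in 
Solution 2 could look like.  This is not code from any patch: LockHandle, 
LockFactory and PerProcedureLockSketch are hypothetical stand-ins for the 
zk-backed TableLock and the TableQueue bookkeeping, and the procedure ID is 
assumed to be a long.

```java
import java.util.HashMap;
import java.util.Map;

class PerProcedureLockSketch {
    /** Hypothetical stand-in for one zk-backed shared lock (one znode). */
    interface LockHandle {
        void release();
    }

    /** Hypothetical factory that acquires a new shared lock for a procedure. */
    interface LockFactory {
        LockHandle acquireShared(long procId);
    }

    private final LockFactory factory;
    // One tracked lock per procedure, instead of a single tableLock field.
    private final Map<Long, LockHandle> locksByProcId = new HashMap<>();

    PerProcedureLockSketch(LockFactory factory) {
        this.factory = factory;
    }

    /** Acquires a shared lock for procId; returns false if it already holds one. */
    synchronized boolean tryAcquireShared(long procId) {
        if (locksByProcId.containsKey(procId)) {
            return false;
        }
        locksByProcId.put(procId, factory.acquireShared(procId));
        return true;
    }

    /** Releases exactly the lock that procId acquired earlier. */
    synchronized void releaseShared(long procId) {
        LockHandle handle = locksByProcId.remove(procId);
        if (handle == null) {
            throw new IllegalStateException("procedure " + procId + " holds no lock");
        }
        handle.release();
    }

    synchronized int heldCount() {
        return locksByProcId.size();
    }

    public static void main(String[] args) {
        PerProcedureLockSketch locks = new PerProcedureLockSketch(
            procId -> () -> System.out.println("released lock of procedure " + procId));
        locks.tryAcquireShared(1L); // procA
        locks.tryAcquireShared(2L); // procB
        locks.releaseShared(1L);
        locks.releaseShared(2L);
    }
}
```

Because each procedure gets its own entry, a later acquire can no longer 
overwrite an earlier one; the cost is the extra map lookup on every 
acquire/release.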

According to [~mbertozzi], [~Abby] is looking into removing the zklock in 
Apache HBase 2.0.  He would open a JIRA for the work.  

In the meantime, the V1 patch uses Solution 1.
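
To illustrate the idea behind Solution 1 (this is only a sketch, not the 
patch itself): only the first shared holder calls zk to acquire and only the 
last one calls zk to release, with both calls made inside the synchronized 
block.  RemoteLock and SingleZkSharedLockSketch are hypothetical names, and a 
plain counter stands in for the isSingleSharedLock() check.

```java
class SingleZkSharedLockSketch {
    /** Hypothetical stand-in for the zk-backed shared table lock. */
    interface RemoteLock {
        void acquire();
        void release();
    }

    private final RemoteLock zkLock;
    private int sharedHolders = 0; // number of procedures holding the shared lock

    SingleZkSharedLockSketch(RemoteLock zkLock) {
        this.zkLock = zkLock;
    }

    /** Only the first shared holder goes to zk; later holders reuse that lock. */
    synchronized void acquireShared() {
        if (sharedHolders == 0) {
            zkLock.acquire(); // expensive call, deliberately inside the synchronized block
        }
        sharedHolders++;
    }

    /** Only the last shared holder releases the zk lock. */
    synchronized void releaseShared() {
        if (sharedHolders == 0) {
            throw new IllegalStateException("no shared lock held");
        }
        sharedHolders--;
        if (sharedHolders == 0) {
            zkLock.release();
        }
    }

    public static void main(String[] args) {
        final int[] zkCalls = new int[2]; // [0] = acquires, [1] = releases
        RemoteLock zk = new RemoteLock() {
            @Override public void acquire() { zkCalls[0]++; }
            @Override public void release() { zkCalls[1]++; }
        };
        SingleZkSharedLockSketch lock = new SingleZkSharedLockSketch(zk);
        lock.acquireShared(); // procA
        lock.acquireShared(); // procB
        lock.releaseShared(); // procA
        lock.releaseShared(); // procB
        System.out.println("zk acquires=" + zkCalls[0] + ", zk releases=" + zkCalls[1]);
    }
}
```

Keeping the counter check and the zk call under the same monitor is what 
avoids the race mentioned above, where the holder count changes between the 
check and the zk call.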

> Procedure V2: Support acquire/release shared table lock concurrently
> --------------------------------------------------------------------
>
>                 Key: HBASE-16233
>                 URL: https://issues.apache.org/jira/browse/HBASE-16233
>             Project: HBase
>          Issue Type: Sub-task
>          Components: proc-v2
>            Reporter: Stephen Yuan Jiang
>            Assignee: Stephen Yuan Jiang
>             Fix For: 2.0.0
>
>         Attachments: HBASE-16233.v1-master.patch
>
>
> The {{MasterProcedureScheduler.TableQueue}} class has only a single instance 
> of TableLock ({{private TableLock tableLock = null;}}) to track the 
> exclusive/shared table lock from TableLockManager.  
> When multiple shared lock requests come in, a later shared lock request 
> overwrites the lock acquired by an earlier shared lock request, and hence we 
> get an unexpected error when the second or later release lock request comes, 
> because we have lost track of the lock.
> The issue can be reproduced in the unit test of HBASE-14552.  [~mbertozzi] 
> also came up with a UT that reproduces the problem without using any real 
> procedure:
> {code}
> @Test
>   public void testSchedWithZkLock() throws Exception {
>     MiniZooKeeperCluster zkCluster = new MiniZooKeeperCluster(conf);
>     int zkPort = zkCluster.startup(new File("/tmp/test-zk"));
>     Thread.sleep(10000);
>     conf.set("hbase.zookeeper.quorum", "localhost:" + zkPort);
>     ZooKeeperWatcher zkw = new ZooKeeperWatcher(conf, "testSchedWithZkLock", 
> null, false);
>     queue = new MasterProcedureScheduler(conf,
>       TableLockManager.createTableLockManager(
>         conf, zkw, ServerName.valueOf("localhost", 12345, 1)));
>     final TableName tableName = TableName.valueOf("testtb");
>     TestTableProcedure procA = new TestTableProcedure(1, tableName,
>           TableProcedureInterface.TableOperationType.READ);
>     TestTableProcedure procB = new TestTableProcedure(2, tableName,
>           TableProcedureInterface.TableOperationType.READ);
>     assertTrue(queue.tryAcquireTableSharedLock(procA, tableName));
>     assertTrue(queue.tryAcquireTableSharedLock(procB, tableName));
>     queue.releaseTableSharedLock(procA, tableName);
>     queue.releaseTableSharedLock(procB, tableName);
>   }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
