Provide Client API to explicitly lock and unlock rows
-----------------------------------------------------

                 Key: HBASE-798
                 URL: https://issues.apache.org/jira/browse/HBASE-798
             Project: Hadoop HBase
          Issue Type: New Feature
          Components: client, ipc, regionserver
            Reporter: Jonathan Gray
            Priority: Minor


We need to be able to perform a series of reads from and writes to a single row 
without any potential interference from other clients.

Unfortunately this is a bit involved because normal reads currently do not 
acquire row locks, so it requires adding additional get/getRow calls that 
obtain and release a row lock.

In addition, there will be two additional client calls, lockRow/unlockRow, 
which actually acquire and release the locks.  Though each lock is associated 
with an HRegion, this will be tracked within the HRegionServer.  When a lock is 
acquired from the client, it is handled much like a Scanner.  We obtain the row 
lock from the HRegion, store the region name and lock identifier in a 
synchronized Map, and also obtain a lease to ensure that the lock will 
eventually be released even if the client dies.

This also required adding a RowLockListener (implements LeaseListener) private 
class in HRS to handle row lock lease expiration.

HRS.lockRow will return a long lockId (as openScanner does) that will be used 
in subsequent client calls to reuse this existing row lock.  These calls will 
check that the lock is valid and perform the operations without any locking  
(wrappers around get*, new versions of batchUpdate, openScanner, etc).

This is going to really add some noise to the list of available HTable/client 
methods so I'm not sure if it's something people would want to commit into a 
normal release.  Regardless this does provide some very convenient 
functionality that may be useful to others.

We are also looking into Clint Morgan's HBASE-669, but one major downside is 
that there is a significant amount of overhead involved.  This row locking is 
already built in and this will only extend the API to allow clients to work 
with them directly.  There is little to no overhead at all.  The only (obvious) 
performance consideration is that this should only used where necessary as rows 
will not be locked and unlocked as quickly with round-trip client calls.  In 
our design, we will have specific notes in our schema about which tables (or 
even which families or columns) must be accessed with row locks at all times 
and which do not.

This is my first attempt at adding any additional functionality, so comments, 
criticism, code reviews are encouraged.

I should have a patch up tomorrow.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

Reply via email to