Provide Client API to explicitly lock and unlock rows
-----------------------------------------------------
Key: HBASE-798
URL: https://issues.apache.org/jira/browse/HBASE-798
Project: Hadoop HBase
Issue Type: New Feature
Components: client, ipc, regionserver
Reporter: Jonathan Gray
Priority: Minor
We need to be able to perform a series of reads from and writes to a single row
without any potential interference from other clients.
Unfortunately, this is a bit involved: normal reads currently do not acquire
row locks, so it requires new get/getRow variants that obtain and release a
row lock.
In addition, there will be two new client calls, lockRow and unlockRow, which
actually acquire and release the locks. Though each lock is associated with an
HRegion, the locks will be tracked within the HRegionServer. When a client
acquires a lock, it is handled much like a Scanner: we obtain the row lock
from the HRegion, store the region name and lock identifier in a synchronized
Map, and obtain a lease to ensure that the lock will eventually be released
even if the client dies.
This also required adding a RowLockListener (implements LeaseListener) private
class in HRS to handle row lock lease expiration.
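To make the bookkeeping above concrete, here is a minimal, self-contained
sketch of how a server might track client-held row locks with a synchronized
Map and a lease per lock. It is not HBase code: RowLockRegistry, HeldLock, and
leaseExpired are hypothetical names, a Semaphore stands in for the HRegion row
lock, and a ScheduledExecutorService stands in for the lease mechanism and the
RowLockListener callback.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for the region server's row-lock bookkeeping.
class RowLockRegistry {
    // What the server remembers about one client-held lock.
    private static final class HeldLock {
        final String regionName;
        final String row;
        final Semaphore rowLock;           // models the HRegion row lock
        volatile ScheduledFuture<?> lease; // models the lease on the lock

        HeldLock(String regionName, String row, Semaphore rowLock) {
            this.regionName = regionName;
            this.row = row;
            this.rowLock = rowLock;
        }
    }

    // lockId -> held lock, kept in a synchronized Map as described above.
    private final Map<Long, HeldLock> held =
        Collections.synchronizedMap(new HashMap<>());
    // One semaphore per (region, row); it may be released by any thread,
    // which lets the lease thread free a lock the client never returned.
    private final ConcurrentHashMap<String, Semaphore> rowLocks =
        new ConcurrentHashMap<>();
    private final AtomicLong nextId = new AtomicLong(1);
    private final ScheduledExecutorService leases =
        Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "row-lock-leases");
            t.setDaemon(true); // do not keep the JVM alive for leases
            return t;
        });
    private final long leaseMillis;

    RowLockRegistry(long leaseMillis) {
        this.leaseMillis = leaseMillis;
    }

    // Blocks until the row is free, then records the lock and starts a lease.
    long lockRow(String regionName, String row) {
        Semaphore rowLock = rowLocks.computeIfAbsent(
            regionName + "/" + row, k -> new Semaphore(1));
        rowLock.acquireUninterruptibly();
        long lockId = nextId.getAndIncrement();
        HeldLock h = new HeldLock(regionName, row, rowLock);
        held.put(lockId, h);
        h.lease = leases.schedule(
            () -> leaseExpired(lockId), leaseMillis, TimeUnit.MILLISECONDS);
        return lockId;
    }

    // True only if lockId is still held and covers this region and row.
    boolean isValid(long lockId, String regionName, String row) {
        HeldLock h = held.get(lockId);
        return h != null && h.regionName.equals(regionName) && h.row.equals(row);
    }

    // Explicit release by the client: cancel the lease, free the row.
    void unlockRow(long lockId) {
        HeldLock h = held.remove(lockId);
        if (h == null) {
            throw new IllegalArgumentException("Unknown or expired lock " + lockId);
        }
        h.lease.cancel(false);
        h.rowLock.release();
    }

    // Lease callback: frees the row if the client never unlocked it.
    void leaseExpired(long lockId) {
        HeldLock h = held.remove(lockId);
        if (h != null) {
            h.rowLock.release();
        }
    }
}
```

Because removal from the synchronized Map is atomic, only one of unlockRow or
leaseExpired ever releases a given row, so a late unlock after expiration
fails cleanly instead of freeing a lock that another client now holds.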
HRS.lockRow will return a long lockId (as openScanner does) that will be used
in subsequent client calls to reuse this existing row lock. These calls will
check that the lock is valid and perform the operations without acquiring any
further locks (wrappers around get*, new versions of batchUpdate, openScanner,
etc.).
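From the client side, the intended pattern might look roughly like the sketch
below: acquire a lockId, pass it to each read and write, and always release it
in a finally block. The LockingTable interface and its method signatures are
assumptions made for illustration, not the API in the patch, and
InMemoryLockingTable is a single-process fake that rejects a contended lock
instead of blocking so that the example runs without a cluster.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical client-facing calls; names and signatures are illustrative.
interface LockingTable {
    long lockRow(String row);
    void unlockRow(long lockId);
    String get(String row, String column, long lockId);
    void put(String row, String column, String value, long lockId);
}

// Tiny in-memory fake standing in for a table backed by a region server.
class InMemoryLockingTable implements LockingTable {
    private final Map<Long, String> lockedRowById = new HashMap<>();
    private final Map<String, String> cells = new HashMap<>();
    private long nextId = 1;

    public synchronized long lockRow(String row) {
        // Simplification: a real server would block until the row is free.
        if (lockedRowById.containsValue(row)) {
            throw new IllegalStateException("Row already locked: " + row);
        }
        long id = nextId++;
        lockedRowById.put(id, row);
        return id;
    }

    public synchronized void unlockRow(long lockId) {
        if (lockedRowById.remove(lockId) == null) {
            throw new IllegalArgumentException("Unknown lock " + lockId);
        }
    }

    // The validity check every lock-aware call performs before doing work.
    private void checkLock(long lockId, String row) {
        if (!row.equals(lockedRowById.get(lockId))) {
            throw new IllegalArgumentException(
                "Lock " + lockId + " does not cover row " + row);
        }
    }

    public synchronized String get(String row, String column, long lockId) {
        checkLock(lockId, row);
        return cells.get(row + "/" + column);
    }

    public synchronized void put(String row, String column, String value,
                                 long lockId) {
        checkLock(lockId, row);
        cells.put(row + "/" + column, value);
    }
}

class CounterExample {
    // Read-modify-write on one row with no interference from other clients.
    static long increment(LockingTable table, String row, String column) {
        long lockId = table.lockRow(row);
        try {
            String current = table.get(row, column, lockId);
            long next = (current == null ? 0 : Long.parseLong(current)) + 1;
            table.put(row, column, Long.toString(next), lockId);
            return next;
        } finally {
            table.unlockRow(lockId); // release even if the update fails
        }
    }
}
```

Releasing in finally keeps the lock held only for the duration of the update;
the server-side lease remains the fallback for clients that die before
reaching it.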
This is going to add a fair amount of noise to the list of available
HTable/client methods, so I'm not sure it's something people would want to
commit into a normal release. Regardless, this does provide some very
convenient functionality that may be useful to others.
We are also looking into Clint Morgan's HBASE-669, but one major downside of
that approach is the significant overhead involved. Row locking is already
built in, and this change only extends the API to allow clients to work with
the locks directly, so there is little to no added overhead. The only
(obvious) performance consideration is that this should only be used where
necessary, as rows will not be locked and unlocked as quickly with round-trip
client calls. In our design, we will have specific notes in our schema about
which tables (or even which families or columns) must be accessed with row
locks at all times and which need not be.
This is my first attempt at adding any additional functionality, so comments,
criticism, and code reviews are encouraged.
I should have a patch up tomorrow.