Re: Problems when executing many (?) HTable.lockRow()

Ryan Rawson Thu, 14 May 2009 19:06:00 -0700

Given the non-core nature of this, I think the API should potentially
facilitate it, but the code should live in contrib.

On May 14, 2009 5:32 PM, "Guilherme Germoglio" <[email protected]> wrote:

On Thu, May 14, 2009 at 3:40 PM, stack <[email protected]> wrote: > No
consideration has been made f...
I think so.

If nothing is to be changed on RowLock class, we could use the following
approach:

Consider the line as it is today:

*RowLock lock = htable.lockRow(row);*

Instead of contacting the regionserver and requesting a lock for the given
row, *htable.lockRow(row)* would contact zk for a lock, following the lock
recipe <http://hadoop.apache.org/zookeeper/docs/current/recipes.html#sc_recipes_Locks> [1].
Notice that it will wait for the lock just as it does today; the only
difference is that regionserver resources (e.g., an RPC thread) aren't
used. After it receives the lock, htable would: (1) randomly generate a
lockid, (2) put an entry in a Map<lockid, zk node pathname>, (3) create a
RowLock using the lockid, and (4) return it.

From this point on, any operation could be performed without even passing
the rowlock as a parameter: zookeeper plus the implementation of the lock
recipe in htable now ensure that no other client is performing any
operation concurrently on the given row. [2]

Finally, *htable.unlockRow(lock)* must be invoked, which would make HTable
delete the matching zk node (remember the Map<lockid, zk node pathname>).
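
The lockRow/unlockRow bookkeeping described above could be sketched roughly
as follows. This is only an illustration under stated assumptions:
LockRecipe, InMemoryRecipe, RowLockHandle and ZkRowLockSketch are
hypothetical names, not HBase or ZooKeeper classes, and the in-memory recipe
merely stands in for a real implementation of the zk lock recipe.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashSet;
import java.util.Map;
import java.util.Random;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

class ZkRowLockSketch {
    /** Hypothetical stand-in for the zk lock recipe; not a real ZooKeeper API. */
    interface LockRecipe {
        /** Blocks until the lock is held; returns the lock node's pathname. */
        String acquire(byte[] row);
        /** Deletes the lock node, which releases the lock. */
        void release(String nodePath);
    }

    /** In-memory substitute so the sketch runs without a zk ensemble. */
    static final class InMemoryRecipe implements LockRecipe {
        private final Set<String> held = new HashSet<>();

        @Override
        public synchronized String acquire(byte[] row) {
            String path = "/hbase/locks/" + new String(row, StandardCharsets.UTF_8);
            while (held.contains(path)) {
                try {
                    wait(); // wait here, not on a regionserver RPC thread
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("interrupted while waiting", e);
                }
            }
            held.add(path);
            return path;
        }

        @Override
        public synchronized void release(String nodePath) {
            held.remove(nodePath);
            notifyAll();
        }

        synchronized boolean isHeld(String nodePath) {
            return held.contains(nodePath);
        }
    }

    /** Minimal stand-in for RowLock: it carries only the lockid. */
    static final class RowLockHandle {
        final long lockId;
        RowLockHandle(long lockId) { this.lockId = lockId; }
    }

    private final LockRecipe recipe;
    private final Map<Long, String> lockIdToNode = new ConcurrentHashMap<>();
    private final Random random = new Random();

    ZkRowLockSketch(LockRecipe recipe) { this.recipe = recipe; }

    RowLockHandle lockRow(byte[] row) {
        String nodePath = recipe.acquire(row);   // wait for the lock
        long lockId;
        do {
            lockId = random.nextLong();          // (1) random lockid
        } while (lockIdToNode.putIfAbsent(lockId, nodePath) != null); // (2) map entry
        return new RowLockHandle(lockId);        // (3) build the lock, (4) return it
    }

    void unlockRow(RowLockHandle lock) {
        String nodePath = lockIdToNode.remove(lock.lockId);
        if (nodePath == null) {
            throw new IllegalArgumentException("unknown lock id: " + lock.lockId);
        }
        recipe.release(nodePath);                // delete the matching node
    }

    public static void main(String[] args) {
        InMemoryRecipe recipe = new InMemoryRecipe();
        ZkRowLockSketch table = new ZkRowLockSketch(recipe);
        RowLockHandle lock = table.lockRow("row1".getBytes(StandardCharsets.UTF_8));
        System.out.println("held after lockRow: " + recipe.isHeld("/hbase/locks/row1"));
        table.unlockRow(lock);
        System.out.println("held after unlockRow: " + recipe.isHeld("/hbase/locks/row1"));
    }
}
```

The map is keyed by lockid so that unlockRow can find the node to delete
without the caller ever seeing a zk pathname.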

One good thing about this approach is that we don't have to worry about
lock leases: if the client dies, zk will notice at some point in the future
and release the lock. And if the client forgets to unlock the row, its code
is wrong. (:

However, if we are to redesign the API, I would propose write and read
locks. Then htable would have two methods, HTable.lockRowForRead(row) and
HTable.lockRowForWrite(row) [3], and the lock recipe to implement would be
the Shared Locks recipe
<http://hadoop.apache.org/zookeeper/docs/current/recipes.html#Shared+Locks>.
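
For reference, the decision rule of the Shared Locks recipe can be sketched
as below: a reader holds the lock when no "write-" node has a lower sequence
number than its own node, and a writer holds it when no node at all has a
lower sequence number. SharedLockCheck and its method names are hypothetical;
the sketch works on a plain list of child names and does not talk to zk.

```java
import java.util.Arrays;
import java.util.List;

class SharedLockCheck {
    /** Parses the sequence suffix of a sequential node name, e.g. "read-0000000003" gives 3. */
    static long sequenceOf(String child) {
        return Long.parseLong(child.substring(child.lastIndexOf('-') + 1));
    }

    /** Read lock: no "write-" child may have a lower sequence number than ours. */
    static boolean holdsReadLock(String mine, List<String> children) {
        long mySeq = sequenceOf(mine);
        for (String child : children) {
            if (child.startsWith("write-") && sequenceOf(child) < mySeq) {
                return false;
            }
        }
        return true;
    }

    /** Write lock: no child of any kind may have a lower sequence number than ours. */
    static boolean holdsWriteLock(String mine, List<String> children) {
        long mySeq = sequenceOf(mine);
        for (String child : children) {
            if (sequenceOf(child) < mySeq) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<String> children = Arrays.asList(
            "read-0000000001", "read-0000000002",
            "write-0000000003", "read-0000000004");
        System.out.println(holdsReadLock("read-0000000002", children));
        System.out.println(holdsWriteLock("write-0000000003", children));
        System.out.println(holdsReadLock("read-0000000004", children));
    }
}
```

In a real client, a node that does not yet hold the lock would set a watch
on the relevant lower-numbered node and re-check when it goes away.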

[1] We should carefully design how the lock node is created, given zk's
constraints on how many nodes it can manage in a single directory. Maybe
we should do something like:
/hbase/locks/table-name/hash-function(row)/row/{read-, write-}
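
A minimal sketch of building such a path, assuming a fixed number of hash
buckets per table so that no single directory grows too large. The class
name, the bucket count of 256 and the use of String.hashCode() are all
assumptions for illustration; real row keys are byte arrays and would need
an encoding that is safe for zk node names.

```java
class LockPathSketch {
    /** Assumed bucket count; it bounds the fan-out under each table directory. */
    static final int BUCKETS = 256;

    /**
     * Returns the prefix under which a sequential lock node would be created,
     * e.g. /hbase/locks/table-name/bucket/row/write-
     */
    static String lockNodePrefix(String table, String row, boolean write) {
        // floorMod keeps the bucket non-negative even for negative hash codes.
        int bucket = Math.floorMod(row.hashCode(), BUCKETS);
        return "/hbase/locks/" + table + "/" + bucket + "/" + row + "/"
                + (write ? "write-" : "read-");
    }

    public static void main(String[] args) {
        System.out.println(lockNodePrefix("table-name", "row", false));
        System.out.println(lockNodePrefix("table-name", "row", true));
    }
}
```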

[2] I agree that we are not protecting ourselves from a malicious client
using HTable, who could simply "forget" to request the lock for the given
row and then mess everything up. But this is how it is everywhere, isn't it?

[3] Suggest better method names, please!

> St.Ack > > > On Thu, May 14, 2009 at 9:44 AM, Guilherme Germoglio <
[email protected] > >wrote:...
--

Guilherme msn: [email protected] homepage:
http://germoglio.googlepages.com
