Hello, what is the current state of zk in hbase, and what will it be in version 0.20? I mean: are we, or will we be, able to run hbase without zk?
I ask because I think that making row locks perform well is essential for users, and maybe this approach will also make the code simpler. But I don't know what the drawbacks of creating such a dependency on zk would be. Thanks,

On Fri, May 15, 2009 at 2:40 AM, Nitay <[email protected]> wrote:
> I like this a lot Guilherme. Perhaps we should open a JIRA with them so we
> can track these great ideas.
>
> On Thu, May 14, 2009 at 7:05 PM, Ryan Rawson <[email protected]> wrote:
> > Given the non-core nature, I think the API should potentially facilitate
> > this, but the code should be contrib.
> >
> > On May 14, 2009 5:32 PM, "Guilherme Germoglio" <[email protected]> wrote:
> >
> > On Thu, May 14, 2009 at 3:40 PM, stack <[email protected]> wrote:
> > > No consideration has been made f...
> >
> > I think so.
> >
> > If nothing is to be changed on the RowLock class, we could use the
> > following approach.
> >
> > Consider the line as it is today:
> >
> > *RowLock lock = htable.lockRow(row);*
> >
> > Instead of contacting the regionserver and requesting a lock for the
> > given row, *htable.lockRow(row)* would contact zk for a lock, just as in
> > the lock recipe [1]
> > (http://hadoop.apache.org/zookeeper/docs/current/recipes.html#sc_recipes_Locks).
> > Notice that it will wait for the lock just as it does today; the only
> > difference is that regionserver resources (e.g., an RPC thread) aren't
> > used. After it receives the lock, htable would: (1) randomly generate a
> > lockid, (2) put an entry in a Map<lockid, zk node pathname>, (3) create a
> > RowLock using the lockid, and (4) return from the method.
> >
> > From this point on, any operation could be performed even without passing
> > the RowLock as a parameter: zookeeper plus the implementation of the lock
> > recipe in htable now ensure that no other client performs any operation
> > concurrently on the given row [2].
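The client-side bookkeeping in steps (1)-(4) above can be sketched in plain Java. Everything here is a hypothetical illustration, not HBase code: the class and method names are made up, and the lock-recipe decision is modeled purely over sibling znode names rather than live ZooKeeper calls.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

// Hypothetical sketch of a zk-backed lockRow()'s client-side state.
public class ZkRowLockSketch {

    // Step (2): lockid -> zk node pathname.
    private final Map<Long, String> lockIdToZnode = new HashMap<>();
    private final Random random = new Random();

    // Core decision of the zk lock recipe: after creating an ephemeral
    // sequential node, the client holds the lock iff its node has the
    // lowest sequence number among the lock directory's children.
    public static boolean holdsLock(String myNode, List<String> siblings) {
        List<String> sorted = new ArrayList<>(siblings);
        Collections.sort(sorted);
        return sorted.get(0).equals(myNode);
    }

    // Steps (1)-(3): generate a random lockid, remember which znode it
    // maps to, and hand the id back (it would be wrapped in a RowLock).
    public long registerLock(String znodePath) {
        long lockId = random.nextLong();
        lockIdToZnode.put(lockId, znodePath);
        return lockId;
    }

    // unlockRow(lock): look up the znode to delete and forget the lockid.
    public String releaseLock(long lockId) {
        return lockIdToZnode.remove(lockId);
    }
}
```

In a real implementation, holdsLock would be re-evaluated from a watch on the next-lower sequential node, as the recipe describes, so the client blocks without burning a regionserver RPC thread.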
> > Finally, *htable.unlockRow(lock)* must be invoked, which would make
> > HTable delete the matching zk node (remember the Map<lockid, zk node
> > pathname>).
> >
> > One good thing about this approach is that we don't have to worry about
> > lock leases: if the client dies, zk will notice at some point in the
> > future and release the lock. And if the client forgets to unlock the
> > row, its code is wrong. (:
> >
> > However, if we are to redesign the API, I would propose write and read
> > locks. HTable would then have two methods, HTable.lockRowForRead(row)
> > and HTable.lockRowForWrite(row) [3], and the lock recipe to implement
> > would be the Shared Locks recipe
> > (http://hadoop.apache.org/zookeeper/docs/current/recipes.html#Shared+Locks).
> >
> > [1] We should design carefully how the locknode is created, given zk's
> > constraints on how many nodes it can manage in a single directory. Maybe
> > we should do something like:
> > /hbase/locks/table-name/hash-function(row)/row/{read-, write-}
> >
> > [2] I agree that we are not protecting ourselves from a malicious client
> > using HTable, who could simply "forget" to request the lock for the
> > given row and then mess everything up. But this is how it is everywhere,
> > isn't it?
> >
> > [3] Suggest better method names, please!
> >
> > > St.Ack
> > > On Thu, May 14, 2009 at 9:44 AM, Guilherme Germoglio <
> > > [email protected]> wrote:...
> >
> > --
> > Guilherme
> > msn: [email protected]
> > homepage: http://germoglio.googlepages.com

--
Guilherme
msn: [email protected]
homepage: http://germoglio.googlepages.com
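The shared-locks variant and the bucketed locknode layout from footnote [1] can also be sketched. Again this is only an illustration under assumptions: the names (SharedLockSketch, lockDir, readGranted, writeGranted), the `read-`/`write-` node-name prefixes, and the bucket count are all hypothetical; the grant rules follow the Shared Locks recipe (a read lock waits only on earlier write nodes, a write lock waits on every earlier node).

```java
import java.util.List;

// Hypothetical sketch of shared row locks over zk sequential nodes.
public class SharedLockSketch {

    // Footnote [1]: hash rows into buckets so no single zk directory
    // has to hold every locknode, e.g. /hbase/locks/table/bucket/row.
    public static String lockDir(String table, String row, int buckets) {
        int bucket = Math.floorMod(row.hashCode(), buckets);
        return "/hbase/locks/" + table + "/" + bucket + "/" + row;
    }

    // Sequence number of a node named like "read-0000000002".
    static long seq(String node) {
        return Long.parseLong(node.substring(node.lastIndexOf('-') + 1));
    }

    // A read lock is granted once no *write* node has a lower sequence
    // number, so concurrent readers don't block each other.
    public static boolean readGranted(String myNode, List<String> siblings) {
        long mine = seq(myNode);
        for (String s : siblings) {
            if (s.startsWith("write-") && seq(s) < mine) return false;
        }
        return true;
    }

    // A write lock must have the lowest sequence number of all nodes.
    public static boolean writeGranted(String myNode, List<String> siblings) {
        long mine = seq(myNode);
        for (String s : siblings) {
            if (!s.equals(myNode) && seq(s) < mine) return false;
        }
        return true;
    }
}
```

As with the exclusive lock, a real client would set a watch on the closest blocking node and re-check when it goes away, and the ephemeral nodes give the lease-free cleanup described above.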
