Hello, what is the current state of zk in hbase, and what will it be in version 0.20? I mean: are we, or will we be, able to run hbase without zk?
I ask because I think that making row locks perform well is essential for users, and maybe this approach will also make the code simpler. But I don't know what the drawbacks of creating such a dependency on zk would be. Thanks,

On Fri, May 15, 2009 at 2:40 AM, Nitay <[email protected]> wrote:
> I like this a lot Guilherme. Perhaps we should open a JIRA with them so we
> can track these great ideas.
>
> On Thu, May 14, 2009 at 7:05 PM, Ryan Rawson <[email protected]> wrote:
> > Given the non-core nature, I think the API should potentially facilitate
> > this, but the code should be contrib.
> >
> > On May 14, 2009 5:32 PM, "Guilherme Germoglio" <[email protected]> wrote:
> >
> > On Thu, May 14, 2009 at 3:40 PM, stack <[email protected]> wrote:
> > > No consideration has been made f...
> >
> > I think so.
> >
> > If nothing is to be changed on the RowLock class, we could use the
> > following approach.
> >
> > Consider the line as it is today:
> >
> > *RowLock lock = htable.lockRow(row);*
> >
> > Instead of contacting the regionserver and requesting a lock for the
> > given row, *htable.lockRow(row)* would contact zk for a lock, just as in
> > the lock recipe [1]
> > (http://hadoop.apache.org/zookeeper/docs/current/recipes.html#sc_recipes_Locks).
> > Notice that it will wait for the lock just as it does today; the only
> > difference is that regionserver resources (e.g., an RPC thread) aren't
> > used. After it receives the lock, htable would: (1) randomly generate a
> > lockid, (2) put an entry in a Map<lockid, zk node pathname>, (3) create a
> > RowLock using the lockid, and (4) return from the method.
> >
> > From this point on, any operation could be performed even without passing
> > the RowLock as a parameter: zookeeper plus the implementation of the lock
> > recipe in htable now ensure that no other client performs any operation
> > concurrently on the given row [2].
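The client-side bookkeeping in steps (1)-(4) above can be sketched in plain Java. Everything here is a hypothetical illustration, not HBase code: the class and method names are made up, and the lock-recipe decision is modeled purely over sibling znode names rather than live ZooKeeper calls.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

// Hypothetical sketch of a zk-backed lockRow()'s client-side state.
public class ZkRowLockSketch {

    // Step (2): lockid -> zk node pathname.
    private final Map<Long, String> lockIdToZnode = new HashMap<>();
    private final Random random = new Random();

    // Core decision of the zk lock recipe: after creating an ephemeral
    // sequential node, the client holds the lock iff its node has the
    // lowest sequence number among the lock directory's children.
    public static boolean holdsLock(String myNode, List<String> siblings) {
        List<String> sorted = new ArrayList<>(siblings);
        Collections.sort(sorted);
        return sorted.get(0).equals(myNode);
    }

    // Steps (1)-(3): generate a random lockid, remember which znode it
    // maps to, and hand the id back (it would be wrapped in a RowLock).
    public long registerLock(String znodePath) {
        long lockId = random.nextLong();
        lockIdToZnode.put(lockId, znodePath);
        return lockId;
    }

    // unlockRow(lock): look up the znode to delete and forget the lockid.
    public String releaseLock(long lockId) {
        return lockIdToZnode.remove(lockId);
    }
}
```

In a real implementation, holdsLock would be re-evaluated from a watch on the next-lower sequential node, as the recipe describes, so the client blocks without burning a regionserver RPC thread.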
> > Finally, *htable.unlockRow(lock)* must be invoked, which would make
> > HTable delete the matching zk node (remember the Map<lockid, zk node
> > pathname>).
> >
> > One good thing about this approach is that we don't have to worry about
> > lock leases: if the client dies, zk will notice at some point in the
> > future and release the lock. And if the client forgets to unlock the
> > row, its code is wrong. (:
> >
> > However, if we are to redesign the API, I would propose write and read
> > locks. HTable would then have two methods, HTable.lockRowForRead(row)
> > and HTable.lockRowForWrite(row) [3], and the lock recipe to implement
> > would be the Shared Locks recipe
> > (http://hadoop.apache.org/zookeeper/docs/current/recipes.html#Shared+Locks).
> >
> > [1] We should design carefully how the locknode is created, given zk's
> > constraints on how many nodes it can manage in a single directory. Maybe
> > we should do something like:
> > /hbase/locks/table-name/hash-function(row)/row/{read-, write-}
> >
> > [2] I agree that we are not protecting ourselves from a malicious client
> > using HTable, who could simply "forget" to request the lock for the
> > given row and then mess everything up. But this is how it is everywhere,
> > isn't it?
> >
> > [3] Suggest better method names, please!
> >
> > > St.Ack
> > > On Thu, May 14, 2009 at 9:44 AM, Guilherme Germoglio <
> > > [email protected]> wrote:...
> >
> > --
> > Guilherme
> > msn: [email protected]
> > homepage: http://germoglio.googlepages.com

--
Guilherme
msn: [email protected]
homepage: http://germoglio.googlepages.com
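The shared-locks variant and the bucketed locknode layout from footnote [1] can also be sketched. Again this is only an illustration under assumptions: the names (SharedLockSketch, lockDir, readGranted, writeGranted), the `read-`/`write-` node-name prefixes, and the bucket count are all hypothetical; the grant rules follow the Shared Locks recipe (a read lock waits only on earlier write nodes, a write lock waits on every earlier node).

```java
import java.util.List;

// Hypothetical sketch of shared row locks over zk sequential nodes.
public class SharedLockSketch {

    // Footnote [1]: hash rows into buckets so no single zk directory
    // has to hold every locknode, e.g. /hbase/locks/table/bucket/row.
    public static String lockDir(String table, String row, int buckets) {
        int bucket = Math.floorMod(row.hashCode(), buckets);
        return "/hbase/locks/" + table + "/" + bucket + "/" + row;
    }

    // Sequence number of a node named like "read-0000000002".
    static long seq(String node) {
        return Long.parseLong(node.substring(node.lastIndexOf('-') + 1));
    }

    // A read lock is granted once no *write* node has a lower sequence
    // number, so concurrent readers don't block each other.
    public static boolean readGranted(String myNode, List<String> siblings) {
        long mine = seq(myNode);
        for (String s : siblings) {
            if (s.startsWith("write-") && seq(s) < mine) return false;
        }
        return true;
    }

    // A write lock must have the lowest sequence number of all nodes.
    public static boolean writeGranted(String myNode, List<String> siblings) {
        long mine = seq(myNode);
        for (String s : siblings) {
            if (!s.equals(myNode) && seq(s) < mine) return false;
        }
        return true;
    }
}
```

As with the exclusive lock, a real client would set a watch on the closest blocking node and re-check when it goes away, and the ephemeral nodes give the lease-free cleanup described above.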
