Hi,

Just wanted to update that the suggested fixes are now in the hbase branch. Please let me know if further improvements are possible. Changes made as per the suggestions:

* Use concurrent hash map
* Use AtomicLong
* Remove lock

Results, using the HBase test utility: 3000 rows of auto-increment + put + get take 4 to 4.5 s (performance improved by 90% by using the concurrent API), and 1.5 s without auto-increment.

Thanks a lot,

Imran

On Wed, Oct 20, 2010 at 9:10 AM, Imran M Yousuf <[email protected]> wrote:
> Hi Ryan,
>
> Thanks a lot for your feedback, please find some clarifications and
> queries inline below.
>
> On Wed, Oct 20, 2010 at 6:41 AM, Ryan Rawson <[email protected]> wrote:
>> One should never* call lockRow(), and prefer to do something else
>> instead. CheckAndPut works like CompareAndSet (we just call it Put
>> since that is what you are doing in our API, putting), and there is
>> also the incrementColumnValue() call.
>>
>
> I did see the incrementColumnValue in HTableInterface, but I wanted to
> avoid needing to perform an additional operation on HBase for an
> insert not knowing its performance issues; have you used it and would
> you recommend it? Singleton with AtomicLong would solve the problem
> without any HBase operation being needed, what do you think about
> that?
>
>> I'm not really following your code (I'm also sick), but why not just
>> do something like this:
>
> Praying for your speedy recovery.
>
>> - Table: Sequences
>> rowid: table_name column: id value: sequence
>>
>> So you just call:
>> table.incrementColumnValue("Sequences",
>> tableNameThatYouWantSequenceFor, "id", 1);
>>
>> and the result is your sequence id to use as a primary key. No need
>> to worry about non-existent values, the call creates the value, so the
>> sequence starts at 1 always.
>>
>> -ryan
>>
>> * ok you can call lockRow, but be aware that your mileage may vary, you
>> reduce the performance of HBase, and generally can cause a lot of
>> problems. Eg: you can DOS yourself!
>>
>
> Yes the DOS is a worrying issue, since I faced it due to a bug in my code
> in test where I did not unlock a row upon PUT, so planning to avoid it
> altogether.
>
> Thank you,
>
> Imran
>
>> On Tue, Oct 19, 2010 at 12:39 PM, tsuna <[email protected]> wrote:
>>> I would like to add that you can probably get rid of RowLock and use
>>> checkAndPut instead to atomically create the row if it doesn't already
>>> exist. This would probably solve the last problem I outlined where 2
>>> different instances of your web service attempt to assign the same ID
>>> at the same time. The code would also be simpler and more efficient.
>>>
>>> --
>>> Benoit "tsuna" Sigoure
>>> Software Engineer @ www.StumbleUpon.com
>>>
>>
>
> --
> Imran M Yousuf
> Entrepreneur & CEO
> Smart IT Engineering Ltd.
> Dhaka, Bangladesh
> Twitter: @imyousuf - http://twitter.com/imyousuf
> Blog: http://imyousuf-tech.blogs.smartitengineering.com/
> Mobile: +880-1711402557
>

--
Imran M Yousuf
Twitter: @imyousuf - http://twitter.com/imyousuf
Blog: http://imyousuf-tech.blogs.smartitengineering.com/
Mobile: +880-1711402557
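
For reference, the lock-free approach listed at the top of this message (concurrent hash map plus AtomicLong, no row lock) could look roughly like the sketch below. This is only an illustration and not the code in the branch: the class name `SequenceGenerator` and the method `nextId` are hypothetical, and the counters are assumed to live purely in the JVM's memory. That means the sketch does not coordinate IDs across several service instances and does not survive a restart; the HBase-side `incrementColumnValue()` call discussed in the thread would be needed for that.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch: one in-memory counter per table name, no explicit locks.
final class SequenceGenerator {

    // Maps a table name to its own counter; safe for concurrent access.
    private final ConcurrentMap<String, AtomicLong> counters =
            new ConcurrentHashMap<>();

    // Returns the next ID for the given table; the first call for a table returns 1.
    long nextId(String tableName) {
        // computeIfAbsent creates the counter atomically the first time a table is seen,
        // so two threads racing on a new table still share a single AtomicLong.
        return counters
                .computeIfAbsent(tableName, name -> new AtomicLong(0))
                .incrementAndGet();
    }
}
```

Each table name gets an independent sequence, so calling `nextId("users")` twice and then `nextId("orders")` once yields 1, 2 and 1 respectively.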
