Thank you, Doug. One point is still unclear to me. My original question was: why would batch updates resolve (or at least reduce) the performance problem caused by multiple clients contending to update the same row? Do you have any ideas or comments?
regards,
Lin

On Fri, Sep 7, 2012 at 2:26 AM, Doug Meil <[email protected]> wrote:

>
> For the 2nd part of the question, if you have 10 Puts it's more
> efficient to send a single RS message with 10 Puts than send 10 RS messages
> with 1 Put apiece. There are 2 words to be careful with, and those are
> "always" and "never", because there is an exception: if you are using the
> client writeBuffer and each of those 10 Puts are going to a different
> RegionServer, then you haven't really gained much.
>
> To answer the next question of how you know where the Puts are going,
> see this method…
>
> http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/HTable.html#getRegionLocation%28byte[],%20boolean%29
>
> Because the Hbase client talks directly to each RS, it has to know the
> region boundaries.
>
> From: Lin Ma <[email protected]>
> Date: Thursday, September 6, 2012 11:54 AM
> To: "[email protected]" <[email protected]>, Doug Meil <[email protected]>
> Cc: "[email protected]" <[email protected]>
> Subject: Re: batch update question
>
> Thank you Doug,
>
> Very effective reply. :-)
>
> - why batch update could resolve contention issue on the same row? Could
> you elaborate a bit more or show me an example?
> - Batch update always have good performance compared to single update
> (when we measure total throughput)?
>
> regards,
> Lin
>
> On Thu, Sep 6, 2012 at 12:59 AM, Doug Meil <[email protected]> wrote:
>
>> Hi there, if you look in the source code for HTable there is a list of Put
>> objects. That's the buffer, and it's a client-side buffer.
>>
>> On 9/5/12 12:04 PM, "Lin Ma" <[email protected]> wrote:
>>
>> >Thank you Stack for the details directions!
>> >
>> >1. You are right, I have not met with any real row contention issues. My
>> >purpose is understanding the issue in advance, and also from this issue to
>> >understand HBase generals better;
>> >2. For the comments from API Url page you referred -- "If
>> >isAutoFlush<http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/HTableInterface.html#isAutoFlush%28%29> is
>> >false, the update is buffered until the internal buffer is full.", I am
>> >confused what is the buffer? Buffer at client side or buffer in region
>> >server? Is there a way to configure its size to hold until flushing?
>> >3. Why batch could resolve contention on the same raw issue in theory,
>> >compared to non-batch operation? Besides preparation the solution in my
>> >mind in advance, I want to learn a bit about why. :-)
>> >
>> >regards,
>> >Lin
>> >
>> >On Wed, Sep 5, 2012 at 4:00 AM, Stack <[email protected]> wrote:
>> >
>> >> On Sun, Sep 2, 2012 at 2:13 AM, Lin Ma <[email protected]> wrote:
>> >> > Hello guys,
>> >> >
>> >> > I am reading the book "HBase, the definitive guide", at the beginning of
>> >> > chapter 3, it is mentioned in order to reduce performance impact for
>> >> > clients to update the same row (lock contention issues for automatic
>> >> > write), batch update is preferred. My questions is, for MR job, what are
>> >> > the batch update methods we could leverage to resolve the issue? And for
>> >> > API client, what are the batch update methods we could leverage to resolve
>> >> > the issue?
>> >> >
>> >>
>> >> Do you actually have a problem where there is contention on a single row?
>> >>
>> >> Use methods like
>> >>
>> >> http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/HTable.html#put(java.util.List)
>> >> or the batch methods listed earlier in the API. You should set
>> >> autoflush to false too:
>> >>
>> >> http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/HTableInterface.html#isAutoFlush()
>> >>
>> >> Even batching, a highly contended row might hold up inserts... but for
>> >> sure you actually have this problem in the first place?
>> >>
>> >> St.Ack
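
To make Doug's point about message counts concrete, here is a minimal sketch in plain Java. It is a toy model, not the HBase client API: the class WriteBufferSketch, its methods, the server names, and the region start keys are all hypothetical, and it assumes one message per RegionServer per flush. It only counts messages for a client-side buffer with autoflush on or off, and it does not model row locks on the server, so it does not by itself answer the same-row contention question.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Toy model only: NOT the HBase client API. Every name here is hypothetical.
class WriteBufferSketch {
    // Region start key -> server name: stands in for the region boundaries the client knows.
    private final TreeMap<String, String> regionStartToServer;
    private final List<String> buffer = new ArrayList<>(); // row keys of pending "Puts"
    private final int bufferLimit;
    private final boolean autoFlush;
    private int messagesSent = 0;

    WriteBufferSketch(Map<String, String> regions, int bufferLimit, boolean autoFlush) {
        this.regionStartToServer = new TreeMap<>(regions);
        this.bufferLimit = bufferLimit;
        this.autoFlush = autoFlush;
    }

    // Find the server whose region start key is the greatest one <= rowKey.
    String locate(String rowKey) {
        Map.Entry<String, String> region = regionStartToServer.floorEntry(rowKey);
        if (region == null) {
            throw new IllegalArgumentException("no region covers row " + rowKey);
        }
        return region.getValue();
    }

    // Buffer the Put; send right away if autoflush is on or the buffer is full.
    void put(String rowKey) {
        buffer.add(rowKey);
        if (autoFlush || buffer.size() >= bufferLimit) {
            flush();
        }
    }

    // Assumption: one message per distinct server that has at least one buffered Put.
    void flush() {
        Set<String> servers = new HashSet<>();
        for (String rowKey : buffer) {
            servers.add(locate(rowKey));
        }
        messagesSent += servers.size();
        buffer.clear();
    }

    int messagesSent() {
        return messagesSent;
    }

    // Write all rows through a fresh client and return how many messages it sent.
    static int run(Map<String, String> regions, List<String> rows, boolean autoFlush) {
        WriteBufferSketch client = new WriteBufferSketch(regions, rows.size(), autoFlush);
        for (String rowKey : rows) {
            client.put(rowKey);
        }
        client.flush(); // send anything still buffered
        return client.messagesSent();
    }

    public static void main(String[] args) {
        List<String> rows = List.of("a", "b", "c", "d", "e", "f", "g", "h", "i", "j");
        Map<String, String> oneServer = Map.of("", "rs0");
        Map<String, String> tenServers = new TreeMap<>();
        tenServers.put("", "rs0");
        for (char c = 'b'; c <= 'j'; c++) {
            tenServers.put(String.valueOf(c), "rs" + (c - 'a'));
        }
        System.out.println("1 server, autoflush on:   " + run(oneServer, rows, true));
        System.out.println("1 server, autoflush off:  " + run(oneServer, rows, false));
        System.out.println("10 servers, autoflush off: " + run(tenServers, rows, false));
    }
}
```

In this model, buffering saves messages only when several buffered Puts land on the same server. When each of the 10 rows maps to a different server, the buffered client still sends 10 messages, which is the exception Doug describes.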
