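In addition to what everybody else said, look at *where* the regions are for the target table. There may be 5 regions (for example), but check whether they are all on the same RegionServer (RS). If they are, adding nodes will not help, because a single server is still taking all the writes.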
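To make that check concrete, here is a rough sketch of the tally I have in mind. It does not call HBase at all: the server names and the region-to-server list are made up, and you would fill them in by hand from the master web UI or whatever tooling you already use. The helper names (`regions_per_server`, `idle_servers`) are just for illustration.

```python
from collections import Counter

def regions_per_server(assignments):
    """Count how many of the table's regions each region server hosts."""
    return Counter(server for _region, server in assignments)

def idle_servers(assignments, all_servers):
    """Return the servers that host no region of the table."""
    counts = regions_per_server(assignments)
    return [s for s in all_servers if counts[s] == 0]

# Hypothetical cluster: three region servers, five regions all on rs1.
servers = ["rs1", "rs2", "rs3"]
assignments = [(f"region-{i}", "rs1") for i in range(1, 6)]

for server in servers:
    print(server, regions_per_server(assignments)[server])
print("idle:", idle_servers(assignments, servers))
```

If most of the servers show up as idle for this table, that would explain why adding nodes made no difference.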

On 1/6/14 5:45 AM, "Nicolas Liochon" <[email protected]> wrote:

>It's very strange that you don't see a perf improvement when you increase
>the number of nodes.
>Nothing in what you've done change the performances at the end?
>
>You may want to check:
> - the number of regions for this table. Are all the region server busy?
>   Do you have some split on the table?
> - How much data you actually write. Is the compression enabled on this
>   table?
> - Do you have compactions? You may want to change the max store file
>   settings for unfrequent write load (see
>   http://gbif.blogspot.fr/2012/07/optimizing-writes-in-hbase.html).
>
>It would be interesting to test as well the 0.96 release.
>
>On Sun, Jan 5, 2014 at 2:12 AM, Vladimir Rodionov
><[email protected]> wrote:
>
>> I think in this case, writing data to HDFS or HFile directly (for
>> subsequent bulk loading) is the best option. HBase will never compete
>> in write speed with HDFS.
>>
>> Best regards,
>> Vladimir Rodionov
>> Principal Platform Engineer
>> Carrier IQ, www.carrieriq.com
>> e-mail: [email protected]
>>
>> ________________________________________
>> From: Ted Yu [[email protected]]
>> Sent: Saturday, January 04, 2014 2:33 PM
>> To: [email protected]
>> Subject: Re: Hbase Performance Issue
>>
>> There're 8 items under:
>> http://hbase.apache.org/book.html#perf.writing
>>
>> I guess you have through all of them :-)
>>
>> On Sat, Jan 4, 2014 at 1:34 PM, Akhtar Muhammad Din
>> <[email protected]> wrote:
>>
>> > Thanks guys for your precious time.
>> > Vladimir, as Ted rightly said i want to improve write performance
>> > currently (of course i want to read data as fast as possible later on)
>> > Kevin, my current understanding of bulk load is that you generate
>> > StoreFiles and later load through a command line program. I dont want
>> > to do any manual step. Our system is getting data after every 15
>> > minutes, so requirement is to automate it through client API completely.
