Arrghhh. I meant *not* the case.
________________________________
From: lars hofhansl <[email protected]>
To: "[email protected]" <[email protected]>
Sent: Saturday, March 2, 2013 12:50 PM
Subject: Re: HBase Thrift inserts bottlenecked somewhere -- but where?

Dan already established that that is the case.

________________________________
From: Ted Yu <[email protected]>
To: [email protected]
Sent: Saturday, March 2, 2013 12:02 PM
Subject: Re: HBase Thrift inserts bottlenecked somewhere -- but where?

Asaf made a good point. See this JIRA where Nick did similar optimization:
HBASE-7747 Import tools should use a combiner to merge Puts

Cheers

On Sat, Mar 2, 2013 at 11:56 AM, Asaf Mesika <[email protected]> wrote:

> Make sure you are not sending a lot of put of the same rowkey. This can
> cause contention in the region server side. We fixed that in our project by
> aggregating all the columns for the same rowkey into the same Put object
> thus when sending List of Put we made sure each Put has a unique rowkey.
>
> On Saturday, March 2, 2013, Dan Crosta wrote:
>
> > On Mar 2, 2013, at 12:38 PM, lars hofhansl wrote:
> > > "That's only true from the HDFS perspective, right? Any given region is
> > > "owned" by 1 of the 6 regionservers at any given time, and writes are
> > > buffered to memory before being persisted to HDFS, right?"
> > >
> > > Only if you disabled the WAL, otherwise each change is written to the
> > > WAL first, and then committed to the memstore.
> > > So in the sense it's even worse. Each edit is written twice to the FS,
> > > replicated 3 times, and all that only 6 data nodes.
> >
> > Are these writes synchronized somehow? Could there be a locking problem
> > somewhere that wouldn't show up as utilization of disk or cpu?
> >
> > What is the upshot of disabling WAL -- I assume it means that if a
> > RegionServer crashes, you lose any writes that it has in memory but not
> > committed to HFiles?
> >
> > > 20k writes does seem a bit low.
> >
> > I adjusted dfs.datanode.handler.count from 3 to 10 and now we're up to
> > about 22-23k writes per second, but still no apparent contention for any of
> > the basic system resources.
> >
> > Any other suggestions on things to try?
> >
> > Thanks,
> > - Dan
>
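
For anyone reading this thread later: the aggregation Asaf describes (one Put per unique rowkey) might look roughly like the sketch below. It is client-agnostic and makes several assumptions: the function name `merge_by_rowkey` and the `(rowkey, column, value)` tuple layout are illustrative only and are not part of HBase or its Thrift API, and when the same column shows up twice for a row the later value simply replaces the earlier one.

```python
def merge_by_rowkey(cells):
    """Group (rowkey, column, value) cells so that each rowkey appears once.

    Hypothetical helper: returns a list of (rowkey, {column: value}) pairs
    in first-seen rowkey order. Each pair is meant to become one Put.
    """
    merged = {}
    for rowkey, column, value in cells:
        # Assumption: a later value for the same rowkey/column overwrites
        # the earlier one instead of being sent as a separate Put.
        merged.setdefault(rowkey, {})[column] = value
    return list(merged.items())


# Three cells, two of which target the same rowkey.
cells = [
    ("row1", "cf:a", "1"),
    ("row2", "cf:a", "2"),
    ("row1", "cf:b", "3"),
]
batch = merge_by_rowkey(cells)
```

Here `batch` holds two entries rather than three, so a list of Puts built from it would carry each rowkey exactly once. Whether this helps in Dan's case depends on how often his batches actually repeat a rowkey.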
