Thomas: Have you set tcpnodelay to true? See http://hbase.apache.org/book.html for an explanation of hbase.ipc.client.tcpnodelay.
Cheers

On Thu, Aug 14, 2014 at 11:41 AM, Thomas Kwan <[email protected]> wrote:
> Hi Esteban,
>
> Thanks for sharing ideas.
>
> We are on Hbase 0.96 and java 1.6. I have enabled short-circuit read,
> and heap size is around 16G for each region server. We have about 20
> of them.
>
> The list of rowkeys that I need to process is about 10M. I am using
> batch gets already and the batch size is ~2000 gets.
>
> thomas
>
> On Thu, Aug 14, 2014 at 11:01 AM, Esteban Gutierrez
> <[email protected]> wrote:
> > Hello Thomas,
> >
> > What version of HBase are you using? sorting and grouping based on the
> > regions the rows is going to help for sure. I don't think you should focus
> > too much in the locality side of the problem unless your HDFS input set is
> > too large (100s or 1000s of MBs per task), otherwise it might be faster to
> > load in-memory the input dataset and do the batched calls. As discussed in
> > this mailing list recently there are too many factors that might be
> > involved in the performance: number of threads or tasks, size of the row,
> > RS resources, configurations, etc. so any additional info would be very
> > helpful.
> >
> > cheers,
> > esteban.
> >
> > --
> > Cloudera, Inc.
> >
> > On Thu, Aug 14, 2014 at 10:32 AM, Thomas Kwan <[email protected]>
> > wrote:
> >
> >> Hi there
> >>
> >> I have a use-case where I need to do a read to check if a hbase entry
> >> is present, then I do a put to create the entry when it is not there.
> >>
> >> I have a script to get a list of rowkeys from hive and put them on a
> >> HDFS directory. Then I have a MR job that reads the rowkeys and do
> >> batch reads. I am getting around 1.5K requests per second.
> >>
> >> To attempt to make this faster, I am wondering if I can
> >>
> >> - sort and group the rowkeys based on regions
> >> - make the MR jobs run on regions that have the data locally
> >>
> >> Scan or TableInputFormat must have some codes to do something similar
> >> right?
> >>
> >> thanks
> >> thomas
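
For the "sort and group the rowkeys based on regions" idea discussed in the thread, here is a rough sketch of the grouping step in plain Java, with no HBase dependency. It is only an illustration under some assumptions: RegionGrouper, groupByRegion and toBatches are made-up names, row keys are treated as ASCII strings (HBase itself compares raw bytes), and the list of region start keys is assumed to have been fetched beforehand, for example via HTable.getStartKeys() on the client, with the empty start key of the first region included.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of HBase: buckets row keys by region start key.
public class RegionGrouper {

    // A row key belongs to the region with the greatest start key <= the row key.
    // regionStartKeys is assumed to contain "" for the first region.
    public static TreeMap<String, List<String>> groupByRegion(
            List<String> rowKeys, List<String> regionStartKeys) {
        TreeMap<String, List<String>> groups = new TreeMap<String, List<String>>();
        for (String startKey : regionStartKeys) {
            groups.put(startKey, new ArrayList<String>());
        }
        for (String rowKey : rowKeys) {
            Map.Entry<String, List<String>> region = groups.floorEntry(rowKey);
            if (region == null) {
                throw new IllegalArgumentException("No region start key <= " + rowKey);
            }
            region.getValue().add(rowKey);
        }
        // Sort within each region so each batch touches a contiguous key range.
        for (List<String> keys : groups.values()) {
            Collections.sort(keys);
        }
        return groups;
    }

    // Splits one region's keys into batches of at most batchSize keys.
    public static List<List<String>> toBatches(List<String> keys, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<String>> batches = new ArrayList<List<String>>();
        for (int i = 0; i < keys.size(); i += batchSize) {
            int end = Math.min(i + batchSize, keys.size());
            batches.add(new ArrayList<String>(keys.subList(i, end)));
        }
        return batches;
    }
}
```

Each batch could then be turned into a list of Gets and sent in one batched call, so that a given batch goes to a single region rather than being spread over all ~20 region servers. Whether this actually beats the current ~2000-get batches would have to be measured.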
