[jira] Created: (HBASE-3382) Make HBase client work better under concurrent clients

ryan rawson (JIRA) Tue, 21 Dec 2010 14:30:23 -0800

Make HBase client work better under concurrent clients
------------------------------------------------------


                 Key: HBASE-3382
                 URL: https://issues.apache.org/jira/browse/HBASE-3382
             Project: HBase
          Issue Type: Bug
            Reporter: ryan rawson


The HBase client uses one socket per regionserver for communication.  This is 
good for socket control but potentially bad for latency.  How bad?  I ran a 
simple YCSB test with this config:

 readproportion=0
 updateproportion=0
 scanproportion=1
 insertproportion=0

 fieldlength=10
 fieldcount=100

 requestdistribution=zipfian
 scanlength=300
 scanlengthdistribution=zipfian


I ran this with 1 and 10 threads.  Here is the summary:

1 thread:
[SCAN]   Operations     1000
[SCAN]   AverageLatency(ms)     35.871

10 threads:
[SCAN]   Operations     1000
[SCAN]   AverageLatency(ms)     228.576

We are taking a roughly 6.4x latency hit in our client.  But why?

The first step was to move the deserialization out of the Connection thread.  
This seemed like it could be a big win: an analogous change on the server side 
got a 20% performance improvement (already committed as HBASE-2941).  I did 
this and again got about a 20% improvement, with that 228 ms number going to 
about 190 ms.
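
As a rough illustration of that first step (not the actual HBase patch), the 
sketch below shows a connection thread that only hands raw bytes to the 
waiting caller, so that each caller thread pays its own deserialization cost.  
The class and method names are hypothetical, the String decode is a stand-in 
for real deserialization, and CompletableFuture is modern Java rather than 
anything the client used at the time.

```java
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: keep the shared connection thread free of parsing work.
final class DeferredDeserialization {
    // Outstanding calls, keyed by call id.
    private final Map<Integer, CompletableFuture<byte[]>> pending =
        new ConcurrentHashMap<>();

    /** Caller thread: registers a call before the request is sent. */
    CompletableFuture<byte[]> register(int id) {
        CompletableFuture<byte[]> future = new CompletableFuture<>();
        pending.put(id, future);
        return future;
    }

    /** Connection thread: passes on the raw bytes without parsing them. */
    void onResponse(int id, byte[] raw) {
        CompletableFuture<byte[]> future = pending.remove(id);
        if (future != null) {
            future.complete(raw);
        }
    }

    /** Caller thread: waits for the bytes, then deserializes them itself. */
    static String awaitAndDecode(CompletableFuture<byte[]> future) {
        byte[] raw = future.join();
        // Stand-in for real response deserialization.
        return new String(raw, StandardCharsets.UTF_8);
    }
}
```

With this shape the connection thread can go straight back to reading the 
next response while the callers decode in parallel.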

So I then wrote a high-performance, nanosecond-resolution tracing utility.  
Clients can flag an API call, and we get tracing and timings through the 
client pipeline.  What I found is that a lot of time is being spent receiving 
the response from the network.  The code block looks like this:

        NanoProfiler.split(id, "receiveResponse");
        if (LOG.isDebugEnabled())
          LOG.debug(getName() + " got value #" + id);

        Call call = calls.get(id);

        size -= 4;  // 4 byte off for id because we already read it.

        ByteBuffer buf = ByteBuffer.allocate(size);

        IOUtils.readFully(in, buf.array(), buf.arrayOffset(), size);

        buf.limit(size);
        buf.rewind();
        NanoProfiler.split(id, "setResponse", "Data size: " + size);
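
The NanoProfiler source is not part of this message, so the following is only 
a guess at how a split timer of this kind might work.  It assumes each split 
call closes the current stage and reports the time since the previous split 
plus the time since the call started.  SplitTimer, its methods, and the 
injected clock are all hypothetical names.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.LongSupplier;

// Hypothetical sketch of a per-call split timer; not the real NanoProfiler.
final class SplitTimer {
    private static final class Trace {
        long start;   // time the call was first seen
        long last;    // time of the previous split
        String label; // name of the stage currently in progress
    }

    private final Map<Integer, Trace> traces = new ConcurrentHashMap<>();
    private final LongSupplier clock; // e.g. System::nanoTime

    SplitTimer(LongSupplier clock) {
        this.clock = clock;
    }

    /** Begins tracing call id at the named stage. */
    void start(int id, String label) {
        Trace t = new Trace();
        t.start = clock.getAsLong();
        t.last = t.start;
        t.label = label;
        traces.put(id, t);
    }

    /** Closes the current stage, opens the next, and reports the closed one. */
    String split(int id, String nextLabel) {
        Trace t = traces.get(id);
        if (t == null) {
            throw new IllegalStateException("call " + id + " was never started");
        }
        long now = clock.getAsLong();
        String line = id + " (" + t.label + ") split: " + (now - t.last)
            + " overall: " + (now - t.start);
        t.last = now;
        t.label = nextLabel;
        return line;
    }
}
```

In real use the clock would be System::nanoTime; injecting it just makes the 
sketch easy to check with fixed timestamps.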

I came up with some numbers:
11726 (receiveResponse) split: 64991689 overall: 133562895 Data size: 4288937
12163 (receiveResponse) split: 32743954 overall: 103787420 Data size: 1606273
12561 (receiveResponse) split: 3517940 overall: 83346740 Data size: 4
12136 (receiveResponse) split: 64448701 overall: 203872573 Data size: 3570569

The first number is the internal counter that keeps requests unique from 
HTable on down.  The times are in ns and the data size is in bytes.

Doing some simple calculations (data size divided by split time), we see for 
the first line we were reading at about 66 MB/sec.  The second one is even 
worse, at about 49 MB/sec.  Other calls are like:

26 (receiveResponse) split: 7985400 overall: 21546226 Data size: 850429

That is about 107 MB/sec, which is pretty close to the maximum of gigabit 
Ethernet.  In my setup, the YCSB client ran on the master node and HAD to use 
the network to talk to regionservers.
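
For anyone who wants to redo the arithmetic, here is a small sketch.  It 
assumes throughput means data size divided by the receiveResponse split time, 
which is the method that reproduces the 107 MB/sec figure, and it takes 
1 MB as 10^6 bytes.  The class name is hypothetical.

```java
// Hypothetical helper: throughput implied by one profiler line.
final class SplitThroughput {
    /** Bytes received in splitNanos, expressed in MB/sec (1 MB = 10^6 bytes). */
    static double mbPerSec(long bytes, long splitNanos) {
        double seconds = splitNanos / 1e9;
        return bytes / 1e6 / seconds;
    }

    public static void main(String[] args) {
        // {data size in bytes, split in ns}, copied from the lines above.
        long[][] samples = {
            {4288937L, 64991689L},
            {1606273L, 32743954L},
            {3570569L, 64448701L},
            {850429L, 7985400L},
        };
        for (long[] s : samples) {
            System.out.printf("%d bytes in %d ns: %.1f MB/sec%n",
                s[0], s[1], mbPerSec(s[0], s[1]));
        }
    }
}
```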

Even at full line rate, we could still see unacceptable holdups of unrelated 
calls that just happen to need to talk to the same regionserver.
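
To put a number on that, the sketch below estimates how long one response 
occupies the shared socket at line rate.  It assumes the theoretical gigabit 
Ethernet rate of 125 MB/sec and ignores protocol overhead, so real holdups 
would be somewhat longer.  The class name is hypothetical.

```java
// Hypothetical helper: time a single response ties up a shared socket.
final class SharedSocketDelay {
    // 1 Gbit/sec expressed in bytes/sec; theoretical, no protocol overhead.
    static final double GIGE_BYTES_PER_SEC = 1e9 / 8;

    /** Milliseconds needed to receive a response of the given size at line rate. */
    static double holdMillis(long bytes) {
        return bytes / GIGE_BYTES_PER_SEC * 1000.0;
    }
}
```

For the 4288937 byte response above this comes to about 34 ms, and a small 
response queued behind it on the same socket, such as the 4 byte one, cannot 
be read until that transfer finishes.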

This issue is for tracking these findings and working out what to do and how 
to improve.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
