[ https://issues.apache.org/jira/browse/HBASE-3382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12973948#action_12973948 ]

ryan rawson commented on HBASE-3382:
------------------------------------

There are a number of issues going on here, and several solutions for them:

- During scans we need at least 3 RPCs (openScanner, next, close) just to do 
work.  If a lightweight RPC (e.g. scanner close) gets stuck behind a big 
response, that increases its latency.
- The reads from the network seem really slow.  Ping time is 0.1 ms (100 
microseconds!).  I tried setting 256k send/recv buffers on both sides but that 
made no difference.
- Even at the theoretical gigE maximum of 110-120 MB/sec, we would have to wait 
about 41 ms to receive a 5 MB response in the best case (back-of-envelope sketch 
after this list).  Without chunking and interleaving of responses we will be 
held up behind the big responses.
- We are using old blocking I/O; it's possible the NIO APIs are more efficient 
at reading chunks of data.  I am not sure though, and it's hard to find good 
information about this online.
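
To put rough numbers on the buffer tuning and the 41 ms figure above, here is a 
back-of-envelope sketch (purely illustrative code, nothing from the client; the 
class name and constants are mine):

    import java.net.Socket;

    public class WireBackOfEnvelope {
      public static void main(String[] args) throws Exception {
        // Best case: a 5 MB response over gigE at ~120 MB/sec.
        double responseBytes = 5.0 * 1024 * 1024;
        double lineRateBytesPerSec = 120.0 * 1024 * 1024;
        System.out.printf("best-case transfer time: %.1f ms%n",
            responseBytes / lineRateBytesPerSec * 1000);   // ~41.7 ms

        // The tuning that made no visible difference: 256k send/recv
        // buffers on the client socket (server side analogous).
        Socket s = new Socket();
        s.setSendBufferSize(256 * 1024);
        s.setReceiveBufferSize(256 * 1024);
        s.close();
      }
    }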

So there are a few avenues to investigate:
- reduce the number of RPC calls needed for scans
- use nio instead of oio for more efficient reads (sketch below)
- find some way of interleaving responses, either via multiple sockets or by 
chunking responses on the wire
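
For the nio option, something along these lines is what I have in mind -- just a 
sketch of reading a length-prefixed response payload off a blocking SocketChannel 
into a reusable direct buffer, not working client code (the class and method 
names are made up):

    import java.io.IOException;
    import java.nio.ByteBuffer;
    import java.nio.channels.SocketChannel;

    public class NioResponseReader {
      // One reusable direct buffer per connection instead of an
      // allocate-per-response heap buffer.
      private final ByteBuffer buf = ByteBuffer.allocateDirect(256 * 1024);

      /** Read exactly len payload bytes from a blocking channel. */
      public byte[] readResponse(SocketChannel ch, int len) throws IOException {
        byte[] out = new byte[len];
        int copied = 0;
        while (copied < len) {
          buf.clear();
          buf.limit(Math.min(buf.capacity(), len - copied));
          int n = ch.read(buf);
          if (n < 0) throw new IOException("connection closed mid-response");
          buf.flip();
          buf.get(out, copied, n);
          copied += n;
        }
        return out;
      }
    }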


> Make HBase client work better under concurrent clients
> ------------------------------------------------------
>
>                 Key: HBASE-3382
>                 URL: https://issues.apache.org/jira/browse/HBASE-3382
>             Project: HBase
>          Issue Type: Bug
>            Reporter: ryan rawson
>         Attachments: HBASE-3382.txt
>
>
> The HBase client uses 1 socket per regionserver for communication.  This is 
> good for socket control but potentially bad for latency.  How bad?  I did a 
> simple YCSB test that had this config:
>  readproportion=0
>  updateproportion=0
>  scanproportion=1
>  insertproportion=0
>  fieldlength=10
>  fieldcount=100
>  requestdistribution=zipfian
>  scanlength=300
>  scanlengthdistribution=zipfian
> I ran this with 1 and 10 threads.  The summary is as follows:
> 1 thread:
> [SCAN]         Operations     1000
> [SCAN]         AverageLatency(ms)     35.871
> 10 threads:
> [SCAN]         Operations     1000
> [SCAN]         AverageLatency(ms)     228.576
> We are taking a roughly 6.4x latency hit in our client.  But why?
> First step was to move the deserialization out of the Connection thread.  This 
> seemed like it could be a big win; an analogous change on the server side got a 
> 20% performance improvement (already committed as HBASE-2941).  I did this and 
> got about a 20% improvement again, with that 228 ms number dropping to about 
> 190 ms.
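> To make the shape of that change concrete (the helper names below are 
> illustrative, not the actual client classes): the Connection thread now only 
> copies the raw bytes off the wire and hands them to the waiting caller, which 
> does the Writable deserialization itself.  Roughly:
>         // reader thread: just buffer the payload and wake the caller
>         byte[] raw = new byte[size];
>         IOUtils.readFully(in, raw, 0, size);
>         call.setRawResponse(raw);   // hypothetical setter on Call
>         // caller thread, once its wait returns: deserialize off the reader
>         Writable value = deserializeResponse(call.getRawResponse());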
> So I then wrote a high-performance, nanosecond-resolution tracing utility.  
> Clients can flag an API call, and we get tracing and numbers through the 
> client pipeline.  What I found is that a lot of time is being spent receiving 
> the response from the network.  The relevant code block looks like this:
>         NanoProfiler.split(id, "receiveResponse");
>         if (LOG.isDebugEnabled())
>           LOG.debug(getName() + " got value #" + id);
>         Call call = calls.get(id);
>         size -= 4;  // 4 byte off for id because we already read it.
>         ByteBuffer buf = ByteBuffer.allocate(size);
>         IOUtils.readFully(in, buf.array(), buf.arrayOffset(), size);
>         buf.limit(size);
>         buf.rewind();
>         NanoProfiler.split(id, "setResponse", "Data size: " + size);
> I came up with some numbers:
> 11726 (receiveResponse) split: 64991689 overall: 133562895 Data size: 4288937
> 12163 (receiveResponse) split: 32743954 overall: 103787420 Data size: 1606273
> 12561 (receiveResponse) split: 3517940 overall: 83346740 Data size: 4
> 12136 (receiveResponse) split: 64448701 overall: 203872573 Data size: 3570569
> The first number is the internal counter used to keep requests unique from 
> HTable on down.  The split and overall times are in ns; the data size is in 
> bytes.
> Doing some simple calculations, we see that for the first line we were reading 
> at only about 31 MB/sec.  The second one is even worse.  Other calls look like:
> 26 (receiveResponse) split: 7985400 overall: 21546226 Data size: 850429
> which is about 107 MB/sec, pretty close to the gigE maximum.  In my setup, the 
> YCSB client ran on the master node and HAD to go over the network to talk to 
> regionservers.
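> (As a sanity check on those rates -- this is just bytes over elapsed seconds, 
> using the overall time for the slow line and the split time for the fast one:)
>         double slowMBps = 4288937 / (133562895 / 1e9) / 1e6;  // roughly 31-32 MB/sec
>         double fastMBps = 850429 / (7985400 / 1e9) / 1e6;     // roughly 107 MB/sec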
> Even at full line rate, we could still see unacceptable hold ups of unrelated 
> calls that just happen to need to talk to the same regionserver.
> This issue is about these findings: what to do and how to improve. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
