[
https://issues.apache.org/jira/browse/HBASE-3382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14266313#comment-14266313
]
Cosmin Lehene commented on HBASE-3382:
--------------------------------------
[~ryanobjc], [~apurtell] this is an improvement; can it be closed, or
transformed into a set of updated issues if it is still the case?
> Make HBase client work better under concurrent clients
> ------------------------------------------------------
>
> Key: HBASE-3382
> URL: https://issues.apache.org/jira/browse/HBASE-3382
> Project: HBase
> Issue Type: Bug
> Components: Performance
> Reporter: ryan rawson
> Assignee: ryan rawson
> Labels: delete
> Attachments: HBASE-3382-nio.txt, HBASE-3382.txt
>
>
> The HBase client uses a single socket per regionserver for communication. This is
> good for socket control but potentially bad for latency. How bad? I ran a
> simple YCSB test with this config:
> readproportion=0
> updateproportion=0
> scanproportion=1
> insertproportion=0
> fieldlength=10
> fieldcount=100
> requestdistribution=zipfian
> scanlength=300
> scanlengthdistribution=zipfian
> I ran this with 1 thread and with 10 threads. The summary is as follows:
> 1 thread:
> [SCAN] Operations 1000
> [SCAN] AverageLatency(ms) 35.871
> 10 threads:
> [SCAN] Operations 1000
> [SCAN] AverageLatency(ms) 228.576
> We are taking a 6.5x latency hit in our client. But why?
> The first step was to move deserialization out of the Connection thread. This
> seemed like it could be a big win: an analogous change on the server side gave
> a 20% performance improvement (already committed as HBASE-2941). I did this
> and got about a 20% improvement again, with that 228 ms number dropping to about
> 190 ms.
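> To illustrate the shape of that change, here is a minimal sketch, assuming a
> hypothetical RawCall class (the names are illustrative, not the real
> Call/Connection classes in HBaseClient): the Connection reader thread only
> copies the raw response bytes off the socket and hands them to the waiting
> caller, which then deserializes on its own thread.
>
> import java.nio.ByteBuffer;
>
> // Illustrative only: raw bytes are handed off by the reader thread and
> // deserialized by the thread that issued the RPC.
> class RawCall {
>   private ByteBuffer rawResponse;   // filled in by the Connection reader thread
>   private boolean done;
>
>   // Reader thread: copy bytes off the socket, but do NOT deserialize here.
>   synchronized void setRawResponse(ByteBuffer buf) {
>     rawResponse = buf;
>     done = true;
>     notifyAll();
>   }
>
>   // Calling thread: block until the bytes arrive, then deserialize on this
>   // thread, so a large response only costs the caller that asked for it.
>   synchronized ByteBuffer waitForRawResponse() throws InterruptedException {
>     while (!done) {
>       wait();
>     }
>     return rawResponse;
>   }
> }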
> So I then wrote a high-performance, nanosecond-resolution tracing utility.
> Clients can flag an API call, and we get traces and timings through the
> client pipeline. What I found is that a lot of time is spent
> receiving the response from the network. The code block looks like this:
> NanoProfiler.split(id, "receiveResponse");
> if (LOG.isDebugEnabled())
>   LOG.debug(getName() + " got value #" + id);
> Call call = calls.get(id);
> size -= 4;  // subtract the 4-byte id, which we already read
> ByteBuffer buf = ByteBuffer.allocate(size);
> // read the entire response payload off the socket on the Connection thread
> IOUtils.readFully(in, buf.array(), buf.arrayOffset(), size);
> buf.limit(size);
> buf.rewind();
> NanoProfiler.split(id, "setResponse", "Data size: " + size);
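> For reference, a minimal sketch of what such a tracing utility could look
> like (an assumption about its shape, not the actual NanoProfiler in the
> attached patch): per-call-id timings from System.nanoTime(), with named
> split points and an optional extra annotation such as the data size.
>
> import java.util.Map;
> import java.util.concurrent.ConcurrentHashMap;
>
> // Illustrative nanosecond-resolution tracer keyed by call id.
> final class NanoTrace {
>   private static final Map<Integer, Long> START = new ConcurrentHashMap<Integer, Long>();
>   private static final Map<Integer, Long> LAST = new ConcurrentHashMap<Integer, Long>();
>
>   static void start(int id) {
>     long now = System.nanoTime();
>     START.put(id, now);
>     LAST.put(id, now);
>   }
>
>   static void split(int id, String label) {
>     split(id, label, null);
>   }
>
>   // Prints the time since the previous split and since start(), both in ns,
>   // in the same shape as the numbers quoted below.
>   static void split(int id, String label, String extra) {
>     long now = System.nanoTime();
>     Long start = START.get(id);
>     Long last = LAST.put(id, now);
>     if (start == null || last == null) {
>       return;  // call was never flagged for tracing; ignore
>     }
>     System.out.println(id + " (" + label + ") split: " + (now - last)
>         + " overall: " + (now - start) + (extra == null ? "" : " " + extra));
>   }
> }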
> I came up with some numbers:
> 11726 (receiveResponse) split: 64991689 overall: 133562895 Data size: 4288937
> 12163 (receiveResponse) split: 32743954 overall: 103787420 Data size: 1606273
> 12561 (receiveResponse) split: 3517940 overall: 83346740 Data size: 4
> 12136 (receiveResponse) split: 64448701 overall: 203872573 Data size: 3570569
> The first number is the internal call id that keeps requests unique from
> HTable on down. The split and overall times are in ns; the data size is in bytes.
> Doing some simple calculations, we see that for the first line we were reading at
> about 31 MB/sec. The second one is even worse. Other calls look like:
> 26 (receiveResponse) split: 7985400 overall: 21546226 Data size: 850429
> which is 107 MB/sec, pretty close to the maximum of gigabit Ethernet. In my
> setup, the YCSB client ran on the master node and HAD to use the network to talk to
> regionservers.
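> As a sanity check on that arithmetic, the values below are the Data size (bytes)
> and the split time (ns) from the trace line just above; this is not part of any
> patch, it just reproduces the throughput calculation.
>
> public class ThroughputCheck {
>   public static void main(String[] args) {
>     long bytes = 850429L;    // Data size from the "26 (receiveResponse)" line
>     long nanos = 7985400L;   // split time, in ns, from the same line
>     double mbPerSec = (bytes / 1e6) / (nanos / 1e9);
>     System.out.printf("%.1f MB/sec%n", mbPerSec);  // prints ~106.5, i.e. roughly 107 MB/sec
>   }
> }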
> Even at full line rate, we could still see unacceptable hold-ups of unrelated
> calls that just happen to need to talk to the same regionserver.
> This issue is about these findings: what to do and how to improve.
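> To make the head-of-line-blocking point concrete, here is a minimal sketch of
> one possible direction (purely illustrative; it is not the attached
> HBASE-3382.txt or HBASE-3382-nio.txt patch): keep a small pool of connections
> per regionserver and spread calls across it, so one large scan response does
> not serialize every other caller behind it.
>
> import java.util.concurrent.atomic.AtomicInteger;
>
> // Illustrative only: round-robin over a small pool of connections to one
> // regionserver, instead of funnelling every call through a single socket.
> final class ConnectionPool<C> {
>   private final C[] connections;
>   private final AtomicInteger next = new AtomicInteger();
>
>   ConnectionPool(C[] connections) {
>     this.connections = connections;
>   }
>
>   // Each caller takes the "next" connection, so a large response on one
>   // socket only delays the callers actually sharing that socket.
>   C pick() {
>     int i = Math.floorMod(next.getAndIncrement(), connections.length);
>     return connections[i];
>   }
> }
>
> Whether the answer is multiple sockets per regionserver, multiplexed
> out-of-order responses on a single socket, or something else entirely is
> exactly the open question here.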
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)