[
https://issues.apache.org/jira/browse/HBASE-11355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14033389#comment-14033389
]
Liang Xie commented on HBASE-11355:
-----------------------------------
I don't have a normal 0.94 patch; it's a preliminary hack. Other hotspots
include: responseQueuesSizeThrottler, rpcMetrics, scannerReadPoints, etc.
The minor change to callQueue is shown below (we had separated the original
callQueue into a readCallQueue and a writeCallQueue):
{code}
- protected BlockingQueue<Call> readCallQueue; // read queued calls
+ protected List<BlockingQueue<Call>> readCallQueues; // read queued calls
...
- boolean success = readCallQueue.offer(call);
+ boolean success = readCallQueues.get(rand.nextInt(readHandlerCount)).offer(call);
...
- this.readCallQueue = new LinkedBlockingQueue<Call>(readQueueLength);
+ this.readHandlerCount = Math.round(readQueueRatio * handlerCount);
+ this.readCallQueues = new LinkedList<BlockingQueue<Call>>();
+ for (int i = 0; i < readHandlerCount; i++) {
+   readCallQueues.add(new LinkedBlockingQueue<Call>(readQueueLength));
+ }
{code}
Every handler thread consumes its own queue, which eliminates the severe
contention.
If correctness matters, or to limit resource consumption, another call-queue
sharding solution would be to introduce a separate queue-count setting (I just
reused the handler count for simplicity, to get a raw perf number) and to
always route all requests from the same client to the same queue.
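The client-affinity variant described above could be sketched as follows. This is a hypothetical illustration, not the actual HBase patch; the class and method names (`ShardedCallQueue`, `shardFor`, `queueFor`) are invented for this example. The idea is to replace the random shard pick with a hash of the client's address, so requests from the same client always land on the same queue while each handler thread still drains only its own shard:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical sketch of client-affinity call-queue sharding.
public class ShardedCallQueue<T> {
  private final List<BlockingQueue<T>> queues;

  public ShardedCallQueue(int numQueues, int queueLength) {
    queues = new ArrayList<>(numQueues);
    for (int i = 0; i < numQueues; i++) {
      queues.add(new LinkedBlockingQueue<T>(queueLength));
    }
  }

  // Deterministic shard choice: the same client address always maps
  // to the same queue, preserving per-client request ordering.
  public int shardFor(String clientAddress) {
    return (clientAddress.hashCode() & Integer.MAX_VALUE) % queues.size();
  }

  public boolean offer(String clientAddress, T call) {
    return queues.get(shardFor(clientAddress)).offer(call);
  }

  // Each handler thread consumes only its own shard, so handlers never
  // contend on a single shared queue lock.
  public BlockingQueue<T> queueFor(int handlerIndex) {
    return queues.get(handlerIndex % queues.size());
  }
}
```

Compared with `rand.nextInt(...)`, the hash keeps per-client ordering at the cost of possibly uneven load if one client dominates.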
> a couple of callQueue related improvements
> ------------------------------------------
>
> Key: HBASE-11355
> URL: https://issues.apache.org/jira/browse/HBASE-11355
> Project: HBase
> Issue Type: Improvement
> Components: IPC/RPC
> Affects Versions: 0.99.0, 0.94.20
> Reporter: Liang Xie
> Assignee: Matteo Bertozzi
>
> In one of my in-memory read-only tests (100% get requests), one of the top
> scalability bottlenecks was the single callQueue. Tentatively sharding
> this callQueue according to the RPC handler number showed a big throughput
> improvement (the original get() QPS was around 60k; after this change and
> other hotspot tuning, I got 220k get() QPS on the same single region server)
> in a YCSB read-only scenario.
> Another thing we can do is separate the queue into a read call queue and a
> write call queue. We have done this in our internal branch; it would be
> helpful in some outages, to avoid all-read or all-write requests exhausting
> all handler threads.
> One more thing is changing the current blocking behavior once the callQueue
> is full. A full callQueue almost always means the backend processing is slow
> somehow, so failing fast here would be more reasonable if we use HBase as a
> low-latency processing system. See "callQueue.put(call)".
--
This message was sent by Atlassian JIRA
(v6.2#6252)