[
https://issues.apache.org/jira/browse/HBASE-12684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14265184#comment-14265184
]
stack commented on HBASE-12684:
-------------------------------
The async rpc client is running slower, with about 15% less throughput (see below
for more detail). When sampled, the async client is either awaiting on a
future/promise or allocating a direct buffer. Can't you use a netty ByteBuf pool?
(Direct buffer allocation is slow.) Here is a sample thread stack showing the
allocation:
{code}
"TestClient-3" #100 prio=5 os_prio=0 tid=0x00007f2892193800 nid=0xe3d runnable
[0x00007f286ad37000]
java.lang.Thread.State: RUNNABLE
at sun.misc.Unsafe.allocateMemory(Native Method)
at java.nio.DirectByteBuffer.<init>(DirectByteBuffer.java:127)
at java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:311)
at
io.netty.buffer.UnpooledUnsafeDirectByteBuf.allocateDirect(UnpooledUnsafeDirectByteBuf.java:108)
at
io.netty.buffer.UnpooledUnsafeDirectByteBuf.<init>(UnpooledUnsafeDirectByteBuf.java:69)
at
io.netty.buffer.UnpooledByteBufAllocator.newDirectBuffer(UnpooledByteBufAllocator.java:50)
at
io.netty.buffer.AbstractByteBufAllocator.directBuffer(AbstractByteBufAllocator.java:155)
at
io.netty.buffer.AbstractByteBufAllocator.directBuffer(AbstractByteBufAllocator.java:146)
at
org.apache.hadoop.hbase.ipc.AsyncRpcChannel.writeRequest(AsyncRpcChannel.java:427)
at
org.apache.hadoop.hbase.ipc.AsyncRpcChannel.callMethod(AsyncRpcChannel.java:323)
at
org.apache.hadoop.hbase.ipc.AsyncRpcChannel.callMethodWithPromise(AsyncRpcChannel.java:345)
at
org.apache.hadoop.hbase.ipc.AsyncRpcClient.call(AsyncRpcClient.java:155)
at
org.apache.hadoop.hbase.ipc.AbstractRpcClient.callBlockingMethod(AbstractRpcClient.java:213)
at
org.apache.hadoop.hbase.ipc.AbstractRpcClient$BlockingRpcChannelImplementation.callBlockingMethod(AbstractRpcClient.java:287)
at
org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$BlockingStub.get(ClientProtos.java:30860)
at org.apache.hadoop.hbase.client.HTable$4.call(HTable.java:873)
at org.apache.hadoop.hbase.client.HTable$4.call(HTable.java:864)
at
org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:126)
at org.apache.hadoop.hbase.client.HTable.get(HTable.java:881)
at
org.apache.hadoop.hbase.PerformanceEvaluation$RandomReadTest.testRow(PerformanceEvaluation.java:1253)
at
org.apache.hadoop.hbase.PerformanceEvaluation$Test.testTimed(PerformanceEvaluation.java:1039)
at
org.apache.hadoop.hbase.PerformanceEvaluation$Test.test(PerformanceEvaluation.java:1021)
at
org.apache.hadoop.hbase.PerformanceEvaluation.runOneClient(PerformanceEvaluation.java:1515)
at
org.apache.hadoop.hbase.PerformanceEvaluation$1.call(PerformanceEvaluation.java:408)
at
org.apache.hadoop.hbase.PerformanceEvaluation$1.call(PerformanceEvaluation.java:403)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
{code}
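To make the bytebuf-pool suggestion concrete, here is a minimal sketch (not the actual AsyncRpcChannel code; the buffer size and usage are made up for illustration) of taking the request buffer from netty's pooled allocator instead of an unpooled one, so the Unsafe.allocateMemory hit above is amortized across requests:
{code}
import io.netty.buffer.ByteBuf;
import io.netty.buffer.ByteBufAllocator;
import io.netty.buffer.PooledByteBufAllocator;

public class PooledAllocatorSketch {
  public static void main(String[] args) {
    // Pooled allocator: direct buffers are drawn from arenas and reused,
    // so the raw direct-memory allocation cost is not paid on every request.
    ByteBufAllocator pooled = PooledByteBufAllocator.DEFAULT;

    ByteBuf request = pooled.directBuffer(4096); // hypothetical request size
    try {
      // ... serialize the rpc header + message into 'request' and write it out ...
      request.writeBytes(new byte[] {1, 2, 3});
    } finally {
      request.release(); // returns the buffer to the pool instead of freeing it
    }
  }
}
{code}
The same thing can be done channel-wide by setting ChannelOption.ALLOCATOR to PooledByteBufAllocator.DEFAULT on the client Bootstrap, so channel.alloc()/ctx.alloc() hand out pooled buffers without changing the write path.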
I wrote 100M rows with cell sizes varying between 0 and 8k bytes, which made for
about 30 regions. I hosted all 30 regions on one server, and on another machine I
ran 5 clients in a single process as follows:
export HADOOP_CLASSPATH="$HOME/conf_hbase:`/home/stack/hbase/bin/hbase classpath`"
perf stat ${HOME}/hadoop/bin/hadoop --config ${HOME}/conf_hadoop org.apache.hadoop.hbase.PerformanceEvaluation --nomapred --rows=1000000 randomRead 5 &> /tmp/$1.log
Nearly all requests are served from memory. The async client may do slightly less
GC.
Here are the summary perf stats for the two client runs... they are close enough:
NOT-async:
{code}
     695310.062208 task-clock                #    1.487 CPUs utilized
        15,534,081 context-switches          #    0.022 M/sec
           810,178 CPU-migrations            #    0.001 M/sec
            84,042 page-faults               #    0.121 K/sec
 1,092,725,117,827 cycles                    #    1.572 GHz                     [83.35%]
   884,126,540,362 stalled-cycles-frontend   #   80.91% frontend cycles idle    [83.30%]
   620,320,998,946 stalled-cycles-backend    #   56.77% backend cycles idle     [66.75%]
   375,414,284,865 instructions              #    0.34  insns per cycle
                                             #    2.36  stalled cycles per insn [83.43%]
    70,798,813,916 branches                  #  101.823 M/sec                   [83.29%]
     3,979,009,879 branch-misses             #    5.62% of all branches         [83.31%]

     467.591170612 seconds time elapsed
{code}
ASYNC
{code}
     820257.269023 task-clock                #    1.496 CPUs utilized
        15,698,021 context-switches          #    0.019 M/sec
           810,751 CPU-migrations            #    0.988 K/sec
           104,956 page-faults               #    0.128 K/sec
 1,283,426,193,745 cycles                    #    1.565 GHz                     [83.40%]
 1,028,791,974,070 stalled-cycles-frontend   #   80.16% frontend cycles idle    [83.31%]
   716,493,564,615 stalled-cycles-backend    #   55.83% backend cycles idle     [66.65%]
   464,884,412,385 instructions              #    0.36  insns per cycle
                                             #    2.21  stalled cycles per insn [83.32%]
    88,796,998,635 branches                  #  108.255 M/sec                   [83.36%]
     4,714,422,320 branch-misses             #    5.31% of all branches         [83.29%]

     548.454008573 seconds time elapsed
{code}
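For a rough cross-check of the ~15% figure: the async run took 548.5 seconds of wall-clock time against 467.6 for the blocking client on the same workload, i.e. about 17% longer, which works out to roughly 467.6 / 548.5 ≈ 85% of the blocking client's throughput.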
> Add new AsyncRpcClient
> ----------------------
>
> Key: HBASE-12684
> URL: https://issues.apache.org/jira/browse/HBASE-12684
> Project: HBase
> Issue Type: Improvement
> Components: Client
> Reporter: Jurriaan Mous
> Assignee: Jurriaan Mous
> Attachments: HBASE-12684-DEBUG2.patch, HBASE-12684-DEBUG3.patch,
> HBASE-12684-v1.patch, HBASE-12684-v10.patch, HBASE-12684-v11.patch,
> HBASE-12684-v12.patch, HBASE-12684-v13.patch, HBASE-12684-v14.patch,
> HBASE-12684-v15.patch, HBASE-12684-v16.patch, HBASE-12684-v17.patch,
> HBASE-12684-v17.patch, HBASE-12684-v18.patch, HBASE-12684-v19.1.patch,
> HBASE-12684-v19.patch, HBASE-12684-v19.patch, HBASE-12684-v2.patch,
> HBASE-12684-v3.patch, HBASE-12684-v4.patch, HBASE-12684-v5.patch,
> HBASE-12684-v6.patch, HBASE-12684-v7.patch, HBASE-12684-v8.patch,
> HBASE-12684-v9.patch, HBASE-12684.patch
>
>
> With the changes in HBASE-12597 it is possible to add new RpcClients. This
> issue is about adding a new async RpcClient which would enable HBase to do
> non-blocking protobuf service communication.
> Besides delivering a new AsyncRpcClient, I would also like to ask what it
> would take to replace the current RpcClient. That would make it possible to
> simplify the async code in some follow-up issues.