[
https://issues.apache.org/jira/browse/HBASE-15756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
stack updated HBASE-15756:
--------------------------
    Attachment: gc.png
                gets.png
Here are a few graphs, [~aoxiang].

As is, we are slower: 120k vs ~105k with the patch as is. Looking at a thread
dump, the nioEventLoopGroup for workers is spinning up lots of threads... 40
or 50? To keep the comparison apples to apples, I again put a bound on the
threads created, making the netty worker count equal to the reader count. When
I do this, I get closer: 120k vs 115k or so. I then set
hbase.rpc.server.nativetransport to true so we use the alternative epoll
transport, and then I get almost the same: 120k vs ~119k.
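Roughly what I mean by bounding the workers and flipping to native epoll; this
is a sketch only, not the patch itself -- the factory class name is made up,
and the only config keys assumed are the existing reader-count key and the
hbase.rpc.server.nativetransport flag mentioned above:

import io.netty.channel.EventLoopGroup;
import io.netty.channel.epoll.Epoll;
import io.netty.channel.epoll.EpollEventLoopGroup;
import io.netty.channel.epoll.EpollServerSocketChannel;
import io.netty.channel.nio.NioEventLoopGroup;
import io.netty.channel.socket.ServerSocketChannel;
import io.netty.channel.socket.nio.NioServerSocketChannel;
import org.apache.hadoop.conf.Configuration;

public class NettyWorkerGroupFactory {
  // Bound netty workers to the old RpcServer reader count so thread counts match.
  public static EventLoopGroup workerGroup(Configuration conf) {
    int workers = conf.getInt("hbase.ipc.server.read.threadpool.size", 10);
    boolean nativeTransport = conf.getBoolean("hbase.rpc.server.nativetransport", false);
    return nativeTransport && Epoll.isAvailable()
        ? new EpollEventLoopGroup(workers)
        : new NioEventLoopGroup(workers);
  }

  // Pick the server channel class that matches the chosen transport.
  public static Class<? extends ServerSocketChannel> serverChannel(Configuration conf) {
    boolean nativeTransport = conf.getBoolean("hbase.rpc.server.nativetransport", false);
    return nativeTransport && Epoll.isAvailable()
        ? EpollServerSocketChannel.class
        : NioServerSocketChannel.class;
  }
}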
Looking at the thread stack, I see this:
3683 "epollEventLoopGroup-3-5" #40 prio=10 os_prio=0 tid=0x000000000260f520
nid=0xe09b runnable [0x00007f5ac437e000]
3684 java.lang.Thread.State: RUNNABLE
3685 at sun.misc.Cleaner.add(Cleaner.java:79)
3686 - locked <0x00007f5bde303070> (a java.lang.Class for sun.misc.Cleaner)
3687 at sun.misc.Cleaner.create(Cleaner.java:133)
3688 at java.nio.DirectByteBuffer.<init>(DirectByteBuffer.java:139)
3689 at java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:311)
3690 at
io.netty.buffer.UnpooledUnsafeDirectByteBuf.allocateDirect(UnpooledUnsafeDirectByteBuf.java:108)
3691 at
io.netty.buffer.UnpooledUnsafeDirectByteBuf.<init>(UnpooledUnsafeDirectByteBuf.java:69)
3692 at
io.netty.buffer.UnpooledByteBufAllocator.newDirectBuffer(UnpooledByteBufAllocator.java:50)
3693 at
io.netty.buffer.AbstractByteBufAllocator.directBuffer(AbstractByteBufAllocator.java:155)
3694 at
io.netty.buffer.AbstractByteBufAllocator.directBuffer(AbstractByteBufAllocator.java:146)
3695 at
io.netty.buffer.AbstractByteBufAllocator.ioBuffer(AbstractByteBufAllocator.java:107)
3696 at
io.netty.channel.AdaptiveRecvByteBufAllocator$HandleImpl.allocate(AdaptiveRecvByteBufAllocator.java:104)
3697 at
io.netty.channel.epoll.EpollSocketChannel$EpollSocketUnsafe.epollInReady(EpollSocketChannel.java:712)
3698 at
io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:326)
3699 at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:264)
3700 at
io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:116)
3701 at
io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:137)
3702 at java.lang.Thread.run(Thread.java:745)
We seem to be doing a direct allocation on each read. That will slow us down
(and also explains the slightly higher GC time). [~appy] and I messed around
trying to use a buffer pool by enabling this...
bootstrap.option(ChannelOption.ALLOCATOR, PooledByteBufAllocator.DEFAULT);
... and poking at the code, but our server hangs. I can dig in more, but
thought I'd ask you first, since you are probably coming online about now:
why did you have the above commented out?
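For the record, this is roughly how we tried to wire it up -- a sketch, assuming
the ServerBootstrap that NettyRpcServer builds; setting the allocator on the
child channels as well (since that is where the reads happen) is our guess, not
something from your patch, and the helper class name is made up:

import io.netty.bootstrap.ServerBootstrap;
import io.netty.buffer.PooledByteBufAllocator;
import io.netty.channel.ChannelOption;

final class AllocatorSetup {
  // Use netty's pooled allocator for both the accept channel and the accepted
  // child channels so per-read buffers come out of the pool instead of a fresh
  // DirectByteBuffer allocation (the sun.misc.Cleaner.add in the stack above).
  static void usePooledAllocator(ServerBootstrap bootstrap) {
    bootstrap.option(ChannelOption.ALLOCATOR, PooledByteBufAllocator.DEFAULT);
    bootstrap.childOption(ChannelOption.ALLOCATOR, PooledByteBufAllocator.DEFAULT);
  }
}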
Hopefully we can plug our own allocator in here... one that uses
[~anoop.hbase]'s fixed-size pool of buffers... hmmm... though that may take
some work. We'll see. Anyway, any thoughts on the above, [~aoxiang], are
appreciated. If we can make netty as fast -- or faster -- and make it play
nicely with the offheaping of the write path, let's slot it in. Thanks.
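To help the discussion along, here is the shape of the plug-in point I am
imagining for a custom allocator. This is NOT [~anoop.hbase]'s fixed-size
buffer pool, just an illustration that we can hand netty any ByteBufAllocator
we like; the class name and the arena/page numbers below are made up:

import io.netty.bootstrap.ServerBootstrap;
import io.netty.buffer.ByteBufAllocator;
import io.netty.buffer.PooledByteBufAllocator;
import io.netty.channel.ChannelOption;

final class RpcServerAllocator {
  // Direct-only pooled allocator: no heap arenas, a couple of direct arenas,
  // 8k pages, maxOrder 11 (16MB chunks). Numbers are illustrative only; a real
  // fixed-size pool would likely be its own ByteBufAllocator implementation.
  static final ByteBufAllocator POOL =
      new PooledByteBufAllocator(true, 0, 2, 8192, 11);

  static void apply(ServerBootstrap bootstrap) {
    bootstrap.childOption(ChannelOption.ALLOCATOR, POOL);
  }
}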
> Pluggable RpcServer
> -------------------
>
> Key: HBASE-15756
> URL: https://issues.apache.org/jira/browse/HBASE-15756
> Project: HBase
> Issue Type: Improvement
> Components: Performance, rpc
> Reporter: binlijin
> Assignee: binlijin
> Priority: Critical
> Attachments: Netty4RpcServer_forperf.patch, NettyRpcServer.patch,
> NettyRpcServer_forperf.patch, gc.png, gets.png, gets.png, idle.png, queue.png
>
>
> Currently we use a simple RpcServer and cannot configure or use another
> implementation. This issue is to make the RpcServer pluggable, so we can
> provide other implementations, for example a netty rpc server. Patch will be
> uploaded later.