[
https://issues.apache.org/jira/browse/HBASE-27947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17753784#comment-17753784
]
Bryan Beaudreault commented on HBASE-27947:
-------------------------------------------
[~zhangduo] Sorry, I need to do a little more verification around the
composeIntoComposite patch before you spend too much time on it. I started
writing some notes for you and realized that the composeIntoComposite issue
might have been more related to when I was trying to tune the
SslHandler.setWrapDataSize. I am going to dig into that more during my
workday (EST) tomorrow.
Here's the most up-to-date summary on the problem, and my next steps:
The key issue we are seeing is that ChannelOutboundBuffer is unable to flush
fast enough, and eventually builds up to an OOM. We can verify this with
metrics and heap dumps. You'd think if SSL slowness were the issue, we'd see
the buildup in SslHandler's pendingUnencryptedBytes, but we don't. So why is
ChannelOutboundBuffer flushing slower?
Here's what I can say with some certainty about why SslHandler causes a buildup
in ChannelOutboundBuffer: comparing async-profiler profiles with SSL enabled
and disabled, AbstractEpollStreamChannel.doWrite spends _way more_ time in
PoolArena.release when SSL is enabled. The majority (about 80%) of the
doWrite time is in PoolArena rather than socket writing. Without SSL enabled,
0% of the time is spent in PoolArena.
Here's my theory on why:
* For a 5mb response, our NettyRpcServerResponseEncoder writes a
CompositeByteBuf wrapping 80 64kb ByteBuffs from our own ByteBuffAllocator.
* Without SSL:
** The CompositeByteBuf is added right to the ChannelOutboundBuffer.
** The IO worker pulls from buffer and sends to the socket via
FileDescriptor.writevAddresses.
** *Since the backing ByteBuffs are from our own ByteBuffAllocator, this
process does not involve netty PoolArena*
** Once finished, our own callbacks release the buffers to our reservoir.
* With SSL:
** SslHandler tries to break the 5mb CompositeByteBuf into 16kb chunks, each
allocated from PoolArena (allocateOutNetBuf)
** So the 80 64kb ByteBuffs will be broken down into at least 300 netty
ByteBufs from the PoolArena
** Those smaller chunks are written onto ChannelOutboundBuffer individually
** When the IO worker pulls from the ChannelOutboundBuffer, instead of pulling
1 CompositeByteBuf it needs to pull 300+ *PooledUnsafeDirectByteBufs from the
PoolArena. Each of those ByteBufs must then be released back to the PoolArena.*
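Spelled out, here's the arithmetic behind those counts (sizes taken from the numbers above; the constants are just illustrative):

```java
public class SslAllocationMath {
    public static void main(String[] args) {
        final int RESPONSE = 5 * 1024 * 1024; // 5mb response
        final int BYTEBUFF = 64 * 1024;       // our ByteBuffAllocator buffer size
        final int TLS_RECORD = 16 * 1024;     // SslHandler wraps ~16kb per record

        int ourBuffs = RESPONSE / BYTEBUFF;    // buffers in the CompositeByteBuf
        int nettyBufs = RESPONSE / TLS_RECORD; // PoolArena allocations with SSL
        int arenaOps = nettyBufs * 2;          // each is allocated, then released

        System.out.println(ourBuffs);  // 80
        System.out.println(nettyBufs); // 320
        System.out.println(arenaOps);  // 640
    }
}
```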
*Summary:* Netty's PoolArena may be a highly optimized jemalloc
implementation, but we go from 0 PoolArena interactions without SSL to 600+
with it (300+ allocations plus 300+ releases). Releasing those buffers back to
the PoolArena takes the majority of the time when flushing the
ChannelOutboundBuffer.
*Next step:*
The majority of wasted time is in PoolArena.release for
AbstractEpollStreamChannel.doWrite, and PoolArena.allocate for SslHandler.wrap.
On the allocation front, it's largely allocateOutNetBuf. It seems like it would
be beneficial to allocate fewer, larger buffers. I tried tuning
SslHandler.setWrapDataSize, and that caused copyAndCompose to be called a ton
(hence the change to composeIntoComposite). It also doesn't actually seem to
work, because SSL will only wrap around 16kb of plaintext at a time, so most of
the extra wrapDataSize worth of data ends up getting added back onto the
bufferQueue and re-polled next time.
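To make that concrete, here's a rough back-of-the-envelope simulation of one pass (these numbers are my assumptions for illustration, not SslHandler's actual code): even if wrapDataSize is raised to 64kb, each wrap call only consumes about one TLS record's worth of plaintext, so the rest goes back on the queue.

```java
public class WrapDataSizeSketch {
    public static void main(String[] args) {
        final int MAX_PLAINTEXT = 16 * 1024; // TLS record plaintext limit
        int wrapDataSize = 64 * 1024;        // tuned via setWrapDataSize

        int pulled = wrapDataSize;                     // taken off the bufferQueue
        int wrapped = Math.min(pulled, MAX_PLAINTEXT); // one SSLEngine.wrap call
        int requeued = pulled - wrapped;               // back onto the bufferQueue

        System.out.println(wrapped);  // 16384
        System.out.println(requeued); // 49152, re-polled on the next pass
    }
}
```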
I'm wondering if we could improve SslHandler.wrap – pull the full response
buffer, allocate one large outNetBuf, then pass it to SSL with offsets/lengths
so we step through the larger buffer without extra allocations. We can
estimate how big an outNetBuf we need, and possibly allocate one extra if
there's overflow. I think this would be a lot more efficient since it means
fewer operations in PoolArena and the memory is contiguous.
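As a hypothetical sketch of that idea (the slicing loop and names here are my assumptions, not SslHandler code; a real implementation would call SSLEngine.wrap on each slice and size the destination with per-record overhead headroom):

```java
import java.nio.ByteBuffer;

public class OneShotWrapSketch {
    // Step through the source buffers in ~16kb record-sized slices, writing
    // everything into ONE pre-allocated outNetBuf instead of allocating a
    // fresh buffer per record. Here we just copy bytes where the real code
    // would invoke SSLEngine.wrap(slice, dst).
    static int wrapAll(ByteBuffer[] srcs, ByteBuffer dst) {
        final int MAX_RECORD = 16 * 1024; // SSL wraps ~16kb of plaintext per call
        int records = 0;
        for (ByteBuffer src : srcs) {
            while (src.hasRemaining()) {
                int n = Math.min(src.remaining(), MAX_RECORD);
                ByteBuffer slice = src.slice();
                slice.limit(n);
                dst.put(slice);                    // real impl: engine.wrap(...)
                src.position(src.position() + n);
                records++;
            }
        }
        return records;
    }

    public static void main(String[] args) {
        // 80 x 64kb buffers, like the 5mb CompositeByteBuf in the comment above
        ByteBuffer[] srcs = new ByteBuffer[80];
        for (int i = 0; i < srcs.length; i++) {
            srcs[i] = ByteBuffer.allocate(64 * 1024);
        }
        // one large outNetBuf, with some headroom for TLS record overhead
        ByteBuffer dst = ByteBuffer.allocate(5 * 1024 * 1024 + 64 * 1024);
        System.out.println(wrapAll(srcs, dst)); // 320 records, 1 allocation
    }
}
```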
-
> RegionServer OOM under load when TLS is enabled
> -----------------------------------------------
>
> Key: HBASE-27947
> URL: https://issues.apache.org/jira/browse/HBASE-27947
> Project: HBase
> Issue Type: Bug
> Components: rpc
> Affects Versions: 2.6.0
> Reporter: Bryan Beaudreault
> Assignee: Bryan Beaudreault
> Priority: Critical
>
> We are rolling out the server side TLS settings to all of our QA clusters.
> This has mostly gone fine, except on 1 cluster. Most clusters, including this
> one have a sampled {{nettyDirectMemory}} usage of about 30-100mb. This
> cluster tends to get bursts of traffic, in which case it would typically jump
> to 400-500mb. Again this is sampled, so it could have been higher than that.
> When we enabled SSL on this cluster, we started seeing bursts up to at least
> 4gb. This exceeded our {{-XX:MaxDirectMemorySize}}, which caused OOMs
> and general chaos on the cluster.
>
> We've gotten it under control a little bit by setting
> {{-Dorg.apache.hbase.thirdparty.io.netty.maxDirectMemory}} and
> {{-Dorg.apache.hbase.thirdparty.io.netty.tryReflectionSetAccessible}}.
> We've set netty's maxDirectMemory to be approximately equal to
> ({{-XX:MaxDirectMemorySize - BucketCacheSize - ReservoirSize}}). Now we
> are seeing netty's own OutOfDirectMemoryError, which is still causing pain
> for clients but at least insulates the other components of the regionserver.
>
> We're still digging into exactly why this is happening. The cluster clearly
> has a bad access pattern, but it doesn't seem like SSL should increase the
> memory footprint by 5-10x like we're seeing.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)