[
https://issues.apache.org/jira/browse/HBASE-27947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17753784#comment-17753784
]
Bryan Beaudreault commented on HBASE-27947:
-------------------------------------------
[~zhangduo] Sorry, I need to do a little more verification around the
composeIntoComposite patch before you spend too much time on it. I started
writing some notes for you and realized that the composeIntoComposite issue
might have been more related to when I was trying to tune the
SslHandler.setWrapDataSize. I am going to dig into that more during my
workday (EST) tomorrow.
Here's the most up-to-date summary on the problem, and my next steps:
The key issue we are seeing is that ChannelOutboundBuffer is unable to flush
fast enough, and eventually builds up to an OOM. We can verify this with
metrics and heap dumps. You'd think if SSL slowness were the issue, we'd see
the buildup in SslHandler's pendingUnencryptedBytes, but we don't. So why is
ChannelOutboundBuffer flushing slower?
Here's what I can say with some certainty about why SslHandler causes a buildup
in ChannelOutboundBuffer: comparing async-profiler profiles with SSL enabled
and disabled, AbstractEpollStreamChannel.doWrite spends _way more_ time in
PoolArena.release when SSL is enabled. The majority (about 80%) of the
doWrite time is in PoolArena rather than socket writing. Without SSL enabled,
0% of the time is spent in PoolArena.
Here's my theory on why:
* For a 5mb response, our NettyRpcServerResponseEncoder writes a
CompositeByteBuf wrapping 80 64kb ByteBuffs from our own ByteBuffAllocator.
* Without SSL:
** The CompositeByteBuf is added right to the ChannelOutboundBuffer.
** The IO worker pulls from buffer and sends to the socket via
FileDescriptor.writevAddresses.
** *Since the backing ByteBuffs are from our own ByteBuffAllocator, this
process does not involve netty PoolArena*
** Once finished, our own callbacks release the buffers to our reservoir.
* With SSL:
** SslHandler tries to break the 5mb CompositeByteBuf into 16kb chunks, each
allocated from PoolArena (allocateOutNetBuf)
** So the 80 64kb ByteBuffs will be broken down into at least 300 netty
ByteBufs from the PoolArena
** Those smaller chunks are written onto ChannelOutboundBuffer individually
** When the IO worker pulls from the ChannelOutboundBuffer, instead of pulling
1 CompositeByteBuf it needs to pull 300+ *PooledUnsafeDirectByteBufs from the
PoolArena. Each of those ByteBufs must then be released back to the PoolArena.*
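Spelled out, here's the arithmetic behind those counts (sizes taken from the numbers above; the constants are just illustrative):

```java
public class SslAllocationMath {
    public static void main(String[] args) {
        final int RESPONSE = 5 * 1024 * 1024; // 5mb response
        final int BYTEBUFF = 64 * 1024;       // our ByteBuffAllocator buffer size
        final int TLS_RECORD = 16 * 1024;     // SslHandler wraps ~16kb per record

        int ourBuffs = RESPONSE / BYTEBUFF;    // buffers in the CompositeByteBuf
        int nettyBufs = RESPONSE / TLS_RECORD; // PoolArena allocations with SSL
        int arenaOps = nettyBufs * 2;          // each is allocated, then released

        System.out.println(ourBuffs);  // 80
        System.out.println(nettyBufs); // 320
        System.out.println(arenaOps);  // 640
    }
}
```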
*Summary:* Netty's PoolArena may be a highly optimized jemalloc
implementation, but we go from 0 PoolArena interactions without SSL to 600+
with it (300+ allocations plus 300+ releases). Releasing those buffers back to
the PoolArena takes the majority of the time when flushing the
ChannelOutboundBuffer.
*Next step:*
The majority of wasted time is in PoolArena.release for
AbstractEpollStreamChannel.doWrite, and PoolArena.allocate for SslHandler.wrap.
On the allocation front, it's largely allocateOutNetBuf. It seems like it would
be beneficial to allocate fewer, larger buffers. I tried tuning
SslHandler.setWrapDataSize, and that caused copyAndCompose to be called a ton
(hence the change to composeIntoComposite). It also doesn't actually seem to
work, because SSL will only wrap around 16kb of plaintext at a time, so most of
the extra wrapDataSize worth of data ends up getting added back onto the
bufferQueue and re-polled next time.
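To make that concrete, here's a rough back-of-the-envelope simulation of one pass (these numbers are my assumptions for illustration, not SslHandler's actual code): even if wrapDataSize is raised to 64kb, each wrap call only consumes about one TLS record's worth of plaintext, so the rest goes back on the queue.

```java
public class WrapDataSizeSketch {
    public static void main(String[] args) {
        final int MAX_PLAINTEXT = 16 * 1024; // TLS record plaintext limit
        int wrapDataSize = 64 * 1024;        // tuned via setWrapDataSize

        int pulled = wrapDataSize;                     // taken off the bufferQueue
        int wrapped = Math.min(pulled, MAX_PLAINTEXT); // one SSLEngine.wrap call
        int requeued = pulled - wrapped;               // back onto the bufferQueue

        System.out.println(wrapped);  // 16384
        System.out.println(requeued); // 49152, re-polled on the next pass
    }
}
```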
I'm wondering if we could improve SslHandler.wrap – pull the full response
buffer, allocate one large outNetBuf, then pass it to SSL with offsets/lengths
so we step through the larger buffer without extra allocations. We can
estimate how big an outNetBuf we need, and possibly allocate one extra if
there's overflow. I think this would be a lot more efficient since it means
fewer operations in PoolArena and the memory is contiguous.
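As a hypothetical sketch of that idea (the slicing loop and names here are my assumptions, not SslHandler code; a real implementation would call SSLEngine.wrap on each slice and size the destination with per-record overhead headroom):

```java
import java.nio.ByteBuffer;

public class OneShotWrapSketch {
    // Step through the source buffers in ~16kb record-sized slices, writing
    // everything into ONE pre-allocated outNetBuf instead of allocating a
    // fresh buffer per record. Here we just copy bytes where the real code
    // would invoke SSLEngine.wrap(slice, dst).
    static int wrapAll(ByteBuffer[] srcs, ByteBuffer dst) {
        final int MAX_RECORD = 16 * 1024; // SSL wraps ~16kb of plaintext per call
        int records = 0;
        for (ByteBuffer src : srcs) {
            while (src.hasRemaining()) {
                int n = Math.min(src.remaining(), MAX_RECORD);
                ByteBuffer slice = src.slice();
                slice.limit(n);
                dst.put(slice);                    // real impl: engine.wrap(...)
                src.position(src.position() + n);
                records++;
            }
        }
        return records;
    }

    public static void main(String[] args) {
        // 80 x 64kb buffers, like the 5mb CompositeByteBuf in the comment above
        ByteBuffer[] srcs = new ByteBuffer[80];
        for (int i = 0; i < srcs.length; i++) {
            srcs[i] = ByteBuffer.allocate(64 * 1024);
        }
        // one large outNetBuf, with some headroom for TLS record overhead
        ByteBuffer dst = ByteBuffer.allocate(5 * 1024 * 1024 + 64 * 1024);
        System.out.println(wrapAll(srcs, dst)); // 320 records, 1 allocation
    }
}
```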
-
> RegionServer OOM under load when TLS is enabled
> -----------------------------------------------
>
> Key: HBASE-27947
> URL: https://issues.apache.org/jira/browse/HBASE-27947
> Project: HBase
> Issue Type: Bug
> Components: rpc
> Affects Versions: 2.6.0
> Reporter: Bryan Beaudreault
> Assignee: Bryan Beaudreault
> Priority: Critical
>
> We are rolling out the server side TLS settings to all of our QA clusters.
> This has mostly gone fine, except on 1 cluster. Most clusters, including this
> one have a sampled {{nettyDirectMemory}} usage of about 30-100mb. This
> cluster tends to get bursts of traffic, in which case it would typically jump
> to 400-500mb. Again this is sampled, so it could have been higher than that.
> When we enabled SSL on this cluster, we started seeing bursts up to at least
> 4gb. This exceeded our {{-XX:MaxDirectMemorySize}}, which caused OOMs
> and general chaos on the cluster.
>
> We've gotten it under control a little bit by setting
> {{-Dorg.apache.hbase.thirdparty.io.netty.maxDirectMemory}} and
> {{-Dorg.apache.hbase.thirdparty.io.netty.tryReflectionSetAccessible}}.
> We've set netty's maxDirectMemory to be approximately equal to
> ({{-XX:MaxDirectMemorySize - BucketCacheSize - ReservoirSize}}). Now we
> are seeing netty's own OutOfDirectMemoryError, which is still causing pain
> for clients but at least insulates the other components of the regionserver.
>
> We're still digging into exactly why this is happening. The cluster clearly
> has a bad access pattern, but it doesn't seem like SSL should increase the
> memory footprint by 5-10x like we're seeing.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)