Hi William,

When the log entry size is large, GrpcLogAppender is known to have high
memory usage.
As you described, the gRPC library allocates a buffer for each
appendEntries call, including retries.
It seems that we don't have any control over this part.
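As a rough back-of-envelope sketch (illustrative numbers only, not measured
values), the direct memory pressure scales with the entry size, the number of
followers, and the number of in-flight appendEntries calls:

```java
// Back-of-envelope estimate: each in-flight appendEntries call serializes
// its own copy of the entry, so direct memory usage scales roughly with
// entry size x followers x pipelined calls per follower.
public class AppendEntriesMemoryEstimate {
  static long estimateBytes(long entrySizeBytes, int followers, int inflightPerFollower) {
    return entrySizeBytes * followers * inflightPerFollower;
  }

  public static void main(String[] args) {
    long entry = 4L * 1024 * 1024; // a 4 MB log entry
    int followers = 4;             // 2n followers in a 2n+1 group (n = 2)
    int inflight = 8;              // pipelined/retried calls per follower
    System.out.println(estimateBytes(entry, followers, inflight)); // 134217728 (128 MB)
  }
}
```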

A long-term solution is to use Streaming to implement a LogAppender; see
[7].

As a workaround, you may consider using Streaming to write data in your
application.
Then, the log entries contain only metadata.  This is what we are doing in
Apache Ozone [8].
We got very good performance -- Streaming is around 2 times faster
than Async for a single client
and 3 times faster for 3 clients; see "20220702_Single Client - Benchmarks
for Ozone Streaming.pdf"
and "20220702_3 Clients - Benchmarks for Ozone Streaming.pdf" in [8].
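To illustrate the idea (a minimal sketch with hypothetical class and field
names, not the actual Ozone implementation): the payload goes through the
streaming channel, and the Raft log entry carries only a small fixed-size
metadata record, so the appendEntries buffers stay small no matter how large
the application's write is.

```java
import java.nio.ByteBuffer;

// Hypothetical sketch: the application streams the payload out of band,
// so the replicated log entry carries only fixed-size metadata.
public class MetadataOnlyEntry {
  final long streamId; // identifies the out-of-band stream
  final long offset;   // where this write starts in the stream
  final int length;    // payload size written via streaming

  MetadataOnlyEntry(long streamId, long offset, int length) {
    this.streamId = streamId;
    this.offset = offset;
    this.length = length;
  }

  // 20 bytes regardless of payload size, so the appendEntries buffer
  // no longer grows with the application's write size.
  ByteBuffer serialize() {
    ByteBuffer b = ByteBuffer.allocate(2 * Long.BYTES + Integer.BYTES);
    b.putLong(streamId).putLong(offset).putInt(length);
    b.flip();
    return b;
  }

  public static void main(String[] args) {
    int payload = 4 * 1024 * 1024; // a 4 MB application write
    MetadataOnlyEntry e = new MetadataOnlyEntry(1L, 0L, payload);
    System.out.println(e.serialize().remaining()); // 20
  }
}
```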

Ratis Streaming also gives us other benefits such as reducing the workload
of the leader.
For more information, see RatisStreaming20200929.pdf in [9].

Tsz-Wo
[7] https://issues.apache.org/jira/browse/RATIS-1310
[8] https://issues.apache.org/jira/browse/HDDS-4454
[9] https://issues.apache.org/jira/browse/RATIS-979


On Sat, Jul 2, 2022 at 7:21 AM William Song <[email protected]> wrote:

> Thanks for the detailed explanation -- now I can say I’m sure that there is
> no direct memory leak in NettyStreamImpl. I suspect the DirectOOM we
> encountered a few days ago was caused by a real shortage of direct memory.
>
> The GrpcLogAppender uses Netty as its underlying network channel, and
> Netty allocates a direct ByteBuffer for each AppendEntries RPC. When log
> entries each contain megabytes of data and arrive at the Raft leader at
> high throughput, Netty’s pooled direct buffers cannot be freed as fast as
> they are allocated, which leads to a steady increase in direct memory
> usage and eventually OOM.
>
> To make things worse, if a group contains 2n+1 members, the leader has to
> send each entry 2n times via AppendEntries, which multiplies direct memory
> usage by a factor of 2n. There will be only 1 copy in the heap but 2n
> copies in direct memory.
>
> What can we do to reduce the direct memory usage? Is it possible to let
> the 2n AppendEntries RPCs share the same direct ByteBuffer?
>
> PS: the call stack of AppendEntry SendRequest
>
> <screenshot removed>
>
> Best Wishes,
> William Song
>
>
>
> On Jul 1, 2022 at 05:14, Tsz Wo Sze <[email protected]> wrote:
>
> ...  The corresponding release() is in DataStreamManagement.read(..); see
>
> [2].
>
> Oops, sorry that [2] is for releasing the buf in cases when there is an
> exception.  (I somehow saw the catch-block as a finally-block.  My
> mistake!)  For the normal case, the release is in [6].
>
> 5. Once readImpl(..) has returned, buf is released in [2].
>
>
> This Step #5 described previously is also incorrect.  Even after
> readImpl(..) has returned, buf may not be released, since the local and
> remote writes in Step #4 are async.  Therefore, it has to wait for all the
> local and remote futures to complete and only then release buf in [6].
> Note that [6] is inside whenComplete(..) in the code, which covers all the
> local and remote futures.
>
> Hope that I did not confuse you too much.
> Tsz-Wo
>
> [6]
>
> https://github.com/apache/ratis/blob/75af736d0d20c44b5d47a458b0c7b33387a75a6d/ratis-netty/src/main/java/org/apache/ratis/netty/server/DataStreamManagement.java#L444
>
> On Sun, Jun 26, 2022 at 10:52 AM Tsz Wo Sze <[email protected]> wrote:
>
> Hi William,
>
> Thanks a lot for checking the code.
>
> retain() is called here, but there seems to be no corresponding release().
>
>
> It is necessary to call retain() since buf will be used after decode(..).
> Otherwise, buf will automatically be released as mentioned in [1].  The
> corresponding release() is in DataStreamManagement.read(..); see [2].
>
> The steps are
> 1. Netty is reading a byte stream
> 2. It calls ByteToMessageDecoder.decode(.., List<Object> out) [3]  to
> decode the byte stream and then adds the decoded objects to the out list.
> 3. In our case, the decoded objects are DataStreamRequestByteBuf and the
> ByteBuf is retained as DataStreamRequestByteBuf.buf.
> 4. Later on, the same DataStreamRequestByteBuf.buf is used for writing
> locally and writing to the remote peers in readImpl(..) [4].
> 5. Once readImpl(..) has returned, buf is released in [2].
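>
> The retain/release lifecycle above (with the Step #5 correction) can be
> sketched with a toy reference-count model (simplified stand-ins, not
> Netty's actual ByteBuf API):

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicInteger;

// Toy model of the decode/retain/release steps above
// (simplified stand-ins, not Netty's actual ByteBuf API).
public class RefCountLifecycle {
  static class Buf {
    final AtomicInteger refCnt = new AtomicInteger(1); // Netty buffers start at refCnt 1
    void retain()  { refCnt.incrementAndGet(); }
    void release() { refCnt.decrementAndGet(); }
  }

  public static void main(String[] args) {
    Buf buf = new Buf();
    buf.retain(); // Step 3: the decoder retains buf so it outlives decode(..)

    // Step 4: local and remote writes are async.
    CompletableFuture<Void> local  = CompletableFuture.completedFuture(null);
    CompletableFuture<Void> remote = CompletableFuture.completedFuture(null);

    // Corrected Step 5: release only after ALL the futures complete,
    // mirroring the whenComplete(..) in [6].
    CompletableFuture.allOf(local, remote).whenComplete((v, e) -> buf.release());

    buf.release(); // the pipeline drops its own reference after decode(..)
    System.out.println(buf.refCnt.get()); // 0 => no leak
  }
}
```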
>
> If there is a suspicion of a memory leak, could you try
> -Dio.netty.leakDetection.level=ADVANCED [5]?
>
> Regards,
> Tsz-Wo
>
>
> [1]
> https://stackoverflow.com/questions/52883044/netty-4-0-bytebuf-retain-use
> [2]
>
> https://github.com/apache/ratis/blob/75af736d0d20c44b5d47a458b0c7b33387a75a6d/ratis-netty/src/main/java/org/apache/ratis/netty/server/DataStreamManagement.java#L385
> [3]
>
> https://github.com/apache/ratis/blob/75af736d0d20c44b5d47a458b0c7b33387a75a6d/ratis-netty/src/main/java/org/apache/ratis/netty/server/NettyServerStreamRpc.java#L243
> [4]
>
> https://github.com/apache/ratis/blob/75af736d0d20c44b5d47a458b0c7b33387a75a6d/ratis-netty/src/main/java/org/apache/ratis/netty/server/DataStreamManagement.java#L390
> [5]
> https://stackoverflow.com/questions/43676085/how-to-test-netty-memory-leak
>
> On Sun, Jun 26, 2022 at 5:00 AM William Song <[email protected]> wrote:
>
> Hi,
>
> We discovered a direct memory OOM after running a program with Ratis for a
> long time. Through a heap dump, we found that the DirectByteBuffers held by
> ratis.thirdparty.netty keep increasing.
>
> I discovered the following in NettyServerStreamRpc.java
> (newChannelInitializer -> newDecoder -> decodeDataStreamRequestByteBuf):
>
> static DataStreamRequestByteBuf decodeDataStreamRequestByteBuf(ByteBuf buf) {
>   return Optional.ofNullable(decodeDataStreamRequestHeader(buf))
>       .map(header -> checkHeader(header, buf))
>       .map(header -> new DataStreamRequestByteBuf(header,
>           decodeData(buf, header, ByteBuf::retain)))
>       .orElse(null);
> }
>
> retain() is called here, but there seems to be no corresponding release().
> Since retain() increments buf's reference count (buf is a direct buffer),
> the Cleaner cannot free buf and recycle its resources. I assume this might
> be the cause of the direct memory OOM error, but I’m new to Netty and not
> certain about it, so could anyone please take a look at it?
> Thanks in advance.
>
> Best Wishes,
> William Song
>