Hi William,

Thanks a lot for checking the code.

> retain() is called here, but seems no corresponding release().

It is necessary to call retain() since buf will be used after decode(..).
Otherwise, buf will automatically be released as mentioned in [1].  The
corresponding release() is in DataStreamManagement.read(..); see [2].

The steps are
1. Netty is reading a byte stream
2. It calls ByteToMessageDecoder.decode(.., List<Object> out) [3]  to
decode the byte stream and then adds the decoded objects to the out list.
3. In our case, the decoded objects are DataStreamRequestByteBuf and the
ByteBuf is retained as DataStreamRequestByteBuf.buf.
4. Later on, the same DataStreamRequestByteBuf.buf is used for writing
locally and writing to the remote peers in readImpl(..) [4].
5. Once readImpl(..) has returned, buf is released in [2].

If there is a suspicion of memory leak, could you try
-Dio.netty.leakDetection.level=ADVANCED [5] ?

Regards,
Tsz-Wo


[1]
https://stackoverflow.com/questions/52883044/netty-4-0-bytebuf-retain-use
[2]
https://github.com/apache/ratis/blob/75af736d0d20c44b5d47a458b0c7b33387a75a6d/ratis-netty/src/main/java/org/apache/ratis/netty/server/DataStreamManagement.java#L385
[3]
https://github.com/apache/ratis/blob/75af736d0d20c44b5d47a458b0c7b33387a75a6d/ratis-netty/src/main/java/org/apache/ratis/netty/server/NettyServerStreamRpc.java#L243
[4]
https://github.com/apache/ratis/blob/75af736d0d20c44b5d47a458b0c7b33387a75a6d/ratis-netty/src/main/java/org/apache/ratis/netty/server/DataStreamManagement.java#L390
[5]
https://stackoverflow.com/questions/43676085/how-to-test-netty-memory-leak

On Sun, Jun 26, 2022 at 5:00 AM William Song <[email protected]> wrote:

> Hi,
>
> We discovered Direct Memory OOM after running program with Ratis for a
> long time. Through heap dump, we find DirectByteBuffers holded by
> ratis.thirdparty.netty are increasing.
>
>  I discovered that in NettyServerStreamRPC.java, newChannelInitializer,
> newDecoder, decodeDataStreamRequestByteBuf
> static DataStreamRequestByteBuf decodeDataStreamRequestByteBuf(ByteBuf
> buf) {
>   return Optional.ofNullable(decodeDataStreamRequestHeader(buf))
>       .map(header -> checkHeader(header, buf))
>       .map(header -> new DataStreamRequestByteBuf(header, decodeData(buf,
> header, ByteBuf::retain)))
>       .orElse(null);
> }
>
> retain() is called here, but seems no corresponding release(). Since
> retain() will add buf's (which is DirectByteBuffer) referenced count, so
> Cleaner cannot GC the buf and recycle its corresponding resources. I assume
> that this might be the cause of direct memory OOM error, but I’m new to
> Netty and not certain about it, so could anyone please take a look at it?
> Thanks in advance.
>
> Best Wishes,
> William Song
>
>
>
>

Reply via email to