Hi William,

> ... We use gRPC as their underlying communication channel. ...

I searched the source code of IoTDB. IoTDB uses neither the Ratis Streaming API nor anything in org.apache.ratis.thirdparty.io.netty. Therefore, the leak seems to be from the gRPC library.

Tsz-Wo

On Tue, Jul 19, 2022 at 1:22 AM William Song <[email protected]> wrote:

> Hi Tsz-Wo,
>
> We set up a cluster of IoTDB Datanodes, which constitute a Raft Group
> with 3 members, and have 3 clients writing data to these 3 servers
> respectively. We use gRPC as their underlying communication channel. After
> 48h of running, the 3 clients had written about 100 GB of data. Notably, 1
> server is particularly slow and is about 2000 logs behind. On this slow
> server we discovered the direct memory OOM error. This happens occasionally
> and is not deterministic.
>
> William
>
> On Jul 19, 2022, at 00:51, Tsz Wo Sze <[email protected]> wrote:
> >
> > Hi William,
> >
> > It does look like a leak. Could you provide the steps for reproducing it?
> >
> > Tsz-Wo
> >
> > On Mon, Jul 18, 2022 at 8:41 AM William Song <[email protected]> wrote:
> > Hi,
> >
> > We discovered an error log from
> > org.apache.ratis.thirdparty.io.netty.utils.ResourceLeakDetector saying
> > ByteBuf.release() is not called before it's garbage-collected. The
> > following is the error log screenshot. We encountered direct memory OOM
> > several times when running Ratis for a long time, so we assume this message
> > may have something to do with the direct memory OOM problem.
> >
> > Could anyone please take a look and check whether there is a memory leak?
> > Thanks in advance!
> >
> > Best Wishes,
> > William
