Hi William,

According to this comment
https://github.com/grpc/grpc-java/issues/9340#issuecomment-1185995690 ,
they will have a fix in 1.48.1 soon.

Tsz-Wo
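
P.S. A minimal sketch of turning on the leak detection mentioned in the
quoted message below. io.netty.leakDetectionLevel is the standard Netty
system property. The relocated key for the Netty copy bundled in
ratis-thirdparty is an assumption based on how shading usually rewrites
the package prefix, so please verify it against your version. The class
and method names are made up for illustration.

```java
public class LeakDetectionConfig {
    // Standard Netty system property read by ResourceLeakDetector.
    public static final String NETTY_KEY = "io.netty.leakDetectionLevel";

    // Assumption: the shaded Netty in ratis-thirdparty reads a relocated key.
    public static final String SHADED_KEY =
        "org.apache.ratis.thirdparty.io.netty.leakDetectionLevel";

    // Call this before any Netty class is loaded; the detector reads the
    // property once, during class initialization.
    public static void enableParanoid() {
        System.setProperty(NETTY_KEY, "paranoid");
        System.setProperty(SHADED_KEY, "paranoid");
    }

    public static void main(String[] args) {
        enableParanoid();
        System.out.println(NETTY_KEY + "=" + System.getProperty(NETTY_KEY));
        System.out.println(SHADED_KEY + "=" + System.getProperty(SHADED_KEY));
    }
}
```

The same effect can be had without code changes by passing the keys as
-D options on the JVM command line.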


On Wed, Jul 20, 2022 at 7:43 PM William Song <[email protected]> wrote:

> Hi Tsz-Wo,
>
> It indeed looks like the same problem. I’ll add
> netty.leakDetectionLevel=paranoid to see if I can obtain more information.
>
> William
>
> > On Jul 21, 2022, at 01:13, Tsz Wo Sze <[email protected]> wrote:
> >
> > Hi William,
> >
> > Indeed, there is a recent gRPC "ByteBuffer memory leak in retry
> > mechanism" issue; see https://github.com/grpc/grpc-java/issues/9340 .
> > Not sure if it is the same problem you saw.
> >
> > Tsz-Wo
> >
> >
> > On Tue, Jul 19, 2022 at 6:13 PM Tsz Wo Sze <[email protected]> wrote:
> >
> >> Hi William,
> >>
> >>> ... We use gRPC as their underlying communication channel. ...
> >>
> >> I searched the source code of IoTDB.  IoTDB uses neither the Ratis
> >> Streaming API nor anything in org.apache.ratis.thirdparty.io.netty.
> >> Therefore, the leak seems to be from the gRPC library.
> >>
> >> Tsz-Wo
> >>
> >>
> >> On Tue, Jul 19, 2022 at 1:22 AM William Song <[email protected]> wrote:
> >>
> >>> Hi Tsz-Wo,
> >>>
> >>> We set up a cluster of IoTDB Datanodes, which constitute a Raft group
> >>> with 3 members, and have 3 clients writing data to these 3 servers
> >>> respectively.  We use gRPC as their underlying communication channel.
> >>> After 48 hours of running, the 3 clients had written about 100 GB of
> >>> data. Notably, 1 server is particularly slow and is about 2000 log
> >>> entries behind. On this slow server we discovered the direct memory
> >>> OOM error. It happens occasionally and is not deterministic.
> >>>
> >>> William
> >>>
> >>>
> >>>
> >>>> On Jul 19, 2022, at 00:51, Tsz Wo Sze <[email protected]> wrote:
> >>>>
> >>>> Hi William,
> >>>>
> >>>> It does look like a leak.  Could you provide the steps for
> >>>> reproducing it?
> >>>>
> >>>> Tsz-Wo
> >>>>
> >>>>
> >>>> On Mon, Jul 18, 2022 at 8:41 AM William Song <[email protected]> wrote:
> >>>> Hi,
> >>>>
> >>>> We discovered an error log from
> >>>> org.apache.ratis.thirdparty.io.netty.util.ResourceLeakDetector saying
> >>>> that ByteBuf.release() was not called before the buffer was
> >>>> garbage-collected. The following is the error log screenshot. We
> >>>> encountered direct memory OOM several times when running Ratis for a
> >>>> long time, so we assume this message may have something to do with the
> >>>> direct memory OOM problem.
> >>>>
> >>>> Could anyone please take a look and check whether there is a memory
> >>>> leak? Thanks in advance!
> >>>>
> >>>> Best Wishes,
> >>>> William
> >>>
> >>>
>
>
