Hi Dharani,

According to Oracle's documentation, the JVM uses more memory than just the heap:
https://docs.oracle.com/cd/E13150_01/jrockit_jvm/jrockit/geninfo/diagnos/garbage_collect.html

A long-standing rule of thumb is to have no less than twice as much RAM as your -Xmx setting needs, plus whatever your OS requires. Since actual usage is wholly dependent on the application running in the JVM, and with NiFi that is entirely variable because it is tied to your specific workflow, you have to do the math yourself. I recommend using a JVM analysis tool; you'll find them sold by JRE vendors.

Also, if you don't use record-based processing, it's possible to configure workflows that eat RAM both quickly and voluminously, as NiFi has to keep a copy of each modified document around. I've done quick-and-dirty JSON manipulation flows that laughed at 64 GB and crashed with an OutOfMemoryError. Knowing what goes into the NiFi instance and how it is treated at each stage is critical to building an optimal final solution.

Kr,

On Thu, 8 May 2025, 23:30 Mike Thomsen, <mikerthom...@gmail.com> wrote:

> Can you clarify what your Xmx and Xms settings are?
>
> On Wed, May 7, 2025 at 11:12 AM Tirumanyam, Dharani
> <dtiru...@blueplanet.com.invalid> wrote:
>
> > Hi NiFi Team,
> >
> > I am reaching out for information regarding NiFi heap usage. Please
> > find below the details of the NiFi and system setup we are using.
> > NiFi version: 1.27.0
> > Environment details: a 14-node NiFi cluster running in Kubernetes,
> > where each pod is allocated 30 GB of disk.
> >
> > Use case:
> > * We have a 14-node NiFi cluster in which NiFi reads data from a local
> > Kafka and compresses it with the NiFi compressor (compression
> > technique "snappy").
> > * After compressing, the data is sent to a central NiFi using a Remote
> > Process Group (RPG) via site-to-site communication.
> > * We have 2 million records in Kafka to be processed in 5 minutes.
> > * The raw data is around 15 GB; after compressing it with the NiFi
> > compressor (technique "snappy") the data becomes about 3 GB, which is
> > sent to the central NiFi via site-to-site communication over HTTPS.
> > * We allocated 20 GB of heap to each NiFi node in the 14-node cluster.
> >
> > Observations:
> > * The JVM on each NiFi node is taking 40 GB, although the allocated
> > heap is 20 GB.
> > * Thousands of I/O dispatcher threads are being created. Please find
> > one of the thread dumps below:
> >
> > "I/O dispatcher 18224513" #19393711 prio=5 os_prio=0 cpu=3.71ms
> > elapsed=63.10s tid=0x00007f63b809e960 nid=0x287145 runnable
> > [0x00007f5ed3a22000]
> >    java.lang.Thread.State: RUNNABLE
> >         at sun.nio.ch.EPoll.wait(java.base@17.0.12/Native Method)
> >         at sun.nio.ch.EPollSelectorImpl.doSelect(java.base@17.0.12/EPollSelectorImpl.java:118)
> >         at sun.nio.ch.SelectorImpl.lockAndDoSelect(java.base@17.0.12/SelectorImpl.java:129)
> >         - locked <0x0000000504af5698> (a sun.nio.ch.Util$2)
> >         - locked <0x0000000504ae01a8> (a sun.nio.ch.EPollSelectorImpl)
> >         at sun.nio.ch.SelectorImpl.select(java.base@17.0.12/SelectorImpl.java:141)
> >         at org.apache.http.impl.nio.reactor.AbstractIOReactor.execute(AbstractIOReactor.java:255)
> >         at org.apache.http.impl.nio.reactor.BaseIOReactor.execute(BaseIOReactor.java:104)
> >         at org.apache.http.impl.nio.reactor.AbstractMultiworkerIOReactor$Worker.run(AbstractMultiworkerIOReactor.java:591)
> >         at java.lang.Thread.run(java.base@17.0.12/Thread.java:840)
> >
> > * We enabled native memory tracking and observed the statistics below.
> > Native Memory Tracking:
> >
> > (Omitting categories weighting less than 1GB)
> >
> > Total: reserved=39GB, committed=32GB
> >        malloc: 11GB #1315791
> >        mmap: reserved=28GB, committed=21GB
> >
> > -  Java Heap (reserved=20GB, committed=20GB)
> >              (mmap: reserved=20GB, committed=20GB)
> >
> > -  Class (reserved=1GB, committed=0GB)
> >          (classes #32835)
> >          (  instance classes #31117, array classes #1718)
> >          (mmap: reserved=1GB, committed=0GB)
> >          (  Metadata:   )
> >          (    reserved=0GB, committed=0GB)
> >          (    used=0GB)
> >          (    waste=0GB =0.84%)
> >          (  Class space:)
> >          (    reserved=1GB, committed=0GB)
> >          (    used=0GB)
> >          (    waste=0GB =6.92%)
> >
> > -  Thread (reserved=6GB, committed=0GB)
> >           (thread #5634)
> >           (stack: reserved=6GB, committed=0GB)
> >
> > -  GC (reserved=1GB, committed=1GB)
> >       (mmap: reserved=1GB, committed=1GB)
> >
> > -  Other (reserved=11GB, committed=11GB)
> >          (malloc=11GB #12622) (peak=11GB #16965)
> >
> > Need help:
> > * We would be very grateful if someone could help us understand why the
> > NiFi JVM is using 40 GB after we allocated a 20 GB heap.
> > * We would also like to understand why thousands of I/O dispatcher
> > threads are being spawned on each NiFi node, why NiFi needs that many
> > threads, and why the JVM is taking 40 GB instead of 20 GB.
> >
> > Apologies for any confusion in the explanation. We need inputs to
> > understand more about NiFi memory usage. Please let us know if any
> > other information is needed.
> >
> > Thanks & Regards,
> > Dharani Tirumanyam.
> >
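P.S. As a rough sanity check against the NMT figures quoted above, here is a back-of-the-envelope sketch in Python. The 1 MiB reserved stack per thread is an assumption (the common 64-bit Linux default for -Xss), not something measured on your nodes, so treat the result as illustrative only:

```python
# Back-of-the-envelope JVM footprint estimate using the NMT summary above.
# Assumption: ~1 MiB reserved stack per thread (typical 64-bit Linux -Xss
# default); the actual value on your nodes may differ.

heap_gb = 20                     # NMT "Java Heap" (matches -Xmx)
threads = 5634                   # NMT "thread #5634"
stack_gb = threads * 1 / 1024    # ~1 MiB of reserved stack per thread
other_gb = 11                    # NMT "Other" (native malloc allocations)
gc_gb = 1                        # NMT "GC" bookkeeping

total_gb = heap_gb + stack_gb + other_gb + gc_gb
print(f"thread stacks ~= {stack_gb:.1f} GB")  # ~5.5 GB, in line with NMT's 6 GB reserved
print(f"rough total   ~= {total_gb:.1f} GB")  # ~37.5 GB, close to the 39 GB reserved
```

The point being: those ~5,600 reactor threads alone reserve several GB of native memory outside the heap, and the 11 GB "Other" bucket (plausibly direct byte buffers allocated by the NIO reactors, though NMT alone doesn't confirm that) covers most of the rest, which is how the process reaches ~40 GB despite -Xmx20g.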