Hi NiFi Team,

I am reaching out for information regarding NiFi heap usage. Please find below the details of the NiFi and system setup we are using.

NiFi version: 1.27.0

Environment details:
* 14-node NiFi cluster running in Kubernetes; each pod is allocated a 30 GB disk.
* Each NiFi node is allocated a 20 GB JVM heap (see the bootstrap.conf snippet after the observations).

Use case:
* Each node reads data from a local Kafka, compresses it in NiFi using the "snappy" compression technique, and sends the compressed data to a central NiFi via a Remote Process Group (RPG) using site-to-site over HTTPS.
* We have 2 million records in Kafka to be processed within 5 minutes.
* The raw data is around 15 GB; after snappy compression it comes down to about 3 GB, which is what is sent to the central NiFi.

Observations:
* The JVM on each NiFi node is using about 40 GB of memory, even though the allocated heap is 20 GB.
* Thousands of "I/O dispatcher" threads are being created (see the sketch after the NMT output for where these threads typically originate). An example from a thread dump:

    "I/O dispatcher 18224513" #19393711 prio=5 os_prio=0 cpu=3.71ms elapsed=63.10s tid=0x00007f63b809e960 nid=0x287145 runnable [0x00007f5ed3a22000]
       java.lang.Thread.State: RUNNABLE
            at sun.nio.ch.EPoll.wait(java.base@17.0.12/Native Method)
            at sun.nio.ch.EPollSelectorImpl.doSelect(java.base@17.0.12/EPollSelectorImpl.java:118)
            at sun.nio.ch.SelectorImpl.lockAndDoSelect(java.base@17.0.12/SelectorImpl.java:129)
            - locked <0x0000000504af5698> (a sun.nio.ch.Util$2)
            - locked <0x0000000504ae01a8> (a sun.nio.ch.EPollSelectorImpl)
            at sun.nio.ch.SelectorImpl.select(java.base@17.0.12/SelectorImpl.java:141)
            at org.apache.http.impl.nio.reactor.AbstractIOReactor.execute(AbstractIOReactor.java:255)
            at org.apache.http.impl.nio.reactor.BaseIOReactor.execute(BaseIOReactor.java:104)
            at org.apache.http.impl.nio.reactor.AbstractMultiworkerIOReactor$Worker.run(AbstractMultiworkerIOReactor.java:591)
            at java.lang.Thread.run(java.base@17.0.12/Thread.java:840)

* We enabled native memory tracking (NMT) and observed the statistics below:

    Native Memory Tracking:

    (Omitting categories weighting less than 1GB)

    Total: reserved=39GB, committed=32GB
           malloc: 11GB #1315791
           mmap:   reserved=28GB, committed=21GB

    -                 Java Heap (reserved=20GB, committed=20GB)
                                (mmap: reserved=20GB, committed=20GB)

    -                     Class (reserved=1GB, committed=0GB)
                                (classes #32835)
                                (  instance classes #31117, array classes #1718)
                                (mmap: reserved=1GB, committed=0GB)
                                (  Metadata:   )
                                (    reserved=0GB, committed=0GB)
                                (    used=0GB)
                                (    waste=0GB =0.84%)
                                (  Class space:)
                                (    reserved=1GB, committed=0GB)
                                (    used=0GB)
                                (    waste=0GB =6.92%)

    -                    Thread (reserved=6GB, committed=0GB)
                                (thread #5634)
                                (stack: reserved=6GB, committed=0GB)

    -                        GC (reserved=1GB, committed=1GB)
                                (mmap: reserved=1GB, committed=1GB)

    -                     Other (reserved=11GB, committed=11GB)
                                (malloc=11GB #12622)
                                (peak=11GB #16965)
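For reference, the heap is set in conf/bootstrap.conf, NMT has to be enabled there as well, and the summary above was then captured with jcmd. Roughly like this (the java.arg indices are illustrative and depend on the bootstrap.conf layout):

    # conf/bootstrap.conf -- heap and NMT flags (arg indices illustrative)
    java.arg.2=-Xms20g
    java.arg.3=-Xmx20g
    java.arg.20=-XX:NativeMemoryTracking=summary

    # capture the NMT summary from the running NiFi JVM
    jcmd <nifi-pid> VM.native_memory summary scale=GB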
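For context on the thread name: the org.apache.http.impl.nio.reactor frames in the dump are from Apache HttpCore NIO, and "I/O dispatcher N" is the default name its I/O reactor worker threads get. The following standalone Java sketch (illustrative only, not NiFi code; the class name and loop count are made up) shows how such threads accumulate when async HTTP clients are started and never closed:

    import org.apache.http.impl.nio.client.CloseableHttpAsyncClient;
    import org.apache.http.impl.nio.client.HttpAsyncClients;

    public class DispatcherThreadDemo {
        public static void main(String[] args) throws Exception {
            for (int i = 0; i < 10; i++) {
                // Each started client runs its own I/O reactor, which spawns
                // "I/O dispatcher N" worker threads (one per core by default).
                CloseableHttpAsyncClient client = HttpAsyncClients.createDefault();
                client.start();
                // Deliberately never closed: a leaked client keeps its
                // dispatcher threads, and their native stacks, alive.
            }

            Thread.sleep(1000); // give the reactors a moment to spin up

            long dispatchers = Thread.getAllStackTraces().keySet().stream()
                    .filter(t -> t.getName().startsWith("I/O dispatcher"))
                    .count();
            System.out.println("Live I/O dispatcher threads: " + dispatchers);

            System.exit(0); // the leaked reactor threads are non-daemon
        }
    }

Incidentally, the dispatcher number in our dump (18224513) is far larger than the live thread count in the NMT output (#5634); if that number comes from a counter that only increments, it would mean dispatcher threads are being created and torn down continuously, not just pooled once.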
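One more data point on the Thread category above: assuming the JVM's default 1 MB thread stack size (-Xss) on 64-bit Linux, 5634 threads reserve roughly 5634 MB, about 5.5 GB of stack space, which matches the "Thread (reserved=6GB ... stack: reserved=6GB)" line, so the thread count alone accounts for several GB on top of the heap.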
Need help:
* We would be very grateful if someone could help us understand why the NiFi JVM is using 40 GB when the allocated heap is only 20 GB.
* We would also like to understand why thousands of I/O dispatcher threads are being spawned on each NiFi node, and why NiFi needs that many threads.

Apologies for any confusion in the explanation; we need inputs to better understand NiFi's memory usage. Please let us know if any other information is needed.

Thanks & Regards,
Dharani Tirumanyam