ArunV

Pod memory is one thing. Think of it roughly like the RAM available to a VM. Then there is the JVM heap size, which is far more directly relevant here. In general, dialing these in correctly takes a fairly good understanding of RAM, the JVM heap, and how the OS accounts for memory.
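To make the distinction concrete, a minimal sketch assuming a fairly standard NiFi-on-Kubernetes setup (the sizes below are placeholders, not recommendations): the container limit lives in the pod spec, while the NiFi heap lives in conf/bootstrap.conf, and the heap needs to sit comfortably below the container limit to leave room for off-heap usage (metaspace, thread stacks, direct buffers).

  # Kubernetes container spec (illustrative values)
  resources:
    requests:
      memory: "6Gi"
    limits:
      memory: "6Gi"

  # NiFi conf/bootstrap.conf (illustrative values)
  java.arg.2=-Xms4g
  java.arg.3=-Xmx4g

If the heap is sized too close to the container limit, the kubelet can OOM-kill the pod even while heap metrics look healthy; if the heap itself is too small or is leaking, you get GC overhead errors regardless of how much pod memory remains.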
When you increase the heap and see better behavior for longer, but then the same failure returns, that is a strong signal of a memory leak. I'd suggest starting with the code mentioned and its class member variables (a sketch of the pattern to look for is appended below the quoted thread).

Thanks

On Thu, Oct 10, 2024 at 7:51 AM Varadarajan, Arun <
arun.varadara...@paramount.com> wrote:

> Hi, thanks for your response. From the metrics, we see the pod memory is
> always less than the limit we have set. Increasing the memory doesn't
> help, though we do see fewer GC overhead errors after increasing it.
>
> *From:* Joe Witt <joe.w...@gmail.com>
> *Sent:* Thursday, October 10, 2024 8:17 PM
> *To:* dev@nifi.apache.org
> *Subject:* Re: NIFI processor getting hanged
>
> Arun V
>
> This is a strong indication to review any custom components or processors
> in the flow which can consume large quantities of memory.
>
> OutOfMemory errors generally indicate one of two things: either a single
> large allocation exhausted the heap, or, as with the "GC overhead limit
> exceeded" errors you are seeing, there is simply too much retained
> information in the heap.
>
> Look at your code for the CableLabsDataMappingProcessor, for instance.
> Does it have any class member variables that hold things like
> maps/caches/etc.? If so, I'd start there. With the limited information
> available, this sounds like a memory leak. When you increased the memory,
> did the flow stay online longer than before by some interval? The point is
> that increasing memory buys time but does not solve heap exhaustion in the
> event of a memory leak.
>
> Thanks
>
> On Thu, Oct 10, 2024 at 5:51 AM Varadarajan, Arun <
> arun.varadara...@paramount.com> wrote:
>
> Hi Team,
>
> We see our NiFi workflow hanging intermittently once or twice per week.
> After that, even if we restart the processor (say ConsumeAMQP), it does
> not consume messages from RMQ, even though the queue has messages.
>
> After restarting the pod (in K8s), the workflow works as expected. Please
> note that we have already increased the pod memory and the issue still
> occurs.
>
> We are also seeing the error below for some time and want to check whether
> it is the cause and, if so, what the fix could be.
>
> CableLabsDataMappingProcessor[id=8a6f5d24-1fa5-38c6-1748-d1e9fe54ee93]
> failed to process session due to java.lang.OutOfMemoryError: GC overhead
> limit exceeded; Processor Administratively Yielded for 1 sec:
> java.lang.OutOfMemoryError: GC overhead limit exceeded
> java.lang.OutOfMemoryError: GC overhead limit exceeded
> 2024-10-09 00:51:45,229 WARN [Timer-Driven Process Thread-1]
> o.a.n.controller.tasks.ConnectableTask Administratively Yielding
> CableLabsDataMappingProcessor[id=8a6f5d24-1fa5-38c6-1748-d1e9fe54ee93] due
> to uncaught Exception: java.lang.OutOfMemoryError: GC overhead limit
> exceeded
> java.lang.OutOfMemoryError: GC overhead limit exceeded
>
> Any help would be greatly appreciated.
>
> Regards,
> Arun V
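For illustration only, here is a minimal sketch of the kind of class-member state described above, using a hypothetical processor (the class and field names are made up, not taken from the real CableLabsDataMappingProcessor): an unbounded map held as an instance field survives across onTrigger calls and grows until the collector can no longer keep up, which surfaces exactly as "GC overhead limit exceeded".

  import java.util.Collections;
  import java.util.Map;
  import java.util.Set;
  import java.util.concurrent.ConcurrentHashMap;

  import org.apache.nifi.flowfile.FlowFile;
  import org.apache.nifi.processor.AbstractProcessor;
  import org.apache.nifi.processor.ProcessContext;
  import org.apache.nifi.processor.ProcessSession;
  import org.apache.nifi.processor.Relationship;
  import org.apache.nifi.processor.exception.ProcessException;

  // Hypothetical processor showing the leak pattern, not the real mapping code.
  public class LeakyMappingProcessor extends AbstractProcessor {

      static final Relationship REL_SUCCESS = new Relationship.Builder()
              .name("success")
              .description("Mapped FlowFiles")
              .build();

      // Instance field that accumulates one entry per FlowFile and is never
      // cleared: entries are retained for the life of the JVM, so the heap
      // fills up no matter how large -Xmx or the pod limit is.
      private final Map<String, String> seenRecords = new ConcurrentHashMap<>();

      @Override
      public Set<Relationship> getRelationships() {
          return Collections.singleton(REL_SUCCESS);
      }

      @Override
      public void onTrigger(final ProcessContext context, final ProcessSession session)
              throws ProcessException {
          final FlowFile flowFile = session.get();
          if (flowFile == null) {
              return;
          }

          // Grows without bound: nothing ever removes entries from this map.
          seenRecords.put(flowFile.getAttribute("uuid"), flowFile.getAttribute("filename"));

          session.transfer(flowFile, REL_SUCCESS);
      }
  }

If state like this is genuinely needed, bound it (a size- or time-limited cache) or keep it in NiFi's state manager rather than a plain map. To confirm whether this is what is happening, running the JVM with -XX:+HeapDumpOnOutOfMemoryError and inspecting the resulting heap dump will show which objects dominate the retained set.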