Hi Mark,

I've been using the G1 garbage collector.  I brought the nodes down to an 8GB
heap and let it run overnight, but processing still got stuck and required
NiFi to be restarted on all nodes.  It took longer to happen, but they went
down after a few hours.  Are there any other things I can look into?
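
For reference, a sketch of what those settings look like in
conf/bootstrap.conf.  The java.arg indices here are assumed and may not match
a given install; only the flag values reflect what is described above:

```
# Maximum heap, reduced from 48GB to 8GB (index 3 is an assumption)
java.arg.3=-Xmx8g

# Use the G1 garbage collector (index 13 is an assumption)
java.arg.13=-XX:+UseG1GC
```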

Thanks!

-Aaron

On Thu, Jul 14, 2016 at 2:33 PM, Mark Payne <[email protected]> wrote:

> Aaron,
>
> My guess would be that you are hitting a Full Garbage Collection. With
> such a huge Java heap, that will cause a "stop the world" pause for quite a
> long time.
> Which garbage collector are you using? Have you tried reducing the heap
> from 48 GB to say 4 or 8 GB?
>
> Thanks
> -Mark
>
>
> > On Jul 14, 2016, at 11:14 AM, Aaron Longfield <[email protected]> wrote:
> >
> > Hi,
> >
> > I'm having an issue with a small (two-node) NiFi cluster where the nodes
> > will stop processing any queued flowfiles.  I haven't seen any error
> > messages logged related to it, and when attempting to restart the service,
> > NiFi doesn't respond and the script forcibly kills it.  This causes
> > multiple flowfile versions to hang around, and generally makes me feel like
> > it might be causing data loss.
> >
> > I'm running the web UI on a different box, and when things stop working,
> > it stops showing changes to counts in any queues, and the thread count
> > never changes.  It still thinks the nodes are connecting and responding,
> > though.
> >
> > My environment is two 8-CPU systems w/ 60GB memory with 48GB given to
> > the NiFi JVM in bootstrap.conf.  I have timer threads limited to 12, and
> > event threads to 4.  The install is on the current Amazon Linux AMI, using
> > OpenJDK 1.8.0.91 x64.
> >
> > Any ideas, other debug steps, or changes that I can try?  I'm running
> > 0.7.0, having upgraded from 0.6.1, but this has been occurring with both
> > versions.  The higher the flowfile volume I push through, the faster this
> > happens.
> >
> > Thanks for any help there is to give!
> >
> > -Aaron Longfield
>
>
