This does help, thank you Matt. And I like your suggestion. It would be more at our fingertips if, as we hover over the thread count on the processor, the distribution across all cluster nodes were presented in a popup. I wonder if the project leads would consider this a helpful improvement?
I can now see that my hanging threads are on just two of my cluster nodes. This is very helpful - thanks again. It reduces the amount of thread dump review I will be doing today.

Jim

On Wed, Jun 24, 2020 at 9:53 PM Matt Gilman <matt.c.gil...@gmail.com> wrote:

> Hi Jim,
>
> If you open the Summary page from the global menu you should see the
> active threads in parentheses next to the scheduled state. Find the row in
> question and click the cluster icon from the actions column. This will open
> a dialog with a node-wise breakdown. I believe that the thread count is one
> of the metrics that is broken down per node.
>
> Hope this helps! Adding this breakdown to the main canvas would be a great
> addition. Maybe these breakdowns could be offered in a tooltip for each
> metric.
>
> Matt
>
> Sent from my iPhone
>
> > On Jun 24, 2020, at 21:05, James McMahon <jsmcmah...@gmail.com> wrote:
> >
> > Our production nifi cluster is exhibiting repeated problems with threads
> > that do not end. It is happening with processors that have complex
> > configurations and dependencies (ConsumeAMQP), and - more troubling - it is
> > also occurring periodically for simple processors like ControlRate. I’ll
> > have a ControlRate processor sitting in a running state with no active
> > running thread. I select Stop on that processor, get a thread I presume to
> > be responsible for stopping the processor, and that thread will never end.
> > This leaves my processor in a useless state - not stopped, not really
> > running, and not accessible to reconfigure.
> >
> > I read a blog by Pierre Villard on using nifi.sh for thread dumps. I’ll
> > dig into that. My questions:
> >
> > 1. In a cluster, is there anything I can use in the UI to tell me which
> > cluster node hosts the bad thread? Digging through thread dumps from
> > multiple cluster nodes seems impractical, and I’m hoping there’s a way to
> > zero in on a node.
> >
> > 2. What nifi system resources in my configuration influence the
> > management and well-being of these threads?
> >
> > 3. Has anyone debugged such a thread issue in a clustered nifi
> > environment, and if so can you offer any tips based on your experience?
> >
> > Thanks in advance for any help.
> >
> > Jim