Hi,

We are very sorry, but we were not able to get permission to send our thread dump. Is there any other information that might be helpful? Could you give us tips on how to read this document, and what it is supposed to contain under normal circumstances?
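In case it helps while permission is being sorted out: a thread dump is simply a listing of every JVM thread with its name, its state (RUNNABLE, BLOCKED, WAITING, TIMED_WAITING), and the stack trace it is currently parked in. The sketch below is illustrative only (the class name is made up) and uses the JDK's ThreadMXBean to print the same kind of information that `nifi.sh dump` writes to its output file:

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

public class ThreadDumpSketch {
    public static void main(String[] args) {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        // Each ThreadInfo mirrors one section of a nifi.sh dump:
        // the thread's name, its state, and the frames it is stuck in.
        for (ThreadInfo info : mx.dumpAllThreads(true, true)) {
            System.out.println("\"" + info.getThreadName() + "\" "
                    + info.getThreadState());
            for (StackTraceElement frame : info.getStackTrace()) {
                System.out.println("    at " + frame);
            }
            System.out.println();
        }
    }
}
```

When reading a real dump from a hang like this, the threads to look for are the Timer-Driven Process threads: if several of them are in WAITING or TIMED_WAITING inside Redis or connection-pool frames rather than RUNNABLE, that points at where they are stuck.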
Thanks

On Wed, Jan 24, 2018 at 4:19 PM, Bryan Bende <[email protected]> wrote:
> Hello,
>
> Can you take a couple of thread dumps while this is happening and provide
> them so we can take a look?
>
> You can put a file name as the argument to nifi.sh dump to have it written
> to a file.
>
> Thanks,
>
> Bryan
>
> On Wed, Jan 24, 2018 at 6:48 AM we are <[email protected]> wrote:
> > Hi,
> >
> > Recently we switched the server we run NiFi on from a 24-core server to
> > a 4-core one, and since then, approximately 4 times a day, NiFi stops
> > responding until it is restarted. We then switched to an 8-core server,
> > and now it happens approximately every 2 days.
> >
> > When this happens, the UI becomes unresponsive, as does the REST API.
> > The NiFi active-threads metric returns 0 active threads, and the CPU is
> > 100% idle. There is no large spike in FlowFiles, memory, or CPU usage
> > before NiFi stops responding. However, when we checked the provenance
> > repository, we saw that events were being created. The logs only show
> > that events are being created; there are no errors or warnings. By
> > looking into the content of the events, we were able to determine that
> > events were flowing up until a processor using the
> > RedisConnectionPoolService.
> >
> > We tried to connect with the debugger to different processors, and all
> > of them except 4 responded and the debugger connected successfully.
> > The other 4 use the RedisConnectionPoolService, and they did not
> > respond. 2 of these processors are custom ones we wrote; the other 2
> > are the built-in wait/notify mechanism. When we tried to connect to the
> > RedisConnectionPoolService itself, the debugger was not able to connect
> > either. The Redis service that the connection pool is connected to
> > responds to us normally.
> >
> > We tried to look at the active threads using /opt/nifi/bin/nifi.sh
> > dump, but we did not see anything strange.
> >
> > When we tried to dig into the problem, we noticed that NiFi uses an old
> > version of spring-data-redis. We don't know if this is the problem, but
> > we opened an issue for it:
> > https://issues.apache.org/jira/browse/NIFI-4811
> >
> > The Maximum Timer Driven Thread Count is the default (10). Our custom
> > processors are configured with a maximum of 10 concurrent tasks, and
> > the wait/notify processors are configured with 5. The
> > RedisConnectionPoolService is configured with the default values:
> > Max Total: 20
> > Max Idle: 8
> > Min Idle: 0
> > Block When Exhausted: true
> > Max Evictable Idle Time: 60 seconds
> > Time Between Eviction Runs: 30 seconds
> > Num Tests Per Eviction Run: -1
> >
> > We made sure to always call connection.close() in our custom
> > processors.
> > Is it possible that somehow connections are not released or evicted,
> > and that is why NiFi freezes like this? How can we determine whether
> > this is the case?
> >
> > Thanks!
> > Daniel
>
> --
> Sent from Gmail Mobile
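One way to sanity-check the "connections are never returned" theory without NiFi at all: with Block When Exhausted set to true and no borrow timeout, a pool whose 20 connections have all leaked makes every later borrower block indefinitely, which would be consistent with the symptoms described (0 active threads reported, CPU idle, only the Redis-using processors unresponsive). This is a minimal stdlib sketch that models the pool semantics with a Semaphore, not the actual connection-pool implementation NiFi uses; the class name and numbers are illustrative:

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

public class PoolExhaustionSketch {
    public static void main(String[] args) throws InterruptedException {
        // Model "Max Total: 20": at most 20 connections outstanding at once.
        Semaphore pool = new Semaphore(20);

        // Simulate 20 borrows that are never matched by a close()/release()
        // -- i.e. every pooled connection has leaked.
        for (int i = 0; i < 20; i++) {
            pool.acquire();
            // the missing pool.release() here is the leak
        }

        // "Block When Exhausted: true" with no max-wait corresponds to a
        // plain pool.acquire(), which would now block forever. A timed
        // tryAcquire is used instead so this demo terminates:
        boolean borrowed = pool.tryAcquire(200, TimeUnit.MILLISECONDS);
        System.out.println(borrowed
                ? "borrowed a 21st connection"
                : "pool exhausted: caller would block indefinitely");
        // -> pool exhausted: caller would block indefinitely
    }
}
```

If this is what is happening, a thread dump should show the affected processor threads parked inside the pool's borrow call; the usual mitigations are configuring a finite max-wait so borrowers fail loudly instead of blocking forever, and double-checking that close() runs on every code path (for example via try-with-resources or a finally block) in the custom processors.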
