Hello,

Can you take a couple of thread dumps while this is happening and provide
them so we can take a look?

You can put a file name as the argument to nifi.sh dump to have it written
to a file.
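
A rough sketch of how a few dumps could be captured some seconds apart, so
that threads stuck in the same place show up across dumps. The take_dumps
helper, the output file names, and the default count and interval are
illustrative assumptions, not part of NiFi; only "nifi.sh dump <file>" is the
actual command.

```shell
# Hypothetical helper: call "nifi.sh dump <file>" several times with a pause
# between calls. Arguments (all optional, defaults are assumptions):
#   $1 path to nifi.sh, $2 output directory, $3 number of dumps, $4 seconds between dumps
take_dumps() {
  td_nifi_sh="${1:-/opt/nifi/bin/nifi.sh}"
  td_out_dir="${2:-/tmp}"
  td_count="${3:-3}"
  td_interval="${4:-10}"
  td_i=1
  while [ "$td_i" -le "$td_count" ]; do
    # Write each dump to its own numbered file; stop if the dump command fails.
    "$td_nifi_sh" dump "$td_out_dir/nifi-thread-dump-$td_i.txt" || return 1
    # Pause between dumps, but not after the last one.
    if [ "$td_i" -lt "$td_count" ]; then
      sleep "$td_interval"
    fi
    td_i=$((td_i + 1))
  done
}
```

For example, "take_dumps /opt/nifi/bin/nifi.sh /tmp 3 10" would write three
dumps to /tmp, ten seconds apart, while NiFi is in the unresponsive state.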

Thanks,

Bryan

On Wed, Jan 24, 2018 at 6:48 AM we are <[email protected]> wrote:

> Hi,
>
> Recently we switched the server we run NiFi on from a 24-core server to a
> 4-core one, and since then NiFi stops responding approximately 4 times a
> day until it is restarted. Then we switched to an 8-core server, and now it
> happens approximately every 2 days.
>
> When this happens, the UI becomes unresponsive, as does the REST API.
> The NiFi active threads metric reports 0 active threads, and the
> CPU is 100% idle. There is no large spike in flowfiles, memory, or CPU
> usage before NiFi stops responding. However, when we checked the provenance
> repo we saw that events were still being created. The logs only show that
> events are being created; there are no errors or warnings. By looking into
> the content of the events we were able to determine that events were
> flowing up until a processor using the RedisConnectionPoolService.
>
> We tried to connect with the debugger to different processors and all of
> them, except 4, responded and the debugger connected successfully.
> The other 4 are using the RedisConnectionPoolService, and they didn't
> respond. 2 of these processors are custom ones we wrote, the other 2 are
> the built-in wait/notify processors. When we tried to connect to the
> RedisConnectionPoolService, the debugger wasn't able to connect to it
> either. The Redis service that the connection pool is connected to responds
> to us normally.
>
> We tried to look at the active threads using /opt/nifi/bin/nifi.sh dump,
> but we did not see anything strange.
>
> When we tried to dig into the problem we noticed that nifi uses an old
> version of spring-data-redis. We don't know if this is the problem but we
> opened an issue for this: https://issues.apache.org/jira/browse/NIFI-4811
>
> The maximum timer driven thread count is the default (10). Our custom
> processors are configured to a maximum of 10 concurrent tasks, and the
> wait/notify processors are configured to 5. The RedisConnectionPoolService
> is configured with the default values:
> Max Total: 20
> Max Idle: 8
> Min Idle: 0
> Block When Exhausted: true
> Max Evictable Idle Time: 60 seconds
> Time Between Eviction Runs: 30 seconds
> Num Tests Per Eviction Run: -1
>
> We made sure to always call connection.close() in our custom made
> processors.
> Is it possible that somehow connections are not released or evicted, and
> that is why nifi freezes like this? How can we determine that this is the
> case?
>
> Thanks!
> Daniel
>