Team,

We are running a 3-node NiFi cluster in Docker, version 1.8. Everything had been running smoothly for the last 2-3 months, but for the last 2 days we have been observing two issues, described below:

- *Load Balanced Client Thread throwing exceptions continuously.* The NiFi logs are filling up with the errors below, and the disk fills up very quickly. Stack trace:

  2019-05-09 09:06:26,166 ERROR [Load-Balanced Client Thread-1] o.a.n.c.queue.SwappablePriorityQueue Updated Size of Queue Unacknowledged from FlowFile Queue Size[ ActiveQueue=[0, 0 Bytes], Swap Queue=[0, 0 Bytes], Swap Files=[0], Unacknowledged=[-484, -50212519 Bytes] ] to FlowFile Queue Size[ ActiveQueue=[0, 0 Bytes], Swap Queue=[0, 0 Bytes], Swap Files=[0], Unacknowledged=[-484, -50212519 Bytes] ]
  java.lang.RuntimeException: Cannot create negative queue size
      at org.apache.nifi.controller.queue.SwappablePriorityQueue.logIfNegative(SwappablePriorityQueue.java:925)
      at org.apache.nifi.controller.queue.SwappablePriorityQueue.incrementUnacknowledgedQueueSize(SwappablePriorityQueue.java:915)
      at org.apache.nifi.controller.queue.SwappablePriorityQueue.acknowledge(SwappablePriorityQueue.java:417)
      at org.apache.nifi.controller.queue.clustered.partition.RemoteQueuePartition$2.onTransactionComplete(RemoteQueuePartition.java:210)
      at org.apache.nifi.controller.queue.clustered.client.async.nio.NioAsyncLoadBalanceClient.communicate(NioAsyncLoadBalanceClient.java:259)
      at org.apache.nifi.controller.queue.clustered.client.async.nio.NioAsyncLoadBalanceClientTask.run(NioAsyncLoadBalanceClientTask.java:76)
      at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
      at java.util.concurrent.FutureTask.run(FutureTask.java:266)
      at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
      at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
      at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
      at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
      at java.lang.Thread.run(Thread.java:748)

- *Flow files are stuck in queue.* This is probably NIFI-5919 <https://issues.apache.org/jira/browse/NIFI-5919>, which is fixed in 1.9. The flow files are stuck in the queue (Load balance by attribute) and are not read by the next downstream processor (MergeRecord with CSVReader and CSVRecordSetWriter). The NiFi UI shows flow files in the queue, but listing the queue reports "Queue has no flow files". Attempting to empty the queue gives the same message.

I have tried the following, all in vain:

- Restarting the downstream processor and the upstream processor (ConvertRecord).
- Disabling and re-enabling CSVReader and CSVRecordSetWriter.
- Disabling load balancing.

How can I debug or resolve this? I am pretty new to NiFi and am trying my best to understand the flow file life cycle and NiFi's internal architecture. Any leads or assistance in solving the above two issues would be very much appreciated.

--
*Suman*
*Tathastu*
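
In case it helps anyone reproduce the check, here is a minimal sketch of how the negative Unacknowledged sizes could be tallied from the log while debugging. It assumes the log line format shown in the stack trace above; the function name `negative_unacknowledged` and the regular expression are hypothetical illustrations, not part of NiFi.

```python
import re

# Matches the "Unacknowledged=[<count>, <bytes> Bytes]" part of a
# "FlowFile Queue Size[...]" log entry (format assumed from the log above).
UNACK = re.compile(r"Unacknowledged=\[(-?\d+), (-?\d+) Bytes\]")


def negative_unacknowledged(lines):
    """Yield (count, bytes) for every Unacknowledged entry with a negative value."""
    for line in lines:
        for count, size in UNACK.findall(line):
            count, size = int(count), int(size)
            if count < 0 or size < 0:
                yield count, size


# Shortened copy of the log entry quoted in this post.
sample = (
    "Updated Size of Queue Unacknowledged from FlowFile Queue Size[ "
    "ActiveQueue=[0, 0 Bytes], Unacknowledged=[-484, -50212519 Bytes] ] "
    "to FlowFile Queue Size[ "
    "ActiveQueue=[0, 0 Bytes], Unacknowledged=[-484, -50212519 Bytes] ]"
)
hits = list(negative_unacknowledged([sample]))
```

The function accepts any iterable of lines, so an open log file object can be passed in directly; the location of the log file depends on the installation.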
