[jira] [Created] (NIFI-6517) Load Balanced Connections can show counts that are inaccurate, resulting in data not moving through connection

Mark Payne (JIRA) Thu, 01 Aug 2019 10:33:09 -0700

Mark Payne created NIFI-6517:
--------------------------------

             Summary: Load Balanced Connections can show counts that are inaccurate, resulting in data not moving through connection
                 Key: NIFI-6517
                 URL: https://issues.apache.org/jira/browse/NIFI-6517
             Project: Apache NiFi
          Issue Type: Bug
          Components: Core Framework
            Reporter: Mark Payne
            Assignee: Mark Payne
             Fix For: 1.10.0



I've encountered an issue where a connection that is load balanced using the 
Round Robin strategy shows data in the queue, but that data cannot be 
processed by the follow-on processor. List Queue indicates no FlowFiles, and 
Empty Queue also indicates no FlowFiles.

An error in the logs indicates that there is a bug in maintaining the proper 
size of the FlowFile Queue:
{code:java}
2019-08-01 11:39:08,422 INFO [Heartbeat Monitor Thread-1] o.a.n.c.c.h.AbstractHeartbeatMonitor Finished processing 2 heartbeats in 32480 nanos
2019-08-01 11:39:08,422 INFO [Heartbeat Monitor Thread-1] o.a.n.c.c.node.NodeClusterCoordinator localhost:8482 requested disconnection from cluster due to Have not received a heartbeat from node in 40 seconds
2019-08-01 11:39:08,422 INFO [Heartbeat Monitor Thread-1] o.a.n.c.c.node.NodeClusterCoordinator Status of localhost:8482 changed from NodeConnectionStatus[nodeId=localhost:8482, state=CONNECTED, updateId=30] to NodeConnectionStatus[nodeId=localhost:8482, state=DISCONNECTED, Disconnect Code=Lack of Heartbeat, Disconnect Reason=Have not received a heartbeat from node in 40 seconds, updateId=31]
2019-08-01 11:39:08,441 ERROR [Load-Balanced Client Thread-2] o.a.n.c.queue.SwappablePriorityQueue Updated Size of Queue Unacknowledged from FlowFile Queue Size[ ActiveQueue=[500, 2560000 Bytes], Swap Queue=[4845, 24806400 Bytes], Swap Files=[0], Unacknowledged=[0, 0 Bytes] ] to FlowFile Queue Size[ ActiveQueue=[500, 2560000 Bytes], Swap Queue=[4845, 24806400 Bytes], Swap Files=[0], Unacknowledged=[-945, -4838400 Bytes] ]
java.lang.RuntimeException: Cannot create negative queue size
at org.apache.nifi.controller.queue.SwappablePriorityQueue.logIfNegative(SwappablePriorityQueue.java:945)
at org.apache.nifi.controller.queue.SwappablePriorityQueue.incrementUnacknowledgedQueueSize(SwappablePriorityQueue.java:935)
at org.apache.nifi.controller.queue.SwappablePriorityQueue.acknowledge(SwappablePriorityQueue.java:426)
at org.apache.nifi.controller.queue.clustered.partition.RemoteQueuePartition$1.onTransactionFailed(RemoteQueuePartition.java:160)
at org.apache.nifi.controller.queue.clustered.client.async.TransactionFailureCallback.onTransactionFailed(TransactionFailureCallback.java:26)
at org.apache.nifi.controller.queue.clustered.client.async.nio.NioAsyncLoadBalanceClient.nodeDisconnected(NioAsyncLoadBalanceClient.java:295)
at org.apache.nifi.controller.queue.clustered.client.async.nio.NioAsyncLoadBalanceClientTask.run(NioAsyncLoadBalanceClientTask.java:71)
at org.apache.nifi.engine.FlowEngine$2.run(FlowEngine.java:110)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745){code}
Note that this occurs immediately after the status of one of the other nodes in 
the cluster changes to DISCONNECTED.
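
The ERROR entry above shows the Unacknowledged count moving from 0 to -945 
when the transaction-failure callback calls acknowledge. As a rough 
illustration only, the sketch below shows how a counter of this kind can go 
negative if more FlowFiles are acknowledged than were ever marked as 
unacknowledged. The class UnacknowledgedCounterSketch and its methods are 
hypothetical and are not the SwappablePriorityQueue implementation; the 
scenario is an assumption, since the actual root cause is not established 
here.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for a queue's "unacknowledged" counter.
// This is NOT the NiFi SwappablePriorityQueue implementation.
final class UnacknowledgedCounterSketch {
    private final AtomicLong unacknowledged = new AtomicLong();

    // Record FlowFiles handed to a remote node and awaiting acknowledgement.
    void markInFlight(long count) {
        unacknowledged.addAndGet(count);
    }

    // Subtract acknowledged FlowFiles and return the resulting count.
    // Nothing here prevents the result from dropping below zero.
    long acknowledge(long count) {
        return unacknowledged.addAndGet(-count);
    }

    long current() {
        return unacknowledged.get();
    }

    public static void main(String[] args) {
        UnacknowledgedCounterSketch counter = new UnacknowledgedCounterSketch();

        // Balanced case: every in-flight FlowFile is acknowledged exactly once.
        counter.markInFlight(10);
        counter.acknowledge(10);

        // Assumed scenario: a failure callback acknowledges 945 FlowFiles
        // that are not currently counted as unacknowledged.
        long after = counter.acknowledge(945);
        System.out.println("Unacknowledged=" + after); // prints Unacknowledged=-945
    }
}
```

A negative unacknowledged count skews the total that the connection reports, 
which would be consistent with the UI showing queued data that List Queue and 
Empty Queue cannot find.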



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)
