Mark Payne created NIFI-11837:
---------------------------------
Summary: When a queue starts swapping out data, it never stops
Key: NIFI-11837
URL: https://issues.apache.org/jira/browse/NIFI-11837
Project: Apache NiFi
Issue Type: Bug
Components: Core Framework
Reporter: Mark Payne
Assignee: Mark Payne
When a queue reaches the swap threshold (defined in nifi.properties as
{{nifi.queue.swap.threshold}} and defaulted to 20,000 FlowFiles), it enters
'swap mode'. However, it never exits swap mode.
This means that even if the queue is later completely emptied, any data that
subsequently enters the queue will be swapped out once the queue reaches 10K
FlowFiles.
Additionally, there is significant overhead under the covers in handling this.
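The behavior described above can be illustrated with a toy model. This is only
a sketch of the symptom as reported, not NiFi's actual queue implementation:
the class ToyQueue, the constant SWAP_MODE_LIMIT, and the
exit_swap_mode_when_empty flag are hypothetical names introduced here, and the
flag shows one assumed way the expected behavior could look, not the actual fix.

```python
SWAP_THRESHOLD = 20_000   # default of nifi.queue.swap.threshold, per this report
SWAP_MODE_LIMIT = 10_000  # size at which a queue already in swap mode swaps out, per this report


class ToyQueue:
    """Toy model of the reported symptom; NOT NiFi's real implementation."""

    def __init__(self, exit_swap_mode_when_empty=False):
        # Hypothetical switch: True models a queue that leaves swap mode once empty.
        self.exit_swap_mode_when_empty = exit_swap_mode_when_empty
        self.active = 0        # FlowFiles held in memory
        self.swapped = 0       # FlowFiles swapped out
        self.swap_mode = False

    def put(self):
        # Once in swap mode, the lower limit applies to every later FlowFile.
        limit = SWAP_MODE_LIMIT if self.swap_mode else SWAP_THRESHOLD
        if self.active >= limit:
            self.swap_mode = True
            self.swapped += 1
        else:
            self.active += 1

    def drain(self):
        # Empty the queue completely. In the reported behavior the swap_mode
        # flag is left untouched, which is the defect being modeled.
        self.active = 0
        self.swapped = 0
        if self.exit_swap_mode_when_empty:
            self.swap_mode = False


def burst_then_refill(queue, burst=20_001, refill=15_000):
    """Exceed the threshold once, empty the queue, then refill below the threshold."""
    for _ in range(burst):
        queue.put()
    queue.drain()
    for _ in range(refill):
        queue.put()
    return queue.swapped


print(burst_then_refill(ToyQueue()))                                # 5000
print(burst_then_refill(ToyQueue(exit_swap_mode_when_empty=True)))  # 0
```

In this model, a single burst past 20,000 is enough to make a later refill of
only 15,000 FlowFiles swap out 5,000 of them, even though the queue was fully
emptied in between.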
To replicate, create a simple flow:
GenerateFlowFile -> UpdateAttribute.
Set GenerateFlowFile to run with 6 threads, a Run Schedule of "0 secs", and a
Run Duration of "100 ms". Auto-terminate the 'success' relationship of
UpdateAttribute.
This will quickly fill the queue beyond 20K FlowFiles.
Now, stop GenerateFlowFile. Lower it to 4 threads and a Run Duration of "10 ms".
Start both processors. Watch the logs, which indicate that data is constantly
being swapped in and out.
This can have a very significant impact on performance. In my testing on my
laptop, once this flow started swapping, its 5-minute stats dropped from 14.5
MM FlowFiles per 5 minutes down to 11 MM FlowFiles (roughly a 24% decline).
In addition to lower throughput, it causes much higher resource utilization,
which affects all flows.
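The size of the drop can be worked out directly from the two 5-minute figures
quoted above. A small sketch of that arithmetic follows; the function and
variable names are illustrative only.

```python
def percent_decline(before, after):
    """Relative drop from `before` to `after`, as a percentage of `before`."""
    return (before - after) / before * 100


before_mm = 14.5  # million FlowFiles per 5 minutes before swapping started
after_mm = 11.0   # million FlowFiles per 5 minutes once swapping started

decline = percent_decline(before_mm, after_mm)

# Convert the 5-minute totals into per-second rates (5 minutes = 300 seconds).
per_second_before = before_mm * 1_000_000 / 300
per_second_after = after_mm * 1_000_000 / 300

print(f"{decline:.1f}% decline")  # 24.1% decline
print(f"{per_second_before:,.0f} -> {per_second_after:,.0f} FlowFiles/sec")
```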
This defect may affect anyone using a large number of small FlowFiles,
especially those where data may be bursty enough to exceed the 20,000 FlowFile
swap threshold, or flows that have a Backpressure Threshold set beyond 10,000.