[
https://issues.apache.org/jira/browse/NIFI-11837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
David Handermann updated NIFI-11837:
------------------------------------
Fix Version/s: 1.23.0
(was: 1.latest)
> When a queue starts swapping out data, it never stops
> -----------------------------------------------------
>
> Key: NIFI-11837
> URL: https://issues.apache.org/jira/browse/NIFI-11837
> Project: Apache NiFi
> Issue Type: Bug
> Components: Core Framework
> Reporter: Mark Payne
> Assignee: Mark Payne
> Priority: Major
> Fix For: 2.0.0, 1.23.0
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
> When a queue reaches the swap threshold (defined in nifi.properties as
> {{nifi.queue.swap.threshold}} and defaulted to 20,000 FlowFiles), it enters
> 'swap mode'. However, it never exits swap mode.
> This means that even after the queue has been completely emptied, any data
> that later enters the queue will be swapped out once the queue reaches 10K
> FlowFiles.
> Additionally, there is significant overhead under the covers in handling this.
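The behavior described above can be illustrated with a toy model. This is only a sketch under stated assumptions, not NiFi's actual queue implementation: the class `SwapModeSketch`, its fields, and the `exitWhenDrained` flag are all hypothetical, and the 10,000 batch size is simply taken from the "10K" figure in the description. It counts how many swap-outs occur when 12,000 FlowFiles arrive after the queue has been filled past the threshold and then fully drained.

```java
/** Toy model of a FlowFile queue with a swap threshold. Not NiFi's actual code. */
class SwapModeSketch {
    static final int SWAP_THRESHOLD = 20_000; // default nifi.queue.swap.threshold, per the report
    static final int SWAP_BATCH = 10_000;     // assumed batch size, from the "10K" in the report

    private final boolean exitWhenDrained; // false reproduces the reported behavior
    private int active;   // FlowFiles held in memory, ready to be polled
    private int pending;  // FlowFiles held in memory, waiting to be swapped out
    private int swapped;  // FlowFiles written to swap
    private boolean swapMode;
    int swapOuts;         // number of batches written to swap so far

    SwapModeSketch(boolean exitWhenDrained) {
        this.exitWhenDrained = exitWhenDrained;
    }

    void put() {
        if (swapMode) {
            // In swap mode, new data is staged and swapped out a batch at a time.
            pending++;
            if (pending == SWAP_BATCH) {
                swapped += pending;
                pending = 0;
                swapOuts++;
            }
        } else {
            active++;
            if (active >= SWAP_THRESHOLD) {
                swapMode = true;
            }
        }
    }

    boolean poll() {
        if (active == 0) {
            if (swapped > 0) {
                // Swap one batch back in.
                active = SWAP_BATCH;
                swapped -= SWAP_BATCH;
            } else {
                // Nothing on disk: promote staged data directly.
                active = pending;
                pending = 0;
            }
        }
        if (active == 0) {
            return false; // queue is empty
        }
        active--;
        // Hypothetical fix: leave swap mode once nothing is staged or swapped.
        if (exitWhenDrained && pending == 0 && swapped == 0) {
            swapMode = false;
        }
        return true;
    }

    /** Fill past the threshold, drain fully, then add 12,000 more and count new swap-outs. */
    static int swapOutsAfterDrain(boolean exitWhenDrained) {
        SwapModeSketch q = new SwapModeSketch(exitWhenDrained);
        for (int i = 0; i < 25_000; i++) q.put();
        while (q.poll()) { /* drain */ }
        int before = q.swapOuts;
        for (int i = 0; i < 12_000; i++) q.put();
        return q.swapOuts - before;
    }

    public static void main(String[] args) {
        System.out.println("sticky swap mode:  " + swapOutsAfterDrain(false) + " swap-out(s)");
        System.out.println("exit when drained: " + swapOutsAfterDrain(true) + " swap-out(s)");
    }
}
```

In this model the sticky flag causes one swap-out for the 12,000 late arrivals even though the queue never comes near 20,000 again, while the variant that clears the flag on drain causes none.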
> To replicate, create a simple flow:
> GenerateFlowFile -> UpdateAttribute.
> Set GenerateFlowFile to run with 6 threads, a Run Schedule of "0 secs" and a
> Run Duration of "100 ms". Auto-terminate the 'success' relationship of
> UpdateAttribute.
> This will quickly fill the queue beyond 20K FlowFiles.
> Now, stop GenerateFlowFile. Lower it to 4 threads and a Run Duration of
> "10 ms". Start both processors. Watch the logs, which indicate that data is
> constantly being swapped in and out.
> This can have a very significant impact on performance. In my testing on my
> laptop, once this flow started swapping, its 5-minute stats dropped from 14.5
> MM FlowFiles per 5 minutes down to 11 MM FlowFiles (roughly a 24% decline).
> In addition to lower throughput, it causes much higher resource utilization,
> which affects all flows.
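For reference, the size of the throughput drop can be worked out from the two figures quoted above. The snippet below is a minimal sketch; the class and method names are hypothetical and exist only for this calculation.

```java
import java.util.Locale;

/** Relative decline between two throughput figures (illustrative helper only). */
class ThroughputDecline {
    static double declinePercent(double before, double after) {
        return (before - after) / before * 100.0;
    }

    public static void main(String[] args) {
        // 14.5 MM and 11 MM FlowFiles per 5 minutes, as reported in the description.
        double pct = declinePercent(14.5, 11.0);
        System.out.println(String.format(Locale.ROOT, "%.1f%%", pct)); // prints 24.1%
    }
}
```

The drop from 14.5 MM to 11 MM FlowFiles per 5 minutes works out to about 24%.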
> This defect may affect anyone using a large number of small FlowFiles,
> especially those whose data may be bursty enough to exceed the 20,000-FlowFile
> swap threshold, or flows that have a Backpressure Threshold set beyond 10,000.