Mark Payne created NIFI-11837:
---------------------------------

             Summary: When a queue starts swapping out data, it never stops
                 Key: NIFI-11837
                 URL: https://issues.apache.org/jira/browse/NIFI-11837
             Project: Apache NiFi
          Issue Type: Bug
          Components: Core Framework
            Reporter: Mark Payne
            Assignee: Mark Payne


When a queue reaches the swap threshold (defined in nifi.properties as 
{{nifi.queue.swap.threshold}} and defaulted to 20,000 FlowFiles), it enters 
'swap mode'. However, it never exits swap mode.

This means that even after the queue has been completely emptied, any data 
that subsequently enters it will be swapped out once the queue reaches 10K 
FlowFiles. Additionally, handling this incurs significant overhead under the 
covers.
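
The behavior described above can be illustrated with a toy model. This is 
only a sketch and not NiFi's actual queue implementation: the class 
{{ToyQueue}}, its fields, and the {{exit_swap_mode_when_empty}} flag are 
hypothetical names made up for illustration. The two constants are the 
20,000 and 10K figures quoted in this report.

```python
# Toy model only: NOT NiFi's real queue code. All names are hypothetical.
SWAP_THRESHOLD = 20_000  # default of nifi.queue.swap.threshold, per this report
SWAP_BATCH = 10_000      # size at which queued data is swapped out, per this report


class ToyQueue:
    def __init__(self, exit_swap_mode_when_empty=False):
        self.active = 0        # FlowFiles held in memory, ready to be processed
        self.swap_pending = 0  # FlowFiles held in memory, waiting to be swapped out
        self.swapped_out = 0   # FlowFiles already written to swap files
        self.swap_mode = False
        self.swap_out_events = 0
        # False models the reported behavior; True models one possible remedy.
        self.exit_swap_mode_when_empty = exit_swap_mode_when_empty

    def put(self, count):
        for _ in range(count):
            # Enter swap mode once the in-memory queue is at the threshold.
            if not self.swap_mode and self.active >= SWAP_THRESHOLD:
                self.swap_mode = True
            if self.swap_mode:
                self.swap_pending += 1
                if self.swap_pending >= SWAP_BATCH:
                    self.swapped_out += self.swap_pending
                    self.swap_pending = 0
                    self.swap_out_events += 1
            else:
                self.active += 1

    def drain(self):
        # Consume everything in the queue.
        self.active = self.swap_pending = self.swapped_out = 0
        if self.exit_swap_mode_when_empty:
            self.swap_mode = False


def swap_outs_after_burst(exit_swap_mode_when_empty):
    queue = ToyQueue(exit_swap_mode_when_empty)
    queue.put(25_000)  # burst past the threshold, entering swap mode
    queue.drain()      # the queue is now completely empty
    queue.put(10_000)  # well below the 20,000 threshold
    return queue.swap_out_events


if __name__ == "__main__":
    print("never exits swap mode:", swap_outs_after_burst(False))
    print("exits when emptied:   ", swap_outs_after_burst(True))
```

In this model, a queue that stays in swap mode swaps out the second, smaller 
batch even though it never comes near the threshold again, while a queue that 
leaves swap mode on being emptied keeps that batch in memory.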

To replicate, create a simple flow:

  GenerateFlowFile -> UpdateAttribute.

Set GenerateFlowFile to run with 6 threads, a Run Schedule of "0 secs", and a 
Run Duration of "100 ms". Auto-terminate the 'success' relationship of 
UpdateAttribute.

This will quickly fill the queue beyond 20K FlowFiles.

Now, stop GenerateFlowFile. Lower it to 4 threads and a Run Duration of "10 ms".

Start both processors. Watch the logs, which indicate that data is constantly 
being swapped in and out.

This can have a very significant impact on performance. In my testing on my 
laptop, once this flow started swapping, its 5-minute stats dropped from 14.5 
MM FlowFiles per 5 minutes down to 11 MM FlowFiles (roughly a 24% decline).
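
The relative drop can be computed directly from the two 5-minute figures 
quoted above. A minimal sketch follows; the helper name {{relative_decline}} 
is made up for illustration, and the only inputs are the 14.5 MM and 11 MM 
counts from this report.

```python
def relative_decline(before, after):
    """Fractional drop from `before` to `after`."""
    return (before - after) / before


if __name__ == "__main__":
    before = 14.5e6  # FlowFiles per 5 minutes before swapping started
    after = 11.0e6   # FlowFiles per 5 minutes once swapping started
    window_secs = 5 * 60

    print(f"decline: {relative_decline(before, after):.1%}")  # decline: 24.1%
    print(f"before:  {before / window_secs:,.0f} FlowFiles/sec")
    print(f"after:   {after / window_secs:,.0f} FlowFiles/sec")
```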

In addition to lower throughput, it causes much higher resource utilization, 
which affects all flows.

This defect may affect anyone using a large number of small FlowFiles, 
especially flows where data may be bursty enough to exceed the 20,000 FlowFile 
swap threshold or flows that have a Backpressure Threshold set above 10,000.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
