[ https://issues.apache.org/jira/browse/NIFI-1008?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Mark Payne updated NIFI-1008:
-----------------------------
Fix Version/s: (was: 0.6.0)
> NiFi should swap out FlowFiles to disk even before the session is committed
> ---------------------------------------------------------------------------
>
> Key: NIFI-1008
> URL: https://issues.apache.org/jira/browse/NIFI-1008
> Project: Apache NiFi
> Issue Type: Improvement
> Components: Core Framework
> Reporter: Mark Payne
> Assignee: Mark Payne
>
> Currently, NiFi will swap out FlowFiles if there are a large number in a
> FlowFile Queue. This is done to avoid running out of JVM heap space. However,
> if we have a simple flow like GetFile -> SplitText and GetFile pulls in a
> large file, SplitText can quickly cause an OutOfMemoryError. This is not
> because it buffers the content of the FlowFile in memory but rather because
> it holds millions of FlowFile objects in memory. We can do better.
> When we call session.transfer for the FlowFiles, once we hit a magical
> threshold (say 10,000), we should swap those FlowFiles to disk and the
> session should transfer them to the queue as "swapped out" FlowFiles,
> rather than buffering all of them in memory and then swapping them out
> once they land in the queue.
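
The idea in the description could be sketched roughly as follows. This is
only an illustration under stated assumptions, not NiFi's implementation:
the class SwappingTransferBuffer, its methods, the use of plain string IDs
in place of FlowFile objects, and the temp-file "swap" format are all
hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

/**
 * Hypothetical stand-in for the buffer a session keeps while FlowFiles are
 * being transferred. Not part of the NiFi API.
 */
class SwappingTransferBuffer {
    private final int swapThreshold;
    // IDs stand in for FlowFile objects still held on the heap.
    private final List<String> inMemory = new ArrayList<>();
    // One file per batch that has been swapped out before commit.
    private final List<Path> swapFiles = new ArrayList<>();
    private long swappedCount = 0;

    SwappingTransferBuffer(int swapThreshold) {
        if (swapThreshold < 1) {
            throw new IllegalArgumentException("swapThreshold must be >= 1");
        }
        this.swapThreshold = swapThreshold;
    }

    /** Buffers one transferred item; swaps the batch out once the threshold is hit. */
    void transfer(String flowFileId) {
        inMemory.add(flowFileId);
        if (inMemory.size() >= swapThreshold) {
            swapOut();
        }
    }

    /** Writes the buffered batch to a temp file and releases it from memory. */
    private void swapOut() {
        try {
            Path swapFile = Files.createTempFile("swap-", ".txt");
            swapFile.toFile().deleteOnExit();
            Files.write(swapFile, inMemory, StandardCharsets.UTF_8);
            swapFiles.add(swapFile);
            swappedCount += inMemory.size();
            inMemory.clear();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    int inMemoryCount() { return inMemory.size(); }

    long swappedCount() { return swappedCount; }

    int swapFileCount() { return swapFiles.size(); }
}
```

With a threshold of 10,000, a split producing millions of items would keep
at most 9,999 of them on the heap at any moment in this sketch; the rest
would already sit in swap files that a queue could take over at commit time.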