[ https://issues.apache.org/jira/browse/NIFI-8603?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eric Secules updated NIFI-8603:
-------------------------------
    Description: 
*Use Case:*
A processor splits one FlowFile into thousands or millions of FlowFiles. 
Ideally, all of those FlowFiles stay in the process session so that they can 
be rolled back if an error occurs. However, the process session holds the 
FlowFiles in Maps on the heap until the session commits.

One workaround is to commit the session early while processing splits and 
require the user to use wait/notify to hold the split portions until the whole 
input file is successfully split. This takes advantage of the fact that NiFi 
queues are swapped to disk after they reach a certain size.

Another workaround is to use an off-heap collection in the processor code to 
store the FlowFiles that are to be transferred, and then to transfer them to 
the session and commit in a loop at the point where the processor would 
otherwise have committed once.
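
The second workaround can be sketched roughly as follows. This is only an 
illustration under stated assumptions: SpillingBuffer is a hypothetical helper, 
not part of the NiFi API, and it stores plain single-line strings as stand-ins 
for whatever reference the processor needs to each split. It keeps a bounded 
number of records on the heap, writes the rest to a temporary file, and hands 
them back in batches.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical helper (not a NiFi class): bounded heap buffer that spills to a temp file. */
final class SpillingBuffer implements AutoCloseable {
    private final int maxInMemory;
    private final List<String> inMemory = new ArrayList<>();
    private Path spillFile;             // created lazily on the first spill
    private BufferedWriter spillWriter; // open only while records are being spilled
    private long spilledCount;

    SpillingBuffer(int maxInMemory) {
        if (maxInMemory < 1) throw new IllegalArgumentException("maxInMemory must be >= 1");
        this.maxInMemory = maxInMemory;
    }

    /** Keeps the first maxInMemory records on the heap and appends the rest to disk. */
    void add(String record) {
        if (record.indexOf('\n') >= 0 || record.indexOf('\r') >= 0) {
            throw new IllegalArgumentException("records must be single-line");
        }
        if (inMemory.size() < maxInMemory) {
            inMemory.add(record);
            return;
        }
        try {
            if (spillWriter == null) {
                spillFile = Files.createTempFile("session-spill", ".txt");
                spillWriter = Files.newBufferedWriter(spillFile, StandardCharsets.UTF_8);
            }
            spillWriter.write(record);
            spillWriter.write('\n');
            spilledCount++;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    long size() {
        return inMemory.size() + spilledCount;
    }

    /** Passes all records to the handler in insertion order, batchSize at a time, then empties the buffer. */
    void drainInBatches(int batchSize, Consumer<List<String>> handler) {
        if (batchSize < 1) throw new IllegalArgumentException("batchSize must be >= 1");
        List<String> batch = new ArrayList<>();
        for (String record : inMemory) {
            batch = emit(batch, record, batchSize, handler);
        }
        if (spillWriter != null) {
            try {
                spillWriter.close(); // flush before reading the file back
                try (BufferedReader reader = Files.newBufferedReader(spillFile, StandardCharsets.UTF_8)) {
                    String line;
                    while ((line = reader.readLine()) != null) {
                        batch = emit(batch, line, batchSize, handler);
                    }
                }
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
        if (!batch.isEmpty()) handler.accept(batch);
        close();
    }

    private static List<String> emit(List<String> batch, String record, int batchSize,
                                     Consumer<List<String>> handler) {
        batch.add(record);
        if (batch.size() < batchSize) return batch;
        handler.accept(batch);
        return new ArrayList<>();
    }

    /** Discards all buffered records and deletes the temp file, if one was created. */
    @Override
    public void close() {
        try {
            if (spillWriter != null) spillWriter.close();
            if (spillFile != null) Files.deleteIfExists(spillFile);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            inMemory.clear();
            spillWriter = null;
            spillFile = null;
            spilledCount = 0;
        }
    }
}
```

In a processor, the batch handler would be the place to transfer each batch to 
the session and commit. Note that committing per batch gives up the ability to 
roll back the whole split, which is the limitation this issue asks the 
framework to remove.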

It would be great if the process session behaved like the NiFi queues between 
processors, so that it can hold a large number of FlowFiles in the session 
without consuming the whole heap.

 

*Acceptance Criteria:*
 * NiFi StandardProcessSession can handle arbitrarily large transactions by 
holding FlowFiles off-heap or partially off-heap

  was:
*Use Case:*
A processor is splitting  one flowfile into thousands or millions of flowfiles. 
It's preferred to keep all those flowfiles in the process session in case there 
is some error and we need to roll back. However, the process session stores the 
FlowFiles in Maps on the heap while the session is waiting to commit.

One workaround is to commit the session early while processing splits and 
require the user to use wait/notify to hold the split portions until the whole 
input file is successfully split. This takes advantage of the fact that NiFi 
queues are swapped to disk after they reach a certain size.

It would be great if the process session had a similar behaviour to NiFi queues 
between processors so that it is able to hold a large amount of flowfiles in 
the session without consuming the whole heap.

 

*Acceptance Criteria:*


 * NiFi StandardProcessSession can handle arbitrarily large transactions by 
holding flowfiles off-heap or partially-off heap


> Use non-heap collections in StandardProcessSession
> --------------------------------------------------
>
>                 Key: NIFI-8603
>                 URL: https://issues.apache.org/jira/browse/NIFI-8603
>             Project: Apache NiFi
>          Issue Type: New Feature
>          Components: Core Framework
>    Affects Versions: 1.13.2
>            Reporter: Eric Secules
>            Priority: Minor
>
> *Use Case:*
> A processor splits one FlowFile into thousands or millions of FlowFiles. 
> Ideally, all of those FlowFiles stay in the process session so that they can 
> be rolled back if an error occurs. However, the process session holds the 
> FlowFiles in Maps on the heap until the session commits.
> One workaround is to commit the session early while processing splits and 
> require the user to use wait/notify to hold the split portions until the 
> whole input file is successfully split. This takes advantage of the fact that 
> NiFi queues are swapped to disk after they reach a certain size.
> Another workaround is to use an off-heap collection in the processor code to 
> store the FlowFiles that are to be transferred, and then to transfer them to 
> the session and commit in a loop at the point where the processor would 
> otherwise have committed once.
> It would be great if the process session behaved like the NiFi queues between 
> processors, so that it can hold a large number of FlowFiles in the session 
> without consuming the whole heap.
>  
> *Acceptance Criteria:*
>  * NiFi StandardProcessSession can handle arbitrarily large transactions by 
> holding FlowFiles off-heap or partially off-heap



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
