[ 
https://issues.apache.org/jira/browse/NIFI-3225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15761716#comment-15761716
 ] 

Joseph Percivall commented on NIFI-3225:
----------------------------------------

Calling "get(X)" with a hardcoded number is how stateless processors used to do 
things[1] before the ability to set the "Run Duration"[2] was added. The run 
duration takes care of batching the checkpoint and commit together. Also, if 
there is setup that doesn't depend on the attributes or content and can be 
reused, shouldn't it already be done in an @OnScheduled method?
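
For anyone following along, here is a rough sketch of the older "get(X)" 
pattern. It does not use the real NiFi API: FakeSession, BATCH_SIZE, and the 
String "FlowFiles" are hypothetical stand-ins so the example runs on its own. 
It only illustrates how one commit ends up covering a whole batch.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

public class BatchGetSketch {

    /** Hypothetical stand-in for a process session; it only counts commits. */
    static class FakeSession {
        private final Queue<String> queue = new ArrayDeque<>();
        final List<String> transferred = new ArrayList<>();
        int commits = 0;

        FakeSession(List<String> input) {
            queue.addAll(input);
        }

        /** Returns up to max queued items, mimicking the idea of get(X). */
        List<String> get(int max) {
            List<String> out = new ArrayList<>();
            while (out.size() < max && !queue.isEmpty()) {
                out.add(queue.poll());
            }
            return out;
        }

        void transfer(String item) {
            transferred.add(item);
        }

        void commit() {
            commits++;
        }

        boolean isEmpty() {
            return queue.isEmpty();
        }
    }

    // Assumption: a hardcoded batch size, as in the older pattern.
    static final int BATCH_SIZE = 100;

    /** One onTrigger-like call: fetch a batch, process it, commit once. */
    static void onTrigger(FakeSession session) {
        List<String> batch = session.get(BATCH_SIZE);
        if (batch.isEmpty()) {
            return; // nothing queued, so nothing to commit
        }
        for (String item : batch) {
            session.transfer(item.toUpperCase()); // placeholder "processing"
        }
        session.commit(); // a single commit covers the whole batch
    }

    /** Triggers until the queue is empty and returns the commit count. */
    static int drain(FakeSession session) {
        while (!session.isEmpty()) {
            onTrigger(session);
        }
        return session.commits;
    }

    public static void main(String[] args) {
        List<String> input = new ArrayList<>();
        for (int i = 0; i < 250; i++) {
            input.add("f" + i);
        }
        FakeSession session = new FakeSession(input);
        int commits = drain(session);
        System.out.println(commits + " commits for "
                + session.transferred.size() + " items");
    }
}
```

With the run duration set on the processor, the framework does this batching 
of commits for you, so onTrigger() can stay written for a single FlowFile.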

Is there a specific use case you have come across recently that warrants 
this?


[1] 
https://github.com/apache/nifi/blob/0.x/nifi-nar-bundles/nifi-update-attribute-bundle/nifi-update-attribute-processor/src/main/java/org/apache/nifi/processors/attributes/UpdateAttribute.java#L338
[2] https://nifi.apache.org/docs/nifi-docs/html/user-guide.html#scheduling-tab

> Abstract Processor type that batches session.get() and session.commit() calls
> -----------------------------------------------------------------------------
>
>                 Key: NIFI-3225
>                 URL: https://issues.apache.org/jira/browse/NIFI-3225
>             Project: Apache NiFi
>          Issue Type: Improvement
>            Reporter: Bryan Rosander
>            Assignee: Bryan Rosander
>            Priority: Minor
>
> For processors that are stateless and support batching, it should be safe to 
> get and process multiple input FlowFiles for each onTrigger() call.  
> This should amortize the cost of session.get(), session.checkpoint(), 
> session.commit() as well as any setup in onTrigger() that isn't dependent on 
> the FlowFile(s) attributes or content.
> An AbstractBatchingProcessor type should reduce boilerplate code in candidate 
> processors and encourage uniform configurability via a property to control 
> batch size.



