[ 
https://issues.apache.org/jira/browse/NIFI-4838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16518385#comment-16518385
 ] 

ASF GitHub Bot commented on NIFI-4838:
--------------------------------------

Github user mattyb149 commented on the issue:

    https://github.com/apache/nifi/pull/2448
  
    Following an awesome suggestion by @markap14, you could extend 
AbstractSessionFactoryProcessor (although this has implications since you're 
already sharing a base class) and use two sessions: one to get the incoming 
flow file and one to create the child flow files. Call session1.get() and hold 
on to the FlowFile, then call session2.create(flowFile) and session2.commit() 
as many times as you want. At the end of processing, call 
session1.transfer(flowFile, REL_ORIGINAL) and session1.commit().
    
    That's a great solution IMO because it retains the "original" use case and 
behavior while still allowing incremental commits. We should consider this 
pattern for any source processor that accepts incoming flow files and also 
wants to offer incremental commits.
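
    A rough sketch of the ordering this gives, for illustration only. 
SketchSession is a hypothetical stand-in that just buffers transfers until 
commit(); it is not the NiFi ProcessSession API, and the batch size, the 
"success"/"original" labels and the string flow files are assumptions. In a 
real processor the two sessions would presumably come from the 
ProcessSessionFactory handed to onTrigger.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class TwoSessionSketch {
    // Returns the order in which flow files became visible downstream.
    static List<String> run(String incoming, List<String> results, int batchSize) {
        List<String> downstream = new ArrayList<>();
        SketchSession session1 = new SketchSession(downstream); // owns the incoming flow file
        SketchSession session2 = new SketchSession(downstream); // owns the child flow files

        int inBatch = 0;
        for (String result : results) {
            // Stand-in for creating a child of the incoming flow file and routing it.
            session2.transfer(incoming + "/" + result, "success");
            if (++inBatch == batchSize) {
                session2.commit(); // incremental commit: this batch is released now
                inBatch = 0;
            }
        }
        if (inBatch > 0) {
            session2.commit(); // release the final partial batch
        }

        // Only at the very end is the original routed and its session committed.
        session1.transfer(incoming, "original");
        session1.commit();
        return downstream;
    }

    public static void main(String[] args) {
        System.out.println(run("query-1", Arrays.asList("a", "b", "c"), 2));
    }
}

// Hypothetical stand-in for a session: transfers stay pending until commit().
class SketchSession {
    private final List<String> pending = new ArrayList<>();
    private final List<String> downstream;

    SketchSession(List<String> downstream) {
        this.downstream = downstream;
    }

    void transfer(String flowFile, String relationship) {
        pending.add(relationship + ":" + flowFile);
    }

    void commit() {
        downstream.addAll(pending);
        pending.clear();
    }
}
```

    The point of the sketch is only the ordering: child batches show up 
downstream as each session2.commit() happens, while the original stays with 
session1 until the end.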


> Make GetMongo support multiple commits and give some progress indication
> ------------------------------------------------------------------------
>
>                 Key: NIFI-4838
>                 URL: https://issues.apache.org/jira/browse/NIFI-4838
>             Project: Apache NiFi
>          Issue Type: Improvement
>            Reporter: Mike Thomsen
>            Assignee: Mike Thomsen
>            Priority: Major
>
> GetMongo shouldn't wait until the end to call commit(), because to a user 
> pulling a very large data set the processor then looks like it has hung.
> It should also have an option for running a count query to get the current 
> approximate count of documents that would match the query and append an 
> attribute that indicates where a flowfile stands in the total result count. 
> Ex:
> query.progress.point.start = 2500
> query.progress.point.end = 5000
> query.count.estimate = 17,568,231
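
As a rough illustration of the attributes proposed above, the sketch below 
builds them for one batch. The attribute names come from the issue text; the 
ProgressAttributes helper, its parameters, and the choice to write the 
estimate as plain digits (no thousands separators) are assumptions, not part 
of GetMongo.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class ProgressAttributes {
    // start: number of documents emitted before this batch
    // docsInBatch: number of documents in this batch
    // estimate: approximate total from a count query (may drift from the real total)
    static Map<String, String> forBatch(long start, int docsInBatch, long estimate) {
        Map<String, String> attrs = new LinkedHashMap<>();
        attrs.put("query.progress.point.start", Long.toString(start));
        attrs.put("query.progress.point.end", Long.toString(start + docsInBatch));
        attrs.put("query.count.estimate", Long.toString(estimate));
        return attrs;
    }

    public static void main(String[] args) {
        // Mirrors the example values in the issue description.
        System.out.println(forBatch(2500L, 2500, 17568231L));
    }
}
```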



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
