Yeah, I indeed have no clue as to when all flowfiles have landed. Somehow I need to figure out when that attribute changed, and act upon that event.
Currently looking at the FlowfileAggregationProcessor.

On Mon, Dec 19, 2016 at 6:29 PM, Lee Laim <[email protected]> wrote:

> Raf,
>
> You might be able to use PutFile and 'merge' your flowfiles in a temporary
> batch directory. Once you are confident that all the flow files have
> landed, you can pull the contents of the directory.
>
> In other words, when a new directory shows up, pull the contents of the
> older directory back into the NiFi flow, then delete the old directory.
> This method provides some merging versatility, but will interrupt
> provenance as the flowfiles will be given a new UUID when brought back into
> the flow.
>
> Lee
>
> On Mon, Dec 19, 2016 at 9:41 AM, Jeff <[email protected]> wrote:
>
>> Hello Raf,
>>
>> MergeContent can merge based on a correlation ID (attribute). However,
>> the merging currently operates in two modes: Defragment or Bin-Packing
>> Algorithm. Defragment is completed by defragmenting based on the
>> correlation ID and a known number of fragments. Bin-Packing Algorithm is
>> completed based on a min or max age of a "bin", and/or after a certain
>> number of flowfiles have been received.
>>
>> Based on your question, I'm assuming you will not know how many flowfiles
>> you'd be merging per attribute, so I'm not sure that MergeContent will work
>> for your use case. Depending on how quickly you want those files merged
>> and sent downstream, a max bin age might work for you, though. There is a
>> JIRA for implementing a more general-case aggregation processor [1].
>>
>> With some more details around your scenario we might be able to figure
>> out how to get it to work for you with the standard processors.
>>
>> [1] https://issues.apache.org/jira/browse/NIFI-1926
>>
>> On Mon, Dec 19, 2016 at 10:23 AM Raf Huys <[email protected]> wrote:
>>
>>> I want to batch incoming flowfiles based on an attribute. As soon as
>>> this attribute's value changes, the current batch should be transferred
>>> downstream and be reset. So basically I'm looking for a tumbling window.
>>>
>>> Can this be done with the MergeContent processor (which strategy?) or
>>> should I write my own processor?
>>>
>>> --
>>> tx

--
Kind regards,
Raf Huys
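
For illustration, the tumbling-window behaviour asked about in the original question can be sketched outside NiFi in plain Python. This is only a sketch under stated assumptions, not a NiFi processor: flowfiles are modelled as plain dicts of attributes, and the names `tumbling_batches` and `attribute` are hypothetical. A batch is closed as soon as the attribute's value differs from the previous flowfile's value.

```python
from typing import Dict, Iterable, Iterator, List

# Hypothetical stand-in for a flowfile: just its attribute map.
FlowFile = Dict[str, str]


def tumbling_batches(
    flowfiles: Iterable[FlowFile], attribute: str
) -> Iterator[List[FlowFile]]:
    """Yield consecutive batches, closing a batch when `attribute` changes."""
    batch: List[FlowFile] = []
    current = None
    for flowfile in flowfiles:
        value = flowfile.get(attribute)
        # A change in the attribute value ends the current window.
        if batch and value != current:
            yield batch
            batch = []
        current = value
        batch.append(flowfile)
    # The final batch is only emitted once the input ends.
    if batch:
        yield batch


if __name__ == "__main__":
    incoming = [
        {"batch.key": "a", "id": "1"},
        {"batch.key": "a", "id": "2"},
        {"batch.key": "b", "id": "3"},
    ]
    for group in tumbling_batches(incoming, "batch.key"):
        print([f["id"] for f in group])
```

Note that the last batch is released only when the input is exhausted. In a continuous flow there is no such end, so some timeout would still be needed to flush it, which is the role the max bin age mentioned above would play.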
