Yeah, I indeed have no clue as to when all the flowfiles have landed. Somehow
I need to detect when that attribute changes, and act upon that event.

Currently looking at the FlowfileAggregationProcessor.
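
For illustration, the tumbling-window behaviour asked about in this thread
can be sketched in plain Python. This is only a sketch of the batching logic,
not NiFi code: the `FlowFile` alias, the `tumbling_batches` function, and the
`batch` attribute name are hypothetical, and flowfiles are assumed to arrive
in order.

```python
from typing import Dict, Iterable, Iterator, List

# Hypothetical stand-in for a flowfile: just its attributes (name -> value).
FlowFile = Dict[str, str]


def tumbling_batches(flowfiles: Iterable[FlowFile], key: str) -> Iterator[List[FlowFile]]:
    """Yield consecutive flowfiles that share the same value for `key`.

    A batch is emitted as soon as a flowfile with a different value arrives,
    and a new batch is then started with that flowfile.
    """
    batch: List[FlowFile] = []
    current = None
    for ff in flowfiles:
        value = ff.get(key)
        if batch and value != current:
            # Attribute changed: hand the finished batch downstream and reset.
            yield batch
            batch = []
        current = value
        batch.append(ff)
    if batch:
        # Only reached when the input ends; a live flow has no such end.
        yield batch


if __name__ == "__main__":
    incoming = [{"batch": "a"}, {"batch": "a"}, {"batch": "b"}, {"batch": "a"}]
    for group in tumbling_batches(incoming, "batch"):
        print(group)
```

Note that the last batch is only released when the input ends. In a
continuously running flow there is no such end, so some timeout (along the
lines of the max bin age mentioned below) would still be needed to flush it.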

On Mon, Dec 19, 2016 at 6:29 PM, Lee Laim <[email protected]> wrote:

> Raf,
>
> You might be able to use PutFile and 'merge' your flowfiles in a temporary
> batch directory.  Once you are confident that all the flow files have
> landed, you can pull the contents of the directory.
>
> In other words, when a new directory shows up, pull the contents of the
> older directory back into the NiFi flow, then delete the old directory.
> This method provides some merging versatility, but will interrupt
> provenance as the flowfiles will be given a new UUID when brought back into
> the flow.
>
> Lee
>
>
>
>
> On Mon, Dec 19, 2016 at 9:41 AM, Jeff <[email protected]> wrote:
>
>> Hello Raf,
>>
>> MergeContent can merge based on a correlation ID (attribute).  However,
>> the merging currently operates in two modes: Defragment or Bin-Packing
>> Algorithm.  Defragment reassembles flowfiles that share a correlation ID
>> once a known number of fragments has arrived.  Bin-Packing Algorithm
>> completes a bin based on its min or max age, and/or after a certain
>> number of flowfiles have been received.
>>
>> Based on your question, I'm assuming you will not know how many flowfiles
>> you'd be merging per attribute, so I'm not sure that MergeContent will work
>> for your use case.  Depending on how quickly you want those files merged
>> and sent downstream, a max bin age might work for you, though.  There is a
>> JIRA for implementing a more general-case aggregation processor [1].
>>
>> With some more details around your scenario we might be able to figure
>> out how to get it to work for you with the standard processors.
>>
>> [1] https://issues.apache.org/jira/browse/NIFI-1926
>>
>> On Mon, Dec 19, 2016 at 10:23 AM Raf Huys <[email protected]> wrote:
>>
>>> I want to batch incoming flowfiles based on an attribute. As soon as
>>> this attribute's value changes, the current batch should be transferred
>>> downstream and be reset. So basically I'm looking for a tumbling window.
>>>
>>> Can this be done with the MergeContent processor (which strategy?) or
>>> should I write my own processor?
>>>
>>>
>>> --
>>> tx
>>>
>>>
>


-- 
Mvg,

Raf Huys
