Hi,
Just thinking out loud: I wonder if you could take advantage of the
MergeContent processor.
With MergeContent you would set the "Correlation Attribute Name" property
(aka the "bin key") to an attribute that is common to all 5 of your files,
such as the batch identifier. That bin key groups all 5 of your batch files
into a single merged file. More importantly, you can set the Max Bin Age
to, say, 24 hours, which gives your users 1 day to provide all 5 files
that comprise your 'batch'. If they can't provide the files in 1 day,
the bin will still flush (because the Max Bin Age was exceeded), but you
could check the merged output flowfile with a RouteOnAttribute processor
to confirm that the 5-file criterion was met, and if not, route the
"failed batch" to an error/notification flow of some sort. (Or unpack it
back into individual files and re-ingest them into MergeContent for a 2nd
try. You would probably add an attribute named "retrycount" and test it so
that a batch does not re-ingest endlessly.)
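
To make the idea concrete, here is a rough Python sketch of the binning
logic described above. It is not NiFi code, only a simulation of the
decisions the flow would make. The five type names, the retry limit, and
the names Binner, is_complete and route are all made up for illustration.
It also checks that the 5 files are 5 distinct expected types, which is
the part Andrew said his current flow was missing.

```python
import time
from dataclasses import dataclass, field

# Hypothetical values: the real file types and limits come from your own flow.
EXPECTED_TYPES = frozenset({"type_a", "type_b", "type_c", "type_d", "type_e"})
MAX_BIN_AGE_SECONDS = 24 * 60 * 60  # mirrors a 24 hour max bin age
MAX_RETRIES = 3                     # cap for the "retrycount" attribute


@dataclass
class Bin:
    created_at: float
    files: list = field(default_factory=list)  # (file_type, filename) pairs


def is_complete(file_types):
    """True only when every expected type is present exactly once."""
    return (len(file_types) == len(EXPECTED_TYPES)
            and set(file_types) == EXPECTED_TYPES)


def route(file_types, retry_count):
    """Decide where a flushed bin goes: success, retry, or failure."""
    if is_complete(file_types):
        return "success"
    return "retry" if retry_count < MAX_RETRIES else "failure"


class Binner:
    """Groups files by batch id (the bin key) and flushes complete or aged bins."""

    def __init__(self, now=time.time):
        self.now = now   # injectable clock, so bin age can be simulated
        self.bins = {}   # batch_id -> Bin

    def add(self, batch_id, file_type, filename):
        if batch_id not in self.bins:
            self.bins[batch_id] = Bin(created_at=self.now())
        self.bins[batch_id].files.append((file_type, filename))

    def flush_ready(self, retry_count=0):
        """Return (batch_id, route, files) for each complete or expired bin."""
        flushed = []
        for batch_id in list(self.bins):
            current = self.bins[batch_id]
            types = [file_type for file_type, _ in current.files]
            expired = self.now() - current.created_at >= MAX_BIN_AGE_SECONDS
            if is_complete(types) or expired:
                del self.bins[batch_id]
                flushed.append((batch_id, route(types, retry_count), current.files))
        return flushed
```

In the sketch a bin that has all five types flushes right away as
"success", while a bin that ages out without them is routed to "retry"
until the retry count reaches the cap, and to "failure" after that.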
Carl


On Wed, Aug 16, 2017 at 9:01 AM, Andrew Loughran <[email protected]>
wrote:

> Hey everyone,
>
> This is my first post.
>
> I'm building out a pipeline with Nifi, but am stuck on an architectural
> decision around some fairly basic design.  I think I'm stuck as I'm
> operating on the wrong paradigm, but the application receiving my flow is
> the limitation in this context.
>
> I'm using ListS3 to poll for csv files.  There need to be 5 different
> types of file uploaded with a unique batch identifier for them to be
> released.  I'm using UpdateAttribute to rip the type and batch from the
> filename, then using wait to hold the batch.
>
> At the moment though, I'm holding until a batch has 5 files, rather than 5
> files with each attribute type matching the expected types.
>
> Is this the wrong way to be thinking about this problem, or does this
> sound like a good use case for Nifi - but using a better combination of
> processors.  If anyone could give me guidance or point me toward an example
> template for batch process I'd be grateful.
>
> Look forward to helping out in the community where I can.
>
> Thanks,
>
> Andy
>
>
