I used the MergeContent processor to work out whether I had the five unique files. Thanks for your suggestion - I like it.

On Thu, 17 Aug 2017 at 15:35 Carl Berndt <[email protected]> wrote:

> Hi,
>
> Just thinking out loud, I wonder if you could take advantage of the
> MergeContent processor.
>
> Using the MergeContent processor, you would need to build a "correlation
> attribute name" - aka the "bin key". The bin key would be common to all 5
> of your files, and would group all 5 of your batch files into a single
> file. But, more importantly, you can set the MergeContent max bin age to,
> say, 24 hours, which would give your users 1 day to provide all 5 files
> that comprise your 'batch'.
>
> If they can't provide the files in 1 day, the bin would still flush
> (because the max bin age was exceeded), but you could check the merged
> output flowfile with a RouteOnAttribute processor to confirm that the
> 5 file criteria was met and, if not, route the "failed batch" to an
> error/notification flow of some sort. (Or unpack it back into individual
> files and re-ingest them into the MergeContent processor for a 2nd try.
> You would probably add an attribute named "retrycount", and test that so
> that a batch does not re-ingest endlessly.)
>
> Carl
>
> On Wed, Aug 16, 2017 at 9:01 AM, Andrew Loughran <[email protected]>
> wrote:
>
>> Hey everyone,
>>
>> This is my first post.
>>
>> I'm building out a pipeline with NiFi, but am stuck on an architectural
>> decision around some fairly basic design. I think I'm stuck because I'm
>> operating on the wrong paradigm, but the application receiving my flow
>> is the limitation in this context.
>>
>> I'm using ListS3 to poll for CSV files. There need to be 5 different
>> types of file uploaded with a unique batch identifier for them to be
>> released. I'm using UpdateAttribute to rip the type and batch from the
>> filename, then using Wait to hold the batch.
>>
>> At the moment though, I'm holding until a batch has 5 files, rather
>> than 5 files with each attribute type matching the expected types.
>>
>> Is this the wrong way to be thinking about this problem, or does this
>> sound like a good use case for NiFi, but using a better combination of
>> processors? If anyone could give me guidance or point me toward an
>> example template for batch processing I'd be grateful.
>>
>> Look forward to helping out in the community where I can.
>>
>> Thanks,
>>
>> Andy
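
For anyone finding this thread later, the check discussed above (5 files per batch, one of each expected type, rather than just any 5 files) can be sketched outside NiFi in a few lines of Python. This is only an illustration of the logic, not a NiFi script: the type names, the `<batch>_<type>.csv` filename layout, and the function names are all assumptions made up for the example, since the thread does not give them.

```python
from collections import defaultdict

# Hypothetical type names; the thread only says there are five distinct types.
EXPECTED_TYPES = frozenset({"orders", "customers", "items", "payments", "refunds"})


def parse_filename(filename):
    """Split an assumed '<batch>_<type>.csv' filename into (batch, type)."""
    stem = filename.rsplit(".", 1)[0]
    batch, file_type = stem.split("_", 1)
    return batch, file_type


def group_by_batch(filenames):
    """Group file types by batch id, much as a bin key groups flowfiles."""
    bins = defaultdict(list)
    for name in filenames:
        batch, file_type = parse_filename(name)
        bins[batch].append(file_type)
    return dict(bins)


def is_complete(file_types, expected=EXPECTED_TYPES):
    """A batch is complete only if every expected type appears exactly once.

    Counting alone is not enough: five files where two share a type must fail.
    """
    return len(file_types) == len(expected) and set(file_types) == expected


if __name__ == "__main__":
    files = [
        # Batch b1: one file of each expected type.
        "b1_orders.csv", "b1_customers.csv", "b1_items.csv",
        "b1_payments.csv", "b1_refunds.csv",
        # Batch b2: five files, but 'orders' is duplicated and 'refunds' is missing.
        "b2_orders.csv", "b2_orders.csv", "b2_customers.csv",
        "b2_items.csv", "b2_payments.csv",
    ]
    for batch, types in group_by_batch(files).items():
        status = "release" if is_complete(types) else "hold or route to error flow"
        print(batch, status)
```

In the flow itself, the grouping would be handled by MergeContent's correlation attribute and the pass/fail decision by a routing step after the merge; the sketch only shows why the test has to compare the set of types and not just the file count.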
