Huagen,

Sorry, I am a little confused. My understanding is that you want to combine n individual logs (each in its own flowfile) from a specific hour into a single file. What confuses me is your statement, “Even with that [a 5x confirmation loop], I occasionally still get more than one merged flowfile.” Do you mean that content you expected to be combined into a single flowfile is instead output as two distinct, incomplete flowfiles?

Without seeing a template of your workflow, I can make a couple of suggestions. First, as mentioned last night by James Wing, I would encourage you to look at the MergeContent [1] processor properties to set a high threshold for merging flowfiles. If you know the number of log files per hour a priori, you can set that as the “Minimum Number of Entries” so that no output is produced until that many flowfiles have accumulated.

Also, given that you have described a “loop”, I imagine you may have multiple connections feeding into MergeContent. MergeContent can behave unexpectedly with multiple incoming connections, so I would recommend adding a Funnel to aggregate all of them and provide a single incoming connection to MergeContent.

Please let us know if this helps, and if not, please share a template and some sample input if possible. Thanks.

[1] https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.processors.standard.MergeContent/index.html

Andy LoPresto
[email protected]
[email protected]
PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4 BACE 3C6E F65B 2F7D EF69

> On Jun 1, 2016, at 11:52 AM, Huagen peng <[email protected]> wrote:
>
> Hi,
>
> In the data flow I am dealing with now, there are multiple (up to 200) logs
> associated with a given hour. I need to process these fragment hourly logs
> and then concatenate them into a single file. The approach I am using now
> has an UpdateAttribute processor to set an arbitrary
> segment.original.filename attribute on all the flowfiles I want to merge.
> Then I use a MergeContent processor, with an UpdateAttribute and
> RouteOnAttribute processor to form a loop to confirm five times that the
> merge is complete. Even with that, I occasionally still get more than one
> merged flowfile.
>
> Is there a better way to do this? Or should I increase the loop count, say
> 10?
>
> Thanks.
>
> Huagen
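
P.S. For reference, the MergeContent settings suggested above might look roughly like the sketch below. This is a list of processor properties as they appear in the configuration dialog, not a file you can import. The values are assumptions for illustration only: 200 is taken from the "up to 200" logs per hour in your message, `log.hour` is a hypothetical attribute name that you would set yourself upstream (for example in UpdateAttribute), and the Max Bin Age value is an arbitrary guess. I have not tested this against your flow.

```
# MergeContent properties (illustrative values, not a tested configuration)
Merge Strategy              : Bin-Packing Algorithm
Correlation Attribute Name  : log.hour    # hypothetical attribute, one value per hour
Minimum Number of Entries   : 200         # assumed count of log files per hour
Maximum Number of Entries   : 200
Max Bin Age                 : 10 min      # arbitrary; flushes a bin that never fills
```

Because you said "up to" 200, some hours may never reach the minimum. Max Bin Age is what eventually releases such a bin, so choose it to be longer than the time the slowest fragment needs to arrive.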
