Hey Jens,

My recommendation is to take a look at the data provenance for MergeRecord (i.e., right-click on the Processor and go to Data Provenance). Click the little 'i' icon on the left for one of the JOIN events. There, it will show a "Details" field, which will tell you why it merged the data in the bin. Once you understand why it's merging the data with only 2 FlowFiles, you should be able to understand how to adjust your configuration to avoid doing that.
Thanks
-Mark

> On Aug 30, 2022, at 2:31 AM, Jens M. Kofoed <[email protected]> wrote:
>
> Hi all
>
> I'm running a 3-node cluster at version 1.16.2. I'm using the
> SiteToSiteStatusReportingTask to monitor and check for any backpressure or
> queues. I'm trying to merge all 3 reports into 1, but most of the time I
> get 2 FlowFiles after my MergeRecord.
>
> To be sure the nodes create the reports at the same time, the
> SiteToSiteStatusReportingTask is scheduled as CRON driven, every 5 minutes.
> The connection from the input port to the next processor has "Load
> Balance Strategy" set to Single node, to be sure all 3 reports are on the
> same node.
> In my MergeRecord, the "Correlation Attribute Name" is set to
> "reporting.task.uuid", which is the same for all 3 FlowFiles.
> "Minimum Number of Records" is set to 5000, which is much higher than the
> total number of records.
> "Minimum Bin Size" is set to 5 MB, which is also much higher than the total
> size.
> "Maximum Number of Bins" is at its default: 10
> "Max Bin Age" is set to 10 s.
>
> With these settings I was hoping that all 3 reports would be on the same
> node within a few seconds, and that MergeRecord would merge all 3
> FlowFiles into 1. But many times MergeRecord outputs 2 FlowFiles.
>
> Any ideas how to force them all into one FlowFile?
>
> Kind regards
> Jens M. Kofoed
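
For illustration only, here is a toy Python model of one way the settings quoted above could yield 2 output FlowFiles. It is not NiFi's actual binning code, and the names `Bin` and `simulate_merge` are hypothetical. It assumes that all reports share one correlation value, that the minimum record and size thresholds are never reached (as described above), and that a bin is therefore only flushed once it is older than "Max Bin Age". Under those assumptions, a report arriving more than 10 s after the first one lands in a new bin. Whether that is what actually happens here is something only the "Details" field of the JOIN event can confirm.

```python
from dataclasses import dataclass, field


@dataclass
class Bin:
    """A hypothetical bin: when it was opened and which reports it holds."""
    created_at: float
    flowfiles: list = field(default_factory=list)


def simulate_merge(arrivals, max_bin_age=10.0):
    """Toy model: group reports into bins that are flushed only by age.

    arrivals: list of (arrival_time_in_seconds, name) tuples that all share
    the same correlation value. Assumes the minimum thresholds are never met.
    Returns one list of names per merged output.
    """
    merged = []
    current = None
    for arrival_time, name in sorted(arrivals):
        # Flush the open bin if it has reached the maximum age.
        if current is not None and arrival_time - current.created_at >= max_bin_age:
            merged.append(current.flowfiles)
            current = None
        # Open a new bin when none is available.
        if current is None:
            current = Bin(created_at=arrival_time)
        current.flowfiles.append(name)
    # Whatever is still open is eventually flushed by age as well.
    if current is not None:
        merged.append(current.flowfiles)
    return merged


# All three reports arrive within the 10 s window: one merged output.
on_time = simulate_merge([(0.0, "node1"), (1.0, "node2"), (2.0, "node3")])

# The third report arrives 12 s after the first: two merged outputs.
late = simulate_merge([(0.0, "node1"), (1.0, "node2"), (12.0, "node3")])
```

In this model, `on_time` contains a single bundle with all three reports, while `late` contains two bundles, which would match the symptom of seeing 2 FlowFiles after the merge.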
