Hey Jens,

My recommendation is to take a look at the data provenance for MergeRecord 
(i.e., right-click on the Processor and go to Data Provenance.) Click the 
little ‘i’ icon on the left for one of the JOIN events.
There, it will show a “Details” field, which will tell you why it merged the 
data in the bin.
Once you understand why it’s merging the data with only 2 FlowFiles, you should 
be able to understand how to adjust your configuration to avoid doing that.
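
As a rough illustration of one possible reason (an assumption, not something 
confirmed for this flow): if the third report reaches the processor after the 
bin has already hit its "Max Bin Age", the first two get merged on their own. 
The sketch below is a toy model in Python, not NiFi's actual binning code; 
simulate_bins and its parameters are hypothetical names, and it ignores the 
minimum-size thresholds because the settings described in the original message 
are never reached. The "Details" field remains the authoritative answer.

```python
def simulate_bins(arrival_times, max_bin_age=10.0):
    """Toy model of age-based bin eviction (NOT NiFi's implementation).

    arrival_times: seconds at which FlowFiles with the same correlation
    value arrive. A bin is closed once the next arrival finds it at or
    past max_bin_age seconds old; min record/size thresholds are assumed
    to be unreachable, as in the configuration described in the thread.
    """
    merged_bins = []
    current_bin = []
    bin_created_at = None
    for t in sorted(arrival_times):
        # Close the open bin if it has aged out before this arrival.
        if current_bin and t - bin_created_at >= max_bin_age:
            merged_bins.append(current_bin)
            current_bin = []
        # Start a new bin when none is open.
        if not current_bin:
            bin_created_at = t
        current_bin.append(t)
    # Whatever is left is eventually merged by age as well.
    if current_bin:
        merged_bins.append(current_bin)
    return merged_bins


if __name__ == "__main__":
    # All three reports arrive within the bin age: one merged output.
    print(simulate_bins([0.0, 2.0, 3.0]))
    # Third report arrives late: two outputs (a bin of 2, then a bin of 1).
    print(simulate_bins([0.0, 2.0, 12.0]))
```

If the provenance details do point at bin age, a longer "Max Bin Age" would be 
the obvious thing to try; if they point at something else, the model above 
does not apply.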

Thanks
-Mark


> On Aug 30, 2022, at 2:31 AM, Jens M. Kofoed <[email protected]> 
> wrote:
> 
> Hi all
> 
> I'm running a 3 node cluster at version 1.16.2. I'm using the 
> SiteToSiteStatusReportingTask to monitor and check for any backpressures or 
> queues. I'm trying to merge all 3 reports into 1, but most of the time I 
> get 2 flowfiles after my MergeRecord.
> 
> To be sure the nodes are creating the reports at the same time the 
> SiteToSiteStatusReportingTask is set to schedule via CRON driver every 5 mins.
> The connection from the input port to the next process is set with "Load 
> Balance Strategy" to Single node, to be sure all 3 reports are at the same 
> node.
> In my MergeRecord the "Correlation Attribute Name" is set to 
> "reporting.task.uuid" which is the same for all 3 flowfiles.
> "Minimum Number of Records" is set to 5000, which is much higher than the 
> total number of records.
> "Minimum Bin Size" is set to 5 MB, which is also much higher than the total 
> size. "Maximum Number of Bins" is at the default: 10
> "Max Bin Age" is set to 10 s.
> 
> With these settings I was hoping that all 3 reports would be at the same 
> node within a few seconds, and that MergeRecord would merge all 3 
> flowfiles into 1. But many times MergeRecord outputs 2 flowfiles.
> 
> Any ideas how to force all of them into one flowfile?
> 
> Kind regards
> Jens M. Kofoed
