Hi Koji, 

My test is as follows.
ProcessorA is scheduled only on the primary node, with a single concurrent 
task. 
The output of ProcessorA is load balanced to ProcessorB. The load balance 
strategy is partition by attribute. All the output FlowFiles of ProcessorA have 
the same value for the attribute used for balancing, so all FlowFiles are sent 
to the same node. 
The order in which ProcessorB receives the FlowFiles will probably not be the 
same as the order in which ProcessorA emitted them, and the order is 
nondeterministic. 
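
To make the setup concrete, here is a minimal sketch in Python (not NiFi code) 
of why every FlowFile ends up on the same node when the balancing attribute 
always has the same value. The function pick_node, the attribute name 
"balance.key" and the CRC32-modulo scheme are assumptions made up for this 
illustration; NiFi's real partitioning logic may differ. The sketch only 
covers node selection and says nothing about the order of arrival.

```python
import zlib


def pick_node(attribute_value: str, node_count: int) -> int:
    """Map an attribute value to a node index.

    Hypothetical scheme for illustration only: a stable hash of the
    attribute value, modulo the number of nodes.
    """
    return zlib.crc32(attribute_value.encode("utf-8")) % node_count


# ProcessorA emits FlowFiles in sequence; all of them carry the same value
# for the (hypothetical) balancing attribute "balance.key".
flowfiles = [{"seq": i, "balance.key": "same-value"} for i in range(5)]

# Every FlowFile maps to the same node index, because the hash input is
# identical for all of them.
targets = {pick_node(ff["balance.key"], node_count=3) for ff in flowfiles}
```

Here targets contains exactly one node index, which matches what I see: all 
FlowFiles land on one node, but that alone does not tell us in which order 
they arrive there.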

Thanks,
Lei



[email protected]
 
From: Koji Kawamura
Date: 2019-10-20 18:02
To: users
Subject: Re: Re: MergeRecord can not guarantee the ordering of the input 
sequence?
Hi Lei,
 
Does 'balance strategy' mean load balance strategy? Which strategy
are you using? I thought Prioritizers are applied on the destination
node after load balancing has transferred FlowFiles. Are those A, B
and C flow files generated on different nodes and sent to a single
node to merge them?
 
Thanks,
Koji
 
On Fri, Oct 18, 2019 at 7:12 PM [email protected]
<[email protected]> wrote:
>
>
> It seems to be because of the balance strategy that is used.
> The balancing does not guarantee the order.
>
> Thanks,
> Lei
>
> ________________________________
> [email protected]
>
>
> From: [email protected]
> Date: 2019-10-16 10:21
> To: dev; users
> CC: dev
> Subject: Re: Re: MergeRecord can not guarantee the ordering of the input 
> sequence?
> Hi Koji,
> Actually, I have set all connections to FIFO and concurrent tasks to 1 for 
> all processors.
> Before and after the MergeRecord, I added a LogAttribute to debug.
>
> Before MergeRecord, the order in the log file is A, B, C in three FlowFiles.
> After MergeRecord, the order becomes {A, C, B} in one FlowFile.
> This is nondeterministic.
>
> I think I should look into the MergeRecord code and debug further.
>
> Thanks,
> Lei
>
>
>
>
> [email protected]
> From: Koji Kawamura
> Date: 2019-10-16 09:46
> To: users
> CC: dev
> Subject: Re: MergeRecord can not guarantee the ordering of the input sequence?
> Hi Lei,
> How about setting the FIFO prioritizer on all the preceding connections
> before the MergeRecord?
> Without any prioritizer set, FlowFile ordering is nondeterministic.
> Thanks,
> Koji
> On Tue, Oct 15, 2019 at 8:56 PM [email protected]
> <[email protected]> wrote:
> >
> >
> > If FlowFiles A, B, C enter the MergeRecord sequentially, the output should 
> > be one FlowFile {A, B, C}.
> > However, when testing with a large data volume, sometimes the output order 
> > is not the same as the order in which they entered, and this result is 
> > nondeterministic.
> >
> > This really confuses me.
> > Does anybody have any insight on this?
> >
> > Thanks,
> > Lei
> >
> > ________________________________
> > [email protected]
