Hi Josef,

Thanks for the reply. In my opinion, “Minimum Number of Entries” should not and cannot be stronger than “Max Bin Age”. Suppose only ONE flowfile from the data source ever reaches the MergeContent processor, and I set "Minimum Number of Entries" = 2. If the minimum took precedence, this ONE flowfile would never come out of NiFi, even after the bin reaches its deadline. That could very easily lead to a deadlock.
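To make the point concrete, here is a minimal Python sketch of the release rule I would expect. It is only an illustration of my argument, not NiFi's actual code: `Bin`, `should_merge` and their parameters are names I made up, and I am assuming a bin's age is counted from the moment its first flowfile arrives.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Bin:
    """Hypothetical stand-in for a MergeContent bin (not a NiFi class)."""
    entries: int         # number of flowfiles currently in the bin
    size_bytes: int      # combined size of those flowfiles
    age_seconds: float   # assumed: time since the first flowfile arrived


def should_merge(bin_: Bin,
                 min_entries: int,
                 min_size_bytes: int,
                 max_bin_age_seconds: Optional[float]) -> bool:
    """Release the bin when its minimums are met OR it has expired."""
    if bin_.entries == 0:
        return False  # nothing to merge
    minimums_met = (bin_.entries >= min_entries
                    and bin_.size_bytes >= min_size_bytes)
    expired = (max_bin_age_seconds is not None
               and bin_.age_seconds >= max_bin_age_seconds)
    # If the age limit could not override the minimums, a bin holding a
    # single flowfile with min_entries=2 would wait forever.
    return minimums_met or expired


# One lonely flowfile, minimum of 2 entries, 5-minute max bin age.
young = Bin(entries=1, size_bytes=1024, age_seconds=10)
old = Bin(entries=1, size_bytes=1024, age_seconds=301)
print(should_merge(young, 2, 0, 300))  # still waiting for a second flowfile
print(should_merge(old, 2, 0, 300))    # released because the bin expired
```

With this rule the single flowfile is held back while the bin is young and leaves once the 5 minutes are over, which is the behavior I would expect from “Max Bin Age”.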
I also don't know how to use “Merge Strategy: Defragment” to merge the flowfiles from Kafka, because I really don't know how fast the producer produces messages.

On Fri, Jan 4, 2019 at 6:43 PM, Jianan Zhang <[email protected]> wrote:

> Hi Jianan
>
> As you have “Minimum Number of Entries: 1” it is normal that you can see
> merges with only one flowfile. In my opinion the “Minimum Number of
> Entries” is stronger than the “Max Bin Age” (first is written bold and
> second not). Additionally it is called “Max Bin Age” and not “Bin Age”. So
> as soon as you reach at least 1 flowfile it could be pushed out. However,
> in my opinion the documentation for “Max Bin Age” is too unspecific (when
> does it really take place?), only the developers know exactly the function
> behind it. Would be great to get more information here…
>
> Just my 2 cents. Whenever possible try to use “Merge Strategy: Defragment”
> instead of the current one, but this works only if it is predictable
> how many flowfiles you would like to merge. With this strategy the max bin
> age makes full sense and works as expected.
>
> Cheers Josef
>
> *From: *Jianan Zhang <[email protected]>
> *Reply-To: *"[email protected]" <[email protected]>
> *Date: *Friday, 4 January 2019 at 11:16
> *To: *"[email protected]" <[email protected]>
> *Subject: *A question about [MergeContent] processor
>
> Hi all,
>
> I have a job consisting of the following steps: first consuming data from
> Kafka, then packing the data every 5 minutes into one file, and finally
> putting the packed file into HDFS.
>
> I use the [MergeContent] processor to accomplish the “packing” step. The
> properties of MergeContent I configured are listed below:
>
> ----------------------
> Merge Strategy: Bin-Packing Algorithm
> Merge Format: Binary Concatenation
> Attribute Strategy: Keep Only Common Attributes
> Correlation Attribute Name: No value set
> Metadata Strategy: Do Not Merge Uncommon Metadata
> Minimum Number of Entries: 1
> Maximum Number of Entries: 999999999
> Minimum Group Size: 255 MB
> Maximum Group Size: No value set
> Max Bin Age: 5 minutes
> Maximum number of Bins: 1
> ----------------------
>
> I found the behavior of the MergeContent processor to be very unpredictable.
> There are several workflows running on NiFi with the same
> configuration of the MergeContent processor. Some workflows pack the
> data every 5 minutes into one file correctly, but some others don't. It
> even happened that some MergeContent processors generated one flowfile per
> record.
>
> I am wondering if I am misunderstanding the mechanism of the MergeContent
> processor.
>
> I am a NiFi newbie, please help me.
>
> Thanks!
