Hi all,
I have a job consisting of the following steps: first consume data from Kafka,
then pack the data every 5 minutes into one file, and finally put the packed
file into HDFS.
I use the MergeContent processor to accomplish the “packing” step. The
MergeContent properties I configured are listed below:

----------------------
Merge Strategy: Bin-Packing Algorithm
Merge Format: Binary Concatenation
Attribute Strategy: Keep Only Common Attributes
Correlation Attribute Name: No value set
Metadata Strategy: Do Not Merge Uncommon Metadata
Minimum Number of Entries: 1
Maximum Number of Entries: 999999999
Minimum Group Size: 255 MB
Maximum Group Size: No value set
Max Bin Age: 5 minutes
Maximum number of Bins: 1
----------------------
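
For reference, here is a minimal Python sketch of how I read these settings: a bin is merged once both minimums are met, or once it reaches Max Bin Age, whichever comes first. This is a simplified model and not NiFi's actual code; the names BinConfig and bin_is_ready are hypothetical, and the maximum limits and the "Maximum number of Bins" eviction are left out on purpose.

```python
from dataclasses import dataclass

MB = 1024 * 1024


@dataclass
class BinConfig:
    # Values copied from the MergeContent properties listed above.
    min_entries: int = 1
    min_group_size: int = 255 * MB   # Minimum Group Size: 255 MB
    max_bin_age_s: int = 5 * 60      # Max Bin Age: 5 minutes


def bin_is_ready(entries: int, size_bytes: int, age_s: float,
                 cfg: BinConfig) -> bool:
    """Simplified, assumed completion rule for a single bin.

    Not NiFi's implementation: maximum entries/size and bin eviction
    are intentionally ignored here.
    """
    if entries == 0:
        return False  # nothing to merge
    # Both minimum thresholds must hold at the same time.
    minimums_met = (entries >= cfg.min_entries
                    and size_bytes >= cfg.min_group_size)
    # Independently of the minimums, an old enough bin is flushed.
    aged_out = age_s >= cfg.max_bin_age_s
    return minimums_met or aged_out


cfg = BinConfig()
small_young = bin_is_ready(1000, 10 * MB, 60, cfg)    # under 255 MB, 1 min old
small_old = bin_is_ready(1000, 10 * MB, 300, cfg)     # under 255 MB, 5 min old
large_young = bin_is_ready(1000, 256 * MB, 60, cfg)   # over 255 MB, 1 min old
```

Under this reading, a bin holding a few small records should stay open until it is 5 minutes old, so I would not expect one flowfile per record.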

I have found the behavior of the MergeContent processor to be very unpredictable.
There are several workflows running on the same NiFi instance with the same
MergeContent configuration. Some workflows correctly pack the
data into one file every 5 minutes, but others do not. In some
cases a MergeContent processor even generates one flowfile per
record.

I am wondering whether I am misunderstanding the mechanism of the MergeContent
processor.

I am new to NiFi, so any help is appreciated.

Thanks!
