Alexander Bukarev created NIFI-5918:
---------------------------------------
Summary: MergeRecord works wrong with Defragment strategy
Key: NIFI-5918
URL: https://issues.apache.org/jira/browse/NIFI-5918
Project: Apache NiFi
Issue Type: Bug
Components: Extensions
Affects Versions: 1.8.0
Reporter: Alexander Bukarev
*Steps*
# Create a simple flow:
#* {{GenerateFlowFile}} (with the constant payload "txt1,txt2" and a 10-second
run schedule)
#* -> {{SplitContent}} (with comma as a separator)
#* -> a chain of processors that takes "txt1" and "txt2" as input parameters
and produces flow files with more than 1 record each ((!) that's important). For
example, I use {{ExtractText}} (to get "txt1" and "txt2" as an attribute), then
{{ExecuteSQLRecord}} (to execute SQL using "txt1" and "txt2" as a parameter)
#* -> {{MergeRecord}} (with *Defragment* merge strategy - (!) that's important)
#* -> {{LogAttribute}} or whatever you prefer to observe the merge result
# Now just run the flow
*Result:* we'll see an error in logs like {panel}Could not merge bin with 1
FlowFiles because of the 'fragment.count' attribute had a value of '2' but only
1 of 2 FlowFiles were encountered before this bin was evicted (due to to Max
Bin Age being reached or due to the Maximum Number of Bins being
exceeded).{panel}
*Expected result:* a single flow file containing the records from both SQL
queries (for "txt1" and "txt2")
The cause is that {{RecordBinManager}} uses the {{fragment.count}} flow file
attribute as the number of *records* required to release the bin. However, the
attribute holds the number of *flow files* instead. In the scenario above each
flow file contains more than 1 record (at least 2), so {{RecordBin}} considers
the bin "full enough" as soon as the first flow file arrives (it contains >= 2
records and {{fragment.count}} is 2). The bin is therefore released too early,
with only 1 of the 2 fragments.
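To make the mismatch concrete, here is a minimal sketch of the two comparisons. It is not the actual NiFi source: the class and method names are hypothetical, and the numbers are taken from the scenario above (2 fragments, at least 2 records in the first flow file).

```java
// Simplified illustration only; NOT the actual RecordBin/RecordBinManager code.
// Class and method names are hypothetical.
class DefragmentThresholdSketch {

    // Behaviour described in this report: fragment.count is used as a record threshold.
    static boolean isFullByRecordCount(int recordsInBin, int fragmentCount) {
        return recordsInBin >= fragmentCount;
    }

    // Behaviour the report expects: fragment.count is compared with the number of flow files.
    static boolean isFullByFlowFileCount(int flowFilesInBin, int fragmentCount) {
        return flowFilesInBin >= fragmentCount;
    }

    public static void main(String[] args) {
        int fragmentCount = 2;   // "txt1,txt2" split on comma gives 2 fragments
        int flowFilesInBin = 1;  // only the first fragment has arrived
        int recordsInBin = 2;    // but that fragment already carries 2 records

        // Record-based check says the bin is full after a single flow file.
        System.out.println(isFullByRecordCount(recordsInBin, fragmentCount));     // prints "true"
        // Flow-file-based check correctly keeps waiting for the second fragment.
        System.out.println(isFullByFlowFileCount(flowFilesInBin, fragmentCount)); // prints "false"
    }
}
```

With the record-based check the bin is released after one flow file, which matches the "only 1 of 2 FlowFiles were encountered" error.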
I think this is a mistake: in *Defragment* mode we are interested in the number
of flow files and never in the number of records. Conversely, the number of
records is what matters when using the *Bin-Packing Algorithm* strategy.
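One possible shape for such a strategy-dependent check is sketched below. This is only an assumption about how a fix might look, not a patch against NiFi: the enum, the method, and the {{minRecords}} parameter (standing in for a minimum-records setting) are all hypothetical names.

```java
// Hypothetical sketch of a strategy-dependent completeness check; not NiFi code.
class MergeStrategySketch {

    enum MergeStrategy { DEFRAGMENT, BIN_PACKING }

    static boolean isComplete(MergeStrategy strategy,
                              int flowFilesInBin,
                              int recordsInBin,
                              int fragmentCount,
                              int minRecords) {
        switch (strategy) {
            case DEFRAGMENT:
                // Defragment: only the number of flow files (fragments) matters.
                return flowFilesInBin >= fragmentCount;
            case BIN_PACKING:
                // Bin packing: the number of records matters.
                return recordsInBin >= minRecords;
            default:
                throw new IllegalArgumentException("Unknown strategy: " + strategy);
        }
    }

    public static void main(String[] args) {
        // Scenario from this report: 1 of 2 fragments has arrived, carrying 2 records.
        System.out.println(isComplete(MergeStrategy.DEFRAGMENT, 1, 2, 2, 1));
        // Both fragments have arrived.
        System.out.println(isComplete(MergeStrategy.DEFRAGMENT, 2, 4, 2, 1));
    }
}
```

In this sketch the record count is ignored entirely in Defragment mode, so the bin from the scenario above stays open until the second fragment arrives.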