I sent this to the users group, but I thought I'd try one last time, since I 
got no response, and I just got burned by this again.

Greetings

NiFi 2.4 user here (I plan to upgrade but have just not gotten to it yet).

I believe I may have noted an issue with MergeContent in defragment mode when 
the max number of bins is too small.

I recreated it with a test flow. But before I report it as a bug, I would like 
someone to validate that my assumptions are correct.

I've set up a test flow that generates 10,000 empty flow files.
Via UpdateAttribute, each of the 10K flow files is assigned a unique 
fragment.identifier, and fragment.count is assigned a value of 4.
I then duplicate each flow file 3 times, so I have 40,000 flow files.  Via the 
duplicate flow file step, fragment.index then varies from 0..3.  There are NO 
OTHER attributes, and there is no content in the flow files.
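
For anyone who wants to see the attribute layout without building the flow, 
here is a rough Python sketch of the same data. It is only an illustration: 
the function name build_fragments is made up, and the UUIDs stand in for 
whatever unique value the real flow assigns to fragment.identifier.

```python
import uuid


def build_fragments(groups=10_000, fragment_count=4):
    """Return one attribute dict per flow file (hypothetical helper).

    Mirrors the test flow: `groups` unique fragment.identifier values,
    each carried by `fragment_count` flow files with fragment.index 0..N-1.
    """
    flow_files = []
    for _ in range(groups):
        identifier = str(uuid.uuid4())  # stand-in for the unique identifier
        for index in range(fragment_count):
            flow_files.append({
                "fragment.identifier": identifier,
                "fragment.count": str(fragment_count),
                "fragment.index": str(index),
            })
    return flow_files


flow_files = build_fragments()
print(len(flow_files))  # 40000
```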

I then run this flow slowly, timing it specifically so that all 40,000 flow 
files are sitting right at the input to a single MergeContent processor. (Note 
that in this example, the NiFi is standalone, so there are no cloud issues.)

My trouble seems related to the maximum number of bins.  If the max is LESS 
THAN 2500, I get a lot of failures, indicating that not all the fragments are 
present.
If the max is more than 5000, everything merges FINE (I haven't narrowed it 
down any further than that), and I end up back with the original 10,000 flow 
files (as I should).

Admittedly, the maximum number of bins SHOULD be 10,000 for this test case.  
But from my reading, it's not supposed to work that way.  It SHOULD be 
recycling the bins as needed.  Admittedly, this would be SLOW, but it shouldn't 
ERROR.  It really doesn't make sense that 5000 worked.  Feels arbitrary, given 
that 2500 did NOT.
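
To make the question concrete, here is a toy Python model of one behaviour 
that could produce failures like these. It is NOT NiFi's actual code. It 
rests on a single assumption that I have not verified: when every bin is in 
use and a fragment with a new identifier arrives, the oldest open bin is 
evicted and its fragments count as failed. The function name simulate and the 
smaller group count are just for illustration.

```python
from collections import OrderedDict


def simulate(arrivals, max_bins, fragment_count=4):
    """Toy model, not NiFi code. Returns (merged_groups, failed_flow_files)."""
    bins = OrderedDict()  # fragment.identifier -> fragments seen so far
    merged = failed = 0
    for identifier in arrivals:
        if identifier not in bins:
            if len(bins) >= max_bins:
                # ASSUMPTION: the oldest open bin is evicted and fails.
                _, seen = bins.popitem(last=False)
                failed += seen
            bins[identifier] = 0
        bins[identifier] += 1
        if bins[identifier] == fragment_count:
            del bins[identifier]  # all fragments present: merge succeeds
            merged += 1
    failed += sum(bins.values())  # bins still incomplete at the end
    return merged, failed


groups = 1000
# All four fragments of a group arrive back to back.
grouped = [g for g in range(groups) for _ in range(4)]
# Fragments arrive sorted by fragment.index: every index 0 first, then 1, ...
interleaved = [g for _ in range(4) for g in range(groups)]

print(simulate(grouped, max_bins=5))         # (1000, 0)
print(simulate(interleaved, max_bins=5))     # (0, 4000)
print(simulate(interleaved, max_bins=1000))  # (1000, 0)
```

In this model the outcome depends entirely on the order in which fragments 
reach the processor, so a threshold somewhere between 2500 and 5000 would say 
more about queue ordering than about the bin count itself. Whether 
MergeContent really behaves this way is exactly what I would like confirmed.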

I noticed this because when I was authoring a new flow, I accidentally left the 
maximum number of bins at the default value of 5. It had trouble.

So the ultimate question: is this a bug I should report? Or am I not 
understanding something fundamental?

Geoffrey Greene
ATF / Senior Software Ninjaneer

