Greetings, NiFi 2.4 user here (I plan to upgrade but just have not gotten to it yet).
I believe I may have hit an issue with MergeContent in Defragment mode when the Maximum Number of Bins is too small. I recreated it with a test flow, but before I report it as a bug I would like someone to validate that my assumptions are correct.

The test flow produces 40,000 flow files. Each has 0 bytes of content and very few attributes other than fragment.count, fragment.identifier, and fragment.index. The 40,000 flow files have 10,000 unique identifiers, fragment.count is a consistent value of 4, and fragment.index varies from 0..3. I timed the flow so that ALL 40,000 flow files are sitting in the queue at the input of a single MergeContent processor. (Note that in this example the NiFi is standalone, so there are no clustering issues.)

My trouble seems related to Maximum Number of Bins. If the maximum is LESS THAN 2,500, I get a lot of failures indicating that not all of the fragments are present. If it is more than 5,000, everything merges FINE and I end up with the original 10,000 flow files, as I should. (I haven't narrowed it down any further than that.)

Admittedly, the number of bins SHOULD be 10,000 for this test case. But from my reading, it's not supposed to work that way: it SHOULD be recycling the bins as needed. Admittedly, that would be SLOW, but it shouldn't ERROR. It also doesn't make sense that 5,000 worked; that feels arbitrary, given that 2,500 did NOT. (I've put a small sketch of what I mean by "recycling" at the bottom of this message.)

I noticed this because, when I was authoring a new flow, I accidentally left Maximum Number of Bins at the default value of 5, and it had trouble.

So the ultimate question: is this a bug I should report, or am I not understanding something fundamental?

Geoffrey Greene
ATF / Senior Software Ninjaneer
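
P.S. To make concrete what I mean by "recycling the bins as needed", here is a rough Python sketch of the behavior I *expected* from Defragment mode with a small bin limit. This is just my mental model of the test described above, not NiFi's actual implementation; MAX_BINS and FRAGMENT_COUNT simply mirror my test settings.

from collections import defaultdict

MAX_BINS = 5          # deliberately tiny, like the default I left in place
FRAGMENT_COUNT = 4    # every identifier in my test has exactly 4 fragments

def expected_defragment(flowfiles):
    """flowfiles: iterable of (fragment.identifier, fragment.index) pairs."""
    bins = defaultdict(list)   # fragment.identifier -> fragments collected so far
    merged = []
    pending = list(flowfiles)

    while pending:
        remaining = []
        for ident, index in pending:
            if ident in bins or len(bins) < MAX_BINS:
                bins[ident].append(index)
                if len(bins[ident]) == FRAGMENT_COUNT:
                    merged.append(ident)   # bin is complete: merge it and free the slot
                    del bins[ident]
            else:
                remaining.append((ident, index))   # no free bin yet: wait, don't fail
        if len(remaining) == len(pending):
            break                 # no progress at all: fragments really are missing
        pending = remaining
    return merged

# 10,000 identifiers x 4 fragments = 40,000 zero-byte flow files
files = [("id-%d" % i, idx) for i in range(10_000) for idx in range(4)]
assert len(expected_defragment(files)) == 10_000   # everything merges, just slowly

In this model a tiny bin count only slows things down, because every completed merge frees a slot; nothing routes to failure as long as all of the fragments eventually arrive. That is the behavior my reading of the docs led me to expect, and it is not what I observed.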
