Ryan,

Funnels are generally very fast. They don't have to read or write content. They don't update the Provenance Repository. They just update the FlowFile Repository, and they do so in batches of up to 1,000 FlowFiles at a time. That said, they are cheap but they are not free.
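
To put a rough number on the batching, here is a minimal sketch in Python. It is a toy model and not NiFi's actual code: the names `batches_needed` and `drain` are made up for illustration, and it assumes each batch is taken from a single incoming queue. Only the 1,000-FlowFile batch size comes from the description above.

```python
from collections import deque

BATCH_SIZE = 1_000  # from the message: up to 1,000 FlowFiles per batch


def batches_needed(queue_sizes, batch_size=BATCH_SIZE):
    """Count batched transfers needed to empty every incoming queue.

    Assumption (not confirmed NiFi behavior): a batch never mixes
    FlowFiles from two different incoming queues.
    """
    return sum((n + batch_size - 1) // batch_size for n in queue_sizes)


def drain(inputs, output, batch_size=BATCH_SIZE):
    """Toy funnel: move items from each input queue to one output queue.

    Only references are moved; content is never read or written,
    which is why each transfer is cheap but not free.
    """
    batches = 0
    while any(inputs):  # an empty deque is falsy
        for queue in inputs:
            if not queue:
                continue
            count = min(batch_size, len(queue))
            output.extend(queue.popleft() for _ in range(count))
            batches += 1
    return batches


# Three queues of 10 million FlowFiles each, as in the original question.
print(batches_needed([10_000_000] * 3))
```

Under these assumptions, three queues of 10 million FlowFiles need 30,000 batched transfers before any swapping cost is counted.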
You also need to consider that if you have 10 million FlowFiles in each Relationship, you're doing a *LOT* of swapping. NiFi has to read, parse, and deserialize all of the FlowFiles that are swapped to disk, and then re-serialize and re-write them to new swap files. That does get quite expensive. With smaller queues, it's probably a couple of orders of magnitude more efficient.

The point of funnels is really just to allow FlowFiles from multiple queues to be put into the same queue so that they can be prioritized. But if the queues have swapped data, it will definitely be more efficient to skip the funnel.

Thanks
-Mark

> On Apr 22, 2021, at 10:56 AM, Ryan Hendrickson <[email protected]> wrote:
>
> Hi everyone,
> I'm curious what the performance impact is of adding a number of funnels.
>
> For example: We had 3 relationships with 10 million flow files in each
> relationship going to a Funnel to create a single Relationship into a Merge
> Content Processor. Would it be 'more' performant to just send these 3
> relationships directly to the Merge Content Processor?
>
> How does this work under-the-hood? Does the funnel read from each
> relationship and create a "new" one? If so, how many flowfiles does it
> pull at a time? Does it respect priority across the many
> relationships going into the funnel?
>
> Thanks,
> Ryan
