We keep our queue limit at 20,000 to avoid data swapping back and forth between ArrayLists and Prioritized Queues. See this bug: https://issues.apache.org/jira/browse/NIFI-7583
You can also adjust that limit up in nifi.properties.

On Sat, Sep 12, 2020 at 1:15 AM Chris Sampson <chris.samp...@naimuri.com> wrote:

> One thing we've not done yet, but which I think might help, is to stripe
> disks for each repo too, i.e. multiple disks for content, etc., which will
> help spread the disk I/O.
>
> Cheers,
>
> Chris Sampson
>
> On Fri, 11 Sep 2020, 22:46 Mike Thomsen, <mikerthom...@gmail.com> wrote:
>
>> Craig and Jeremy,
>>
>> Thanks. The point about using different disks for different
>> repositories is definitely something to add to the list.
>>
>> On Fri, Sep 11, 2020 at 3:11 PM Jeremy Dyer <jdy...@gmail.com> wrote:
>> >
>> > Hey Mike,
>> >
>> > When you say "flows that may drop in several million ... flowfiles," I
>> > read that as a single node that might be inundated with tons of source
>> > data (local files, FTP, Kafka messages, etc.). Just my 2 cents, but if
>> > you don't have strict SLAs (and this kind of sounds like a one-time
>> > thing), I wouldn't even worry about it; just let the system apply back
>> > pressure and process in time as designed. That process will be "safe,"
>> > although maybe not fast. If you need speed, throw lots of NVMe mounts
>> > at it. We process well into the tens (sometimes hundreds) of millions
>> > of flowfiles a day on a 5-node cluster with no issues. However, our
>> > hardware is quite over the top.
>> >
>> > Thanks,
>> > Jeremy Dyer
>> >
>> > On Fri, Sep 11, 2020 at 12:51 PM Mike Thomsen <mikerthom...@gmail.com> wrote:
>> >>
>> >> What are the general recommended practices around tuning NiFi to
>> >> safely handle flows that may drop in several million very small
>> >> flowfiles (2 KB-10 KB each) onto a single node? It's possible that
>> >> some of the data dumps we're processing (and we can't control their
>> >> size) will drop about 3.5-5M flowfiles the moment we expand them in
>> >> the flow.
>> >>
>> >> (Let me emphasize again, it was not our idea to dump the data this way.)
>> >>
>> >> Any pointers would be appreciated.
>> >>
>> >> Thanks,
>> >>
>> >> Mike
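
[Editor's note] The two tuning knobs discussed in this thread (the per-queue swap threshold and spreading the repositories across disks) both live in conf/nifi.properties. A minimal sketch follows; the property names are NiFi's, but the mount paths and the disk1/disk2 suffixes are hypothetical examples (the suffixes are arbitrary labels you choose):

```properties
# conf/nifi.properties (sketch; paths below are hypothetical examples)

# Queues larger than this swap flowfiles out of heap to disk.
# Raising it keeps more flowfiles in memory, at the cost of heap.
nifi.queue.swap.threshold=20000

# Striping the content repository across multiple disks: define one
# property per disk; the suffix after "directory." is an arbitrary label.
nifi.content.repository.directory.disk1=/mnt/disk1/content_repository
nifi.content.repository.directory.disk2=/mnt/disk2/content_repository

# The flowfile and provenance repositories can each get their own disk too.
nifi.flowfile.repository.directory=/mnt/disk3/flowfile_repository
nifi.provenance.repository.directory.default=/mnt/disk4/provenance_repository
```

Note that the swap threshold applies per connection; the back-pressure object/size thresholds Jeremy alludes to are configured per connection in the flow itself, not in nifi.properties.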