Re: Best way to tune NiFi for huge amounts of small flowfiles

Mike Thomsen Fri, 11 Sep 2020 14:46:36 -0700

Craig and Jeremy,

Thanks. The point about using different disks for different
repositories is definitely something to add to the list.
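
For anyone following along, here is a minimal sketch of what that could look
like in nifi.properties. The mount points are hypothetical placeholders, not
paths from any real setup; the idea is simply that each repository sits on
its own physical disk so that their I/O does not compete:

```
# Assumption: /mnt/disk1..3 are separate physical devices (hypothetical paths)
nifi.flowfile.repository.directory=/mnt/disk1/flowfile_repository
nifi.content.repository.directory.default=/mnt/disk2/content_repository
nifi.provenance.repository.directory.default=/mnt/disk3/provenance_repository
```

NiFi needs a restart to pick up changes to these properties.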


On Fri, Sep 11, 2020 at 3:11 PM Jeremy Dyer <[email protected]> wrote:
>
> Hey Mike,
>
> When you say "flows that may drop in several million ... flowfiles" I read
> that as a single node that might be inundated with tons of source data (local
> files, FTP, Kafka messages, etc.). Just my 2 cents, but if you don't have
> strict SLAs (and this kind of sounds like a one-time thing), I wouldn't even
> worry about it; just let the system apply back pressure and process the data
> in time, as designed. That process will be "safe", although maybe not fast.
> If you need speed, throw lots of NVMe mounts at it. We process well into the
> tens (sometimes hundreds) of millions of flowfiles a day on a 5-node cluster
> with no issues. However, our hardware is quite over the top.
>
> Thanks,
> Jeremy Dyer
>
> On Fri, Sep 11, 2020 at 12:51 PM Mike Thomsen <[email protected]> wrote:
>>
>> What are the general recommended practices around tuning NiFi to
>> safely handle flows that may drop in several million very small
>> flowfiles (2-10 KB each) onto a single node? It's possible that some
>> of the data dumps we're processing (and we can't control their size)
>> will drop about 3.5-5M flowfiles the moment we expand them in the
>> flow.
>>
>> (Let me emphasize again: it was not our idea to dump the data this way.)
>>
>> Any pointers would be appreciated.
>>
>> Thanks,
>>
>> Mike
