Craig and Jeremy,

Thanks. The point about using different disks for different repositories is definitely something to add to the list.
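
For the list, a minimal sketch of what splitting the repositories across disks might look like in nifi.properties. The three property names are the standard repository directory settings; the /mnt/... mount points are hypothetical placeholders, not anything from our environment, and would need to match however the disks are actually mounted.

```properties
# Hypothetical mount points: one physical disk per repository, so that
# FlowFile, content, and provenance I/O do not compete for the same device.
nifi.flowfile.repository.directory=/mnt/disk1/flowfile_repository
nifi.content.repository.directory.default=/mnt/disk2/content_repository
nifi.provenance.repository.directory.default=/mnt/disk3/provenance_repository
```

With several million small flowfiles landing at once, the FlowFile and provenance repositories would likely see the heaviest write load, so those are probably the ones to put on the fastest disks.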

On Fri, Sep 11, 2020 at 3:11 PM Jeremy Dyer <[email protected]> wrote:
>
> Hey Mike,
>
> When you say "flows that may drop in several million ... flowfiles" I read
> that as a single node that might be inundated with tons of source data (local
> files, ftp, kafka messages, etc). Just my 2 cents but if you don't have
> strict SLAs (and this kind of sounds like a 1 time thing) I wouldn't even
> worry about it and just let the system back pressure and process in time as
> designed. That process will be "safe" although maybe not fast. If you need
> speed throw lots of NVMe mounts at it. We process well into the tens
> (sometimes hundreds) of millions of flowfiles a day on a 5 node cluster with
> no issues. However our hardware is quite over the top.
>
> Thanks,
> Jeremy Dyer
>
> On Fri, Sep 11, 2020 at 12:51 PM Mike Thomsen <[email protected]> wrote:
>>
>> What are the general recommended practices around tuning NiFi to
>> safely handle flows that may drop in several million very small
>> flowfiles (2k-10kb each) onto a single node? It's possible that some
>> of the data dumps we're processing (and we can't control their size)
>> will drop about 3.5-5M flowfiles the moment we expand them in the
>> flow.
>>
>> (Let me emphasize again, it was not our idea to dump the data this way)
>>
>> Any pointers would be appreciated.
>>
>> Thanks,
>>
>> Mike
