Hey Mike,

When you say "flows that may drop in several million ... flowfiles" I read
that as a single node that might be inundated with tons of source data
(local files, FTP, Kafka messages, etc.). Just my 2 cents, but if you
don't have strict SLAs (and this sounds like a one-time thing) I wouldn't
even worry about it; just let the system apply back pressure and work
through the backlog in time, as designed. That process will be "safe,"
although maybe not fast. If you need speed, throw lots of NVMe mounts at
it. We process well into the tens (sometimes hundreds) of millions of
flowfiles a day on a 5-node cluster with no issues, although our hardware
is quite over the top.
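
If it helps, the two knobs I'd look at first are the back pressure
thresholds (settable per connection in the UI, and as cluster-wide
defaults in nifi.properties since 1.8.0) and striping the content
repository across those NVMe mounts. A rough sketch below; the
/mnt/nvme* paths and the "cont2"/"cont3" labels are just placeholders
for your own layout:

    # Default back pressure for newly created connections (NiFi 1.8.0+);
    # existing connections are still tuned per-connection in the UI
    nifi.queue.backpressure.count=20000
    nifi.queue.backpressure.size=1 GB

    # Stripe the content repository across several NVMe mounts; the
    # suffix after "directory." is an arbitrary label, but the "default"
    # entry must exist
    nifi.content.repository.directory.default=/mnt/nvme1/content_repository
    nifi.content.repository.directory.cont2=/mnt/nvme2/content_repository
    nifi.content.repository.directory.cont3=/mnt/nvme3/content_repository

    # Keep the write-heavy flowfile repository (a write-ahead log) on
    # its own disk, separate from the content repository
    nifi.flowfile.repository.directory=/mnt/nvme0/flowfile_repository

With millions of tiny flowfiles, the flowfile repository and the swap
threshold (nifi.queue.swap.threshold) tend to matter more than raw
content throughput, since each flowfile is mostly metadata at that size.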

Thanks,
Jeremy Dyer

On Fri, Sep 11, 2020 at 12:51 PM Mike Thomsen <mikerthom...@gmail.com>
wrote:

> What are the general recommended practices around tuning NiFi to
> safely handle flows that may drop in several million very small
> flowfiles (2 KB-10 KB each) onto a single node? It's possible that some
> of the data dumps we're processing (and we can't control their size)
> will drop about 3.5-5M flowfiles the moment we expand them in the
> flow.
>
> (Let me emphasize again, it was not our idea to dump the data this way)
>
> Any pointers would be appreciated.
>
> Thanks,
>
> Mike
>
