We keep our queue limit at 20,000 to keep data from swapping between
ArrayLists and Prioritized Queues. See bug:
https://issues.apache.org/jira/browse/NIFI-7583
You can also raise that limit in nifi.properties.
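
For reference, a minimal sketch of the setting, assuming the limit being discussed is the queue swap threshold (the message does not name the property). The property below exists in nifi.properties and defaults to 20000; the raised value is purely illustrative.

```properties
# Assumption: the "limit" mentioned above is the per-connection swap threshold.
# Default is 20000; queues that grow past it start swapping FlowFiles to disk.
# 50000 is an illustrative value, not a recommendation.
nifi.queue.swap.threshold=50000
```

Keeping each connection's back pressure object threshold at or below this value should keep its queue from reaching the swap point.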
On Sat, Sep 12, 2020 at 1:15 AM Chris Sampson wrote:
One thing we've not done yet but I think might help is to stripe disks for
each repo too, i.e. multiple disks for content, etc., which will help
spread the disk I/O.
Cheers,
Chris Sampson
On Fri, 11 Sep 2020, 22:46 Mike Thomsen wrote:
Craig and Jeremy,
Thanks. The point about using different disks for different
repositories is definitely something to add to the list.
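
A rough sketch of what the disk suggestions in this thread could look like in nifi.properties. The property names are the standard repository settings; the /mnt/... mount points and the content1/content2/prov1 suffixes are hypothetical and would need to match your own hosts.

```properties
# Hypothetical mount points; each path is assumed to sit on its own physical disk.

# FlowFile repository on a dedicated disk
nifi.flowfile.repository.directory=/mnt/flowfile/flowfile_repository

# Content repository spread across several disks: each uniquely suffixed
# nifi.content.repository.directory.* entry adds another location
nifi.content.repository.directory.content1=/mnt/content1/content_repository
nifi.content.repository.directory.content2=/mnt/content2/content_repository

# Provenance repository on a separate disk (more suffixed entries can be added the same way)
nifi.provenance.repository.directory.prov1=/mnt/prov1/provenance_repository
```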
On Fri, Sep 11, 2020 at 3:11 PM Jeremy Dyer wrote:
Hey Mike,
When you say "flows that may drop in several million ... flowfiles" I read
that as a single node that might be inundated with tons of source data
(local files, FTP, Kafka messages, etc.). Just my 2 cents, but if you don't
have strict SLAs (and this kind of sounds like a one-time thing) I
Hi Mike,
I might have a few more pointers to offer when I can get unburied from some
other work ... but a couple of things that jump to mind are the following:
- I think for that many flowfiles, you will want to make sure you have
separate disks set up for data provenance. We have several