Thanks for the tips, Mark!

I looked at the summary and there are a fair number of processors at the
top of the list which create flowfiles from an outside source, but there
are also several anomalies. For example, some mid-flow processors that
usually process each flowfile in less than a second sometimes show average
execution times that balloon to several seconds, and the spikes don't
correlate with flowfile size. I see this behavior on network-IO-bound
processors (most IO is done to docker containers on the same host), and
even on data processors like ReplaceText (full-text regex), where I saw
execution time go up to 30 seconds even though the input file size and
content are always the same (22.5 KB). It usually takes a couple of
milliseconds to process the same file.

I see no backpressure in the flow, but depending on when I look at the
summary I don't always see the anomalies mentioned above; sometimes
everything looks totally acceptable. Other times, though, I see processors
like ReplaceText (which only has one concurrent task) active for 4 of the
past 5 minutes.

I also looked at the disk metrics in Azure: we aren't near our quota of
120 IOPS, and we have burst capacity of up to 3100 IOPS. During testing we
hold steady at about 20 IOPS.

All three repositories (content, provenance, flowfile) are stored on the
OS disk, which is at 79% capacity. Could constant pruning of the content
repo archive be the cause of the intermittent slowness? We have the
following setting: nifi.content.repository.archive.max.usage.percentage=50%
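
For context, the rest of our repository config in nifi.properties is, as
far as I can tell, still at the defaults (paths as shipped; our docker
image may differ slightly):

    nifi.flowfile.repository.directory=./flowfile_repository
    nifi.content.repository.directory.default=./content_repository
    nifi.content.repository.archive.enabled=true
    nifi.content.repository.archive.max.retention.period=12 hours
    nifi.content.repository.archive.max.usage.percentage=50%
    nifi.provenance.repository.directory.default=./provenance_repository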

Thanks,
Eric

On Wed, Nov 25, 2020 at 6:31 AM Mark Payne <[email protected]> wrote:

> Eric,
>
> There’s nothing that I know of that went into 1.12.1 that would cause the
> dataflow to be any slower. And I’d expect to have heard about it from
> others if there were. There is a chance, though, that a particular
> processor that you’re using is slower, due to a newer library perhaps, or
> code changes. Of note, I don’t think it’s necessarily more CPU intensive,
> if you’re still seeing a load average of only 3.5 - that’s quite low.
>
> My recommendation would be to run your test suite. Give it a good 15
> minutes or so to get into the thick of things. Then look at two things to
> determine where the bottleneck is. You’ll want to look for any backpressure
> first (the label on the Connection between processors would become red).
> That’s a dead giveaway of where the bottleneck is, if that’s kicking in.
> The next thing to check is going to the summary table (global menu, aka
> hamburger menu, and then Summary). Go to the processors tab and sort by
> task time. This will tell you which processors are taking the most time to
> run.
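>
> If you'd rather script that check than click through the UI, the same
> stats are exposed by the REST API. A quick sketch (assumes an unsecured
> instance on localhost; adjust host/port and auth for your setup):
>
>     curl 'http://localhost:8080/nifi-api/flow/process-groups/root/status?recursive=true'
>
> The per-processor status snapshots in the response should include task
> counts and total task duration over the last five minutes.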
>
> In general, though, if backpressure is being applied, the destination of
> that connection is the bottleneck. If multiple connections in sequence have
> backpressure applied, look to the last one in the chain, as it’s causing
> the backpressure to propagate back. If there is no backpressure applied,
> then that means that your data flow is able to keep up with the rate of
> data that’s coming in. So that would imply that the source processor is not
> able to bring the data in as fast as you’d like. That could be due to NiFi
> (which would imply your disk is probably not fast enough, since clearly
> your CPU has more to give) or that the source of the data is not providing
> the data fast enough, etc. You could also try increasing the number of
> Concurrent Tasks on the source processor, and perhaps using a second thread
> will improve the performance.
>
> Thanks
> -Mark
>
>
> On Nov 24, 2020, at 5:40 PM, Eric Secules <[email protected]> wrote:
>
> Hi Mark,
>
> Watching the video now, and will plan to watch more of the series. Thanks!
> As for questions,
>
> I run NiFi in docker on my MacBook Pro (the Docker VM gets 10 of my 12
> cores) and on a smaller test environment, and I am seeing performance
> issues in both places. The test environment is a Standard D4s v3 (4
> vcpus, 16 GiB memory) VM with a single 30GB Premium SSD (120 IOPS, 25
> MBps), with NiFi again running in a docker container. Right now we use
> the standard number of thread pool threads (10). Even if I increase the
> number of threads in the pool, I don't see the number of active
> processors go above 10 at any given time, so I don't think increasing
> the size of the pool will help. My test VM has 4 cores and a load
> average of 3.5 over the past minute, and Azure monitoring shows that the
> VM doesn't go above 50% average CPU usage while NiFi is under load. The
> disk is currently 70% full. Up until last month a full test suite took
> about 30-40 minutes; now it's pushing 4 hours. We started noticing tests
> taking longer shortly after upgrading NiFi from 1.11.4 to 1.12.1.
>
> We don't configure ridiculous numbers of concurrent tasks on our
> processors. Is it possible that NiFi became a lot more CPU intensive
> between 1.11.4 and 1.12.1?
>
> Thanks,
> Eric
>
>
>
> On Tue, Nov 24, 2020 at 6:55 AM Mark Payne <[email protected]> wrote:
>
>> Eric,
>>
>> I don’t think there’s really any metric that exposes the specific numbers
>> you’re looking for. Certainly you could run a Java profiler and look at the
>> results to see where all of the time is being spent, but that may get into
>> more detail than you’re comfortable sorting through, depending on your
>> knowledge of Java, profilers, and NiFi internals.
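>>
>> (If you do go that route, a cheap first step is a thread dump via the
>> bundled script, e.g.:
>>
>>     bin/nifi.sh dump thread-dump.txt
>>
>> Taking a few dumps while the flow feels slow and comparing which threads
>> stay busy can narrow things down without a full profiler.)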
>>
>> The nifi.bored.yield.duration is definitely an important property when
>> you’ve got lots of processors that aren’t really doing anything. You can
>> increase that if you are okay adding potential latency into your dataflow.
>> That said, 10 milliseconds is the default and generally works quite well,
>> even with many thousands of processors. Of course, it also depends on how
>> many cpu cores you have, etc.
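>>
>> For reference, the property lives in nifi.properties and the default
>> looks like this:
>>
>>     nifi.bored.yield.duration=10 millis
>>
>> Raising it to something like "25 millis" trades a little worst-case
>> latency for fewer wasted scheduling passes over idle processors.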
>>
>> As for whether or not increasing the number of timer-driven threads will
>> help, that very much depends on several things. How many threads are being
>> used? How many CPU cores do you have? How many are being used? There are a
>> series of videos on YouTube where I’ve discussed nifi anti-patterns. One of
>> those [1] discusses how to tune the Timer-Driven Thread Pool, which may be
>> helpful to you.
>>
>> Thanks
>> -Mark
>>
>> [1] https://www.youtube.com/watch?v=pZq0EbfDBy4
>>
>>
>> On Nov 23, 2020, at 7:55 PM, Eric Secules <[email protected]> wrote:
>>
>> Hello everyone,
>>
>> I was wondering if there is a metric for the amount of time timer-driven
>> processors spend in a queue, ready and waiting to be run. I use NiFi in an
>> atypical way: my flow has over 2000 processors running on a single node,
>> but there are usually fewer than 10 connections that have one or more
>> flowfiles in them at any given time.
>>
>> I have a theory that the sheer number of processors in use is slowing down
>> the system overall, but I would need some more metrics to know whether
>> that's the case and to tell whether anything I'm doing is helping. Are
>> there some logs that I could look for or internal stats I could poke at
>> with a debugger?
>>
>> Should I be able to see increased throughput by increasing the number of
>> timer-driven threads, or is there a different mechanism responsible for
>> going through all the runnable processors to see whether they have input to
>> process? I also noticed "nifi.bored.yield.duration" - would it be good to
>> increase the yield duration in this setting?
>>
>> Thanks,
>> Eric
>>
>>
>>
>
