Eric,

There’s nothing that I know of that went into 1.12.1 that would cause the dataflow to be any slower, and I’d expect to have heard about it from others if there were. There is a chance, though, that a particular processor that you’re using is slower, due to a newer library perhaps, or code changes. Of note, I don’t think it’s necessarily more CPU intensive if you’re still seeing a load average of only 3.5 - that’s quite low.
My recommendation would be to run your test suite and give it a good 15 minutes or so to get into the thick of things. Then look at two things to determine where the bottleneck is.

First, look for any backpressure (the label on the Connection between processors turns red). If that’s kicking in, it’s a dead giveaway of where the bottleneck is. The next thing to check is the Summary table (global menu, aka hamburger menu, then Summary). Go to the Processors tab and sort by task time. This will tell you which processors are taking the most time to run.

In general, though, if backpressure is being applied, the destination of that connection is the bottleneck. If multiple connections in sequence have backpressure applied, look to the last one in the chain, as it’s causing the backpressure to propagate back.

If there is no backpressure applied, that means your dataflow is able to keep up with the rate of data that’s coming in. That would imply that the source processor is not able to bring the data in as fast as you’d like. That could be due to NiFi (which would imply your disk is probably not fast enough, since clearly your CPU has more to give), or the source of the data may simply not be providing it fast enough, etc. You could also try increasing the number of Concurrent Tasks on the source processor; perhaps using a second thread will improve the performance.

Thanks
-Mark

On Nov 24, 2020, at 5:40 PM, Eric Secules <[email protected]> wrote:

Hi Mark,

Watching the video now, and will plan to watch more of the series. Thanks!

As for questions: I have NiFi on my MacBook Pro running in Docker (I give the Docker VM 10 of my 12 cores), and also on a smaller test environment; I am seeing performance issues in both places. The test environment runs NiFi in a Docker container on a Standard D4s v3 (4 vCPUs, 16 GiB memory) VM with a single 30 GB Premium SSD (120 IOPS, 25 Mbps).
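The "sort the Summary table by task time" check above can also be scripted against NiFi's REST API (e.g. `GET /nifi-api/flow/process-groups/root/status?recursive=true`), which is handy for comparing runs before and after an upgrade. A minimal sketch follows; the JSON field names (`processGroupStatus`, `aggregateSnapshot`, `processorStatusSnapshots`, `tasksDurationNanos`) are assumptions from memory and may differ by NiFi version, so verify them against your instance's actual response before relying on this:

```python
# Sketch: rank processors by total task time from a NiFi status payload.
# Field names below are ASSUMPTIONS -- check them against the JSON your
# NiFi version actually returns from /nifi-api/flow/process-groups/root/status.

def top_processors_by_task_time(status_json, n=10):
    """Return (name, seconds) pairs for the n busiest processors."""
    results = []

    def walk(group_snapshot):
        for p in group_snapshot.get("processorStatusSnapshots", []):
            snap = p["processorStatusSnapshot"]
            results.append((snap["name"], snap["tasksDurationNanos"] / 1e9))
        for child in group_snapshot.get("processGroupStatusSnapshots", []):
            walk(child["processGroupStatusSnapshot"])

    walk(status_json["processGroupStatus"]["aggregateSnapshot"])
    return sorted(results, key=lambda t: t[1], reverse=True)[:n]

# Mock payload standing in for a real REST response:
mock = {
    "processGroupStatus": {
        "aggregateSnapshot": {
            "processorStatusSnapshots": [
                {"processorStatusSnapshot": {"name": "InvokeHTTP",
                                             "tasksDurationNanos": 95_000_000_000}},
                {"processorStatusSnapshot": {"name": "UpdateAttribute",
                                             "tasksDurationNanos": 2_000_000_000}},
            ],
            "processGroupStatusSnapshots": [],
        }
    }
}

print(top_processors_by_task_time(mock))
```

The walk is recursive because processor snapshots are nested per process group; flattening them first makes the "sort by task time" step a one-liner.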
Right now we use the standard number of thread pool threads (10). At any given time, even if I increase the number of threads in the pool, I don't see the number of active processors go above 10, so I don't think increasing the size of the pool will help. My test VM has 4 cores and a load average of 3.5 over the past minute, and Azure monitoring shows that the VM doesn't go above 50% average CPU usage while NiFi is under load. The disk is currently 70% full.

Up until last month a full test suite would take about 30-40 minutes, and now it's pushing 4 hours. We started noticing tests taking a while shortly after upgrading NiFi from 1.11.4 to 1.12.1. We don't configure ridiculous amounts of concurrent tasks on processors. Is it possible that between 1.11.4 and 1.12.1 NiFi became a lot more CPU intensive?

Thanks,
Eric

On Tue, Nov 24, 2020 at 6:55 AM Mark Payne <[email protected]> wrote:

Eric,

I don’t think there’s really any metric that exposes the specific numbers you’re looking for. Certainly you could run a Java profiler and look at the results to see where all of the time is being spent, but that may get into more detail than you’re comfortable sorting through, depending on your knowledge of Java, profilers, and NiFi internals.

The nifi.bored.yield.duration property is definitely important when you’ve got lots of processors that aren’t really doing anything. You can increase it if you are okay adding potential latency into your dataflow. That said, 10 milliseconds is the default and generally works quite well, even with many thousands of processors. Of course, it also depends on how many CPU cores you have, etc.

As for whether or not increasing the number of timer-driven threads will help, that very much depends on several things: How many threads are being used? How many CPU cores do you have? How many of them are being used?

There is a series of videos on YouTube where I’ve discussed NiFi anti-patterns.
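For reference, the property Mark describes lives in conf/nifi.properties. A hedged sketch of the relevant line follows - "10 millis" is the default he mentions, and the larger value is purely an illustration of the latency-for-CPU trade-off, not a recommendation:

```properties
# conf/nifi.properties
# How long a processor that ran but found no work should yield before
# being scheduled again. Default per the discussion above:
nifi.bored.yield.duration=10 millis

# Illustrative alternative (assumption, tune to taste): with thousands of
# mostly-idle processors, a larger yield cuts idle scheduling churn at the
# cost of added per-hop latency, e.g.:
# nifi.bored.yield.duration=25 millis
```

Note that the Timer-Driven Thread Pool size, by contrast, is set in the UI (Controller Settings), not in this file.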
One of those [1] discusses how to tune the Timer-Driven Thread Pool, which may be helpful to you.

Thanks
-Mark

[1] https://www.youtube.com/watch?v=pZq0EbfDBy4

On Nov 23, 2020, at 7:55 PM, Eric Secules <[email protected]> wrote:

Hello everyone,

I was wondering if there is a metric for the amount of time timer-driven processors spend in a queue, ready and waiting to be run. I use NiFi in an atypical way: my flow has over 2000 processors running on a single node, but there are usually fewer than 10 connections that have one or more FlowFiles in them at any given time. I have a theory that the number of processors in use is slowing down the system overall, but I would need to see some more metrics to know whether that's the case and to tell whether anything I am doing is helping. Are there some logs that I could look for, or internal stats I could poke at with a debugger?

Should I be able to see increased throughput by increasing the number of timer-driven threads, or is there a different mechanism responsible for going through all the runnable processors to see whether they have input to process? I also noticed "nifi.bored.yield.duration" - would it be good to increase the yield duration in this setting?

Thanks,
Eric
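One lightweight way to answer "how many timer-driven threads are actually being used?" without a full profiler is a thread dump (`jstack <pid>`, or `nifi.sh dump`): NiFi names its pool threads "Timer-Driven Process Thread-N", so you can count them by state. A sketch follows; the thread-name prefix and dump layout are assumptions based on typical NiFi/jstack output, so check them against a dump from your own instance:

```python
# Sketch: count NiFi "Timer-Driven Process Thread" entries in a jstack-style
# thread dump, grouped by java.lang.Thread.State. Mostly-RUNNABLE means the
# pool is saturated; mostly TIMED_WAITING/WAITING means threads are idle and
# adding more is unlikely to help. Thread-name prefix is an assumption.

import re
from collections import Counter

def timer_driven_states(thread_dump: str) -> Counter:
    """Tally Timer-Driven pool threads by their reported state."""
    counts = Counter()
    lines = thread_dump.splitlines()
    for i, line in enumerate(lines):
        # jstack prints the thread name line, then the state on the next line
        if '"Timer-Driven Process Thread' in line and i + 1 < len(lines):
            m = re.search(r"java\.lang\.Thread\.State: (\w+)", lines[i + 1])
            if m:
                counts[m.group(1)] += 1
    return counts

# Toy dump standing in for real jstack output:
dump = '''
"Timer-Driven Process Thread-1" #41 daemon prio=5 runnable
   java.lang.Thread.State: RUNNABLE
"Timer-Driven Process Thread-2" #42 daemon prio=5 waiting on condition
   java.lang.Thread.State: TIMED_WAITING (parking)
"Flow Service Tasks Thread-1" #50 daemon prio=5 waiting on condition
   java.lang.Thread.State: WAITING (parking)
'''

print(timer_driven_states(dump))
```

On the toy dump this reports one RUNNABLE and one TIMED_WAITING timer-driven thread, ignoring unrelated threads; repeating the dump a few times under load gives a rough picture of pool utilization.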
