Eric,

There’s nothing that I know of that went into 1.12.1 that would cause the dataflow to be any slower, and I’d expect to have heard about it from others if there were. There is a chance, though, that a particular processor that you’re using is slower, due to a newer library perhaps, or code changes. Of note, I don’t think it’s necessarily more CPU intensive if you’re still seeing a load average of only 3.5 - that’s quite low.
My recommendation would be to run your test suite and give it a good 15 minutes or so to get into the thick of things. Then look at two things to determine where the bottleneck is.

First, look for any backpressure (the label on the Connection between processors turns red). If that’s kicking in, it’s a dead giveaway of where the bottleneck is. The next thing to check is the Summary table (global menu, aka hamburger menu, then Summary). Go to the Processors tab and sort by task time. This will tell you which processors are taking the most time to run.

In general, though, if backpressure is being applied, the destination of that connection is the bottleneck. If multiple connections in sequence have backpressure applied, look to the last one in the chain, as it’s causing the backpressure to propagate back.

If there is no backpressure applied, that means your dataflow is able to keep up with the rate of data that’s coming in. That would imply that the source processor is not able to bring the data in as fast as you’d like. That could be due to NiFi (which would imply your disk is probably not fast enough, since clearly your CPU has more to give), or the source of the data may simply not be providing it fast enough, etc. You could also try increasing the number of Concurrent Tasks on the source processor; perhaps using a second thread will improve the performance.

Thanks
-Mark

On Nov 24, 2020, at 5:40 PM, Eric Secules <[email protected]> wrote:

Hi Mark,

Watching the video now, and will plan to watch more of the series. Thanks!

As for questions: I have NiFi on my MacBook Pro running in Docker (I give the Docker VM 10 of my 12 cores), and also on a smaller test environment; I am seeing performance issues in both places. The test environment runs NiFi in a Docker container on a Standard D4s v3 (4 vCPUs, 16 GiB memory) VM with a single 30 GB Premium SSD (120 IOPS, 25 Mbps).
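The "sort the Summary table by task time" check above can also be scripted against NiFi's REST API (e.g. `GET /nifi-api/flow/process-groups/root/status?recursive=true`), which is handy for comparing runs before and after an upgrade. A minimal sketch follows; the JSON field names (`processGroupStatus`, `aggregateSnapshot`, `processorStatusSnapshots`, `tasksDurationNanos`) are assumptions from memory and may differ by NiFi version, so verify them against your instance's actual response before relying on this:

```python
# Sketch: rank processors by total task time from a NiFi status payload.
# Field names below are ASSUMPTIONS -- check them against the JSON your
# NiFi version actually returns from /nifi-api/flow/process-groups/root/status.

def top_processors_by_task_time(status_json, n=10):
    """Return (name, seconds) pairs for the n busiest processors."""
    results = []

    def walk(group_snapshot):
        for p in group_snapshot.get("processorStatusSnapshots", []):
            snap = p["processorStatusSnapshot"]
            results.append((snap["name"], snap["tasksDurationNanos"] / 1e9))
        for child in group_snapshot.get("processGroupStatusSnapshots", []):
            walk(child["processGroupStatusSnapshot"])

    walk(status_json["processGroupStatus"]["aggregateSnapshot"])
    return sorted(results, key=lambda t: t[1], reverse=True)[:n]

# Mock payload standing in for a real REST response:
mock = {
    "processGroupStatus": {
        "aggregateSnapshot": {
            "processorStatusSnapshots": [
                {"processorStatusSnapshot": {"name": "InvokeHTTP",
                                             "tasksDurationNanos": 95_000_000_000}},
                {"processorStatusSnapshot": {"name": "UpdateAttribute",
                                             "tasksDurationNanos": 2_000_000_000}},
            ],
            "processGroupStatusSnapshots": [],
        }
    }
}

print(top_processors_by_task_time(mock))
```

The walk is recursive because processor snapshots are nested per process group; flattening them first makes the "sort by task time" step a one-liner.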
Right now we use the standard number of thread pool threads (10). At any given time, even if I increase the number of threads in the pool, I don't see the number of active processors go above 10, so I don't think increasing the size of the pool will help. My test VM has 4 cores and a load average of 3.5 over the past minute, and Azure monitoring shows that the VM doesn't go above 50% average CPU usage while NiFi is under load. The disk is currently 70% full.

Up until last month a full test suite would take about 30-40 minutes, and now it's pushing 4 hours. We started noticing tests taking a while shortly after upgrading NiFi from 1.11.4 to 1.12.1. We don't configure ridiculous amounts of concurrent tasks on processors. Is it possible that between 1.11.4 and 1.12.1 NiFi became a lot more CPU intensive?

Thanks,
Eric

On Tue, Nov 24, 2020 at 6:55 AM Mark Payne <[email protected]> wrote:

Eric,

I don’t think there’s really any metric that exposes the specific numbers you’re looking for. Certainly you could run a Java profiler and look at the results to see where all of the time is being spent, but that may get into more detail than you’re comfortable sorting through, depending on your knowledge of Java, profilers, and NiFi internals.

The nifi.bored.yield.duration property is definitely important when you’ve got lots of processors that aren’t really doing anything. You can increase it if you are okay adding potential latency into your dataflow. That said, 10 milliseconds is the default and generally works quite well, even with many thousands of processors. Of course, it also depends on how many CPU cores you have, etc.

As for whether or not increasing the number of timer-driven threads will help, that very much depends on several things: How many threads are being used? How many CPU cores do you have? How many of them are being used?

There is a series of videos on YouTube where I’ve discussed NiFi anti-patterns.
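For reference, the property Mark describes lives in conf/nifi.properties. A hedged sketch of the relevant line follows - "10 millis" is the default he mentions, and the larger value is purely an illustration of the latency-for-CPU trade-off, not a recommendation:

```properties
# conf/nifi.properties
# How long a processor that ran but found no work should yield before
# being scheduled again. Default per the discussion above:
nifi.bored.yield.duration=10 millis

# Illustrative alternative (assumption, tune to taste): with thousands of
# mostly-idle processors, a larger yield cuts idle scheduling churn at the
# cost of added per-hop latency, e.g.:
# nifi.bored.yield.duration=25 millis
```

Note that the Timer-Driven Thread Pool size, by contrast, is set in the UI (Controller Settings), not in this file.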
One of those [1] discusses how to tune the Timer-Driven Thread Pool, which may be helpful to you.

Thanks
-Mark

[1] https://www.youtube.com/watch?v=pZq0EbfDBy4

On Nov 23, 2020, at 7:55 PM, Eric Secules <[email protected]> wrote:

Hello everyone,

I was wondering if there is a metric for the amount of time timer-driven processors spend in a queue, ready and waiting to be run. I use NiFi in an atypical way: my flow has over 2000 processors running on a single node, but there are usually fewer than 10 connections that have one or more FlowFiles in them at any given time. I have a theory that the number of processors in use is slowing down the system overall, but I would need to see some more metrics to know whether that's the case and to tell whether anything I am doing is helping. Are there some logs that I could look for, or internal stats I could poke at with a debugger?

Should I be able to see increased throughput by increasing the number of timer-driven threads, or is there a different mechanism responsible for going through all the runnable processors to see whether they have input to process? I also noticed "nifi.bored.yield.duration" - would it be good to increase the yield duration in this setting?

Thanks,
Eric
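One lightweight way to answer "how many timer-driven threads are actually being used?" without a full profiler is a thread dump (`jstack <pid>`, or `nifi.sh dump`): NiFi names its pool threads "Timer-Driven Process Thread-N", so you can count them by state. A sketch follows; the thread-name prefix and dump layout are assumptions based on typical NiFi/jstack output, so check them against a dump from your own instance:

```python
# Sketch: count NiFi "Timer-Driven Process Thread" entries in a jstack-style
# thread dump, grouped by java.lang.Thread.State. Mostly-RUNNABLE means the
# pool is saturated; mostly TIMED_WAITING/WAITING means threads are idle and
# adding more is unlikely to help. Thread-name prefix is an assumption.

import re
from collections import Counter

def timer_driven_states(thread_dump: str) -> Counter:
    """Tally Timer-Driven pool threads by their reported state."""
    counts = Counter()
    lines = thread_dump.splitlines()
    for i, line in enumerate(lines):
        # jstack prints the thread name line, then the state on the next line
        if '"Timer-Driven Process Thread' in line and i + 1 < len(lines):
            m = re.search(r"java\.lang\.Thread\.State: (\w+)", lines[i + 1])
            if m:
                counts[m.group(1)] += 1
    return counts

# Toy dump standing in for real jstack output:
dump = '''
"Timer-Driven Process Thread-1" #41 daemon prio=5 runnable
   java.lang.Thread.State: RUNNABLE
"Timer-Driven Process Thread-2" #42 daemon prio=5 waiting on condition
   java.lang.Thread.State: TIMED_WAITING (parking)
"Flow Service Tasks Thread-1" #50 daemon prio=5 waiting on condition
   java.lang.Thread.State: WAITING (parking)
'''

print(timer_driven_states(dump))
```

On the toy dump this reports one RUNNABLE and one TIMED_WAITING timer-driven thread, ignoring unrelated threads; repeating the dump a few times under load gives a rough picture of pool utilization.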
