Great! Glad you were able to get it figured out. And thanks for following up!

On Nov 26, 2020, at 12:50 PM, Eric Secules <esecu...@gmail.com> wrote:


Hi Mark,

It was because the main disk was filling up! We increased the disk size to 
128GB and speed improved!

Thanks,
Eric

On Wed., Nov. 25, 2020, 12:34 p.m. Eric Secules <esecu...@gmail.com> wrote:
Hi Mark,

Thanks for the quick response. I grepped the logs and did find several hits! I
will try increasing the disk space from 30GB to 128GB and hopefully that will
speed things up.

2020-11-25 19:33:50,416 INFO [Timer-Driven Process Thread-4] o.a.n.c.repository.FileSystemRepository Unable to write to container default due to archive file size constraints; waiting for archive cleanup
2020-11-25 19:37:10,649 INFO [Timer-Driven Process Thread-9] o.a.n.c.repository.FileSystemRepository Unable to write to container default due to archive file size constraints; waiting for archive cleanup
2020-11-25 19:37:10,877 INFO [Timer-Driven Process Thread-3] o.a.n.c.repository.FileSystemRepository Unable to write to container default due to archive file size constraints; waiting for archive cleanup
2020-11-25 19:37:20,376 INFO [Timer-Driven Process Thread-2] o.a.n.c.repository.FileSystemRepository Unable to write to container default due to archive file size constraints; waiting for archive cleanup
2020-11-25 19:50:11,195 INFO [Timer-Driven Process Thread-3] o.a.n.c.repository.FileSystemRepository Unable to write to container default due to archive file size constraints; waiting for archive cleanup
2020-11-25 19:50:22,974 INFO [Timer-Driven Process Thread-6] o.a.n.c.repository.FileSystemRepository Unable to write to container default due to archive file size constraints; waiting for archive cleanup
2020-11-25 19:50:30,002 INFO [Timer-Driven Process Thread-7] o.a.n.c.repository.FileSystemRepository Unable to write to container default due to archive file size constraints; waiting for archive cleanup
2020-11-25 19:53:31,591 INFO [Timer-Driven Process Thread-6] o.a.n.c.repository.FileSystemRepository Unable to write to container default due to archive file size constraints; waiting for archive cleanup
2020-11-25 19:54:00,707 INFO [Timer-Driven Process Thread-4] o.a.n.c.repository.FileSystemRepository Unable to write to container default due to archive file size constraints; waiting for archive cleanup
2020-11-25 19:54:10,016 INFO [Timer-Driven Process Thread-2] o.a.n.c.repository.FileSystemRepository Unable to write to container default due to archive file size constraints; waiting for archive cleanup
2020-11-25 19:54:11,148 INFO [Timer-Driven Process Thread-3] o.a.n.c.repository.FileSystemRepository Unable to write to container default due to archive file size constraints; waiting for archive cleanup
2020-11-25 19:54:26,104 INFO [Timer-Driven Process Thread-5] o.a.n.c.repository.FileSystemRepository Unable to write to container default due to archive file size constraints; waiting for archive cleanup
tail: '/opt/nifi/nifi-current/logs/nifi-app.log' has been replaced;  following new file
2020-11-25 20:01:27,376 INFO [Timer-Driven Process Thread-10] o.a.n.c.repository.FileSystemRepository Unable to write to container default due to archive file size constraints; waiting for archive cleanup
2020-11-25 20:08:06,446 INFO [Timer-Driven Process Thread-6] o.a.n.c.repository.FileSystemRepository Unable to write to container default due to archive file size constraints; waiting for archive cleanup
2020-11-25 20:08:06,485 INFO [Timer-Driven Process Thread-8] o.a.n.c.repository.FileSystemRepository Unable to write to container default due to archive file size constraints; waiting for archive cleanup
2020-11-25 20:08:07,354 INFO [Timer-Driven Process Thread-3] o.a.n.c.repository.FileSystemRepository Unable to write to container default due to archive file size constraints; waiting for archive cleanup
2020-11-25 20:08:10,001 INFO [Timer-Driven Process Thread-2] o.a.n.c.repository.FileSystemRepository Unable to write to container default due to archive file size constraints; waiting for archive cleanup
2020-11-25 20:08:10,816 INFO [Timer-Driven Process Thread-10] o.a.n.c.repository.FileSystemRepository Unable to write to container default due to archive file size constraints; waiting for archive cleanup
2020-11-25 20:14:11,145 INFO [Timer-Driven Process Thread-9] o.a.n.c.repository.FileSystemRepository Unable to write to container default due to archive file size constraints; waiting for archive cleanup
2020-11-25 20:14:11,150 INFO [Timer-Driven Process Thread-6] o.a.n.c.repository.FileSystemRepository Unable to write to container default due to archive file size constraints; waiting for archive cleanup
2020-11-25 20:14:11,157 INFO [Timer-Driven Process Thread-4] o.a.n.c.repository.FileSystemRepository Unable to write to container default due to archive file size constraints; waiting for archive cleanup
2020-11-25 20:14:20,002 INFO [Timer-Driven Process Thread-3] o.a.n.c.repository.FileSystemRepository Unable to write to container default due to archive file size constraints; waiting for archive cleanup
2020-11-25 20:17:06,638 INFO [Timer-Driven Process Thread-9] o.a.n.c.repository.FileSystemRepository Unable to write to container default due to archive file size constraints; waiting for archive cleanup
2020-11-25 20:17:06,807 INFO [Timer-Driven Process Thread-4] o.a.n.c.repository.FileSystemRepository Unable to write to container default due to archive file size constraints; waiting for archive cleanup
2020-11-25 20:17:06,909 INFO [Timer-Driven Process Thread-8] o.a.n.c.repository.FileSystemRepository Unable to write to container default due to archive file size constraints; waiting for archive cleanup
2020-11-25 20:17:25,955 INFO [Timer-Driven Process Thread-1] o.a.n.c.repository.FileSystemRepository Unable to write to container default due to archive file size constraints; waiting for archive cleanup
2020-11-25 20:21:18,652 INFO [Timer-Driven Process Thread-10] o.a.n.c.repository.FileSystemRepository Unable to write to container default due to archive file size constraints; waiting for archive cleanup
2020-11-25 20:21:20,002 INFO [Timer-Driven Process Thread-5] o.a.n.c.repository.FileSystemRepository Unable to write to container default due to archive file size constraints; waiting for archive cleanup
2020-11-25 20:22:47,868 INFO [Timer-Driven Process Thread-7] o.a.n.c.repository.FileSystemRepository Unable to write to container default due to archive file size constraints; waiting for archive cleanup
2020-11-25 20:22:48,224 INFO [Timer-Driven Process Thread-2] o.a.n.c.repository.FileSystemRepository Unable to write to container default due to archive file size constraints; waiting for archive cleanup
2020-11-25 20:22:48,225 INFO [Timer-Driven Process Thread-8] o.a.n.c.repository.FileSystemRepository Unable to write to container default due to archive file size constraints; waiting for archive cleanup
2020-11-25 20:22:49,451 INFO [Timer-Driven Process Thread-1] o.a.n.c.repository.FileSystemRepository Unable to write to container default due to archive file size constraints; waiting for archive cleanup
2020-11-25 20:26:13,592 INFO [Timer-Driven Process Thread-7] o.a.n.c.repository.FileSystemRepository Unable to write to container default due to archive file size constraints; waiting for archive cleanup
2020-11-25 20:26:13,752 INFO [Timer-Driven Process Thread-3] o.a.n.c.repository.FileSystemRepository Unable to write to container default due to archive file size constraints; waiting for archive cleanup
2020-11-25 20:26:14,363 INFO [Timer-Driven Process Thread-2] o.a.n.c.repository.FileSystemRepository Unable to write to container default due to archive file size constraints; waiting for archive cleanup
2020-11-25 20:26:20,003 INFO [Timer-Driven Process Thread-4] o.a.n.c.repository.FileSystemRepository Unable to write to container default due to archive file size constraints; waiting for archive cleanup


-Eric

On Wed, Nov 25, 2020 at 12:28 PM Mark Payne <marka...@hotmail.com> wrote:
Eric,

As I was reading through your response, I was going to ask how much of the
storage space is used and what nifi.content.repository.archive.max.usage.percentage
is set to. If you look in the logs, I'm guessing you'll see messages like
"Unable to write to container XYZ due to archive file size constraints;
waiting for archive cleanup".

If you're seeing that, what it's basically telling you is that the Content
Repository is applying backpressure to prevent you from running out of disk
space. If you set the "nifi.content.repository.archive.max.usage.percentage"
property to, say, "90%", you'll probably see the performance improve and avoid
the sporadic conditions that you're seeing. But depending on the burstiness of
your data, at 90% you could potentially risk running out of disk space. Of
course, if you're already sitting at 79% disk usage, you may also want to just
increase the amount of disk space that you have.
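
For reference, the relevant entries in conf/nifi.properties would look roughly
like the following; the 90% is only an example value, and the archive lines
shown are the standard content-repository archive settings:

# Keep content archiving enabled so that cleanup can reclaim space.
nifi.content.repository.archive.enabled=true
# Let the content repository fill up to 90% of its partition before archive
# cleanup starts applying backpressure (the default is 50%).
nifi.content.repository.archive.max.usage.percentage=90%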

Thanks
-Mark


> On Nov 25, 2020, at 3:21 PM, Eric Secules <esecu...@gmail.com> wrote:
>
> Thanks for the tips, Mark!
>
> I looked at the summary and there are a fair number of processors at the top
> of the list which create flowfiles from an outside source, but there are also
> several anomalies. For example, some mid-flow processors usually process each
> flowfile in less than a second, but sometimes their average execution time
> balloons to several seconds, and this doesn't correlate with flowfile size. I
> see this behavior on network-IO-bound processors (most IO goes to Docker
> containers on the same host), and even on data processors like ReplaceText
> (full-text regex), where I saw execution time go up to 30 seconds even though
> the input file size and content are always the same (22.5KB). It usually
> takes a couple of milliseconds to process the same file.
>
> I see no backpressure in the flow, but depending on when I look at the
> summary I don't always see the anomalies I mentioned above; sometimes it
> looks totally acceptable. Other times I see processors like ReplaceText
> (which only has one concurrent task) showing as active for 4 of the past 5
> minutes.
>
> I tried looking at the disk metrics in Azure, and it says we aren't near our 
> quota of 120 IOPS and we do have burst capacity of up to 3100 IOPS. During 
> testing we are steady at about 20 IOPS.
>
> All three repositories (content, provenance, flowfile) are stored on the OS
> disk, which is at 79% capacity. Could constant pruning of the content repo be
> the cause of the intermittent slowness? We have the following setting:
> nifi.content.repository.archive.max.usage.percentage=50%
>
> Thanks,
> Eric
>
> On Wed, Nov 25, 2020 at 6:31 AM Mark Payne <marka...@hotmail.com> wrote:
> Eric,
>
> There’s nothing that I know of that went into 1.12.1 that would cause the 
> dataflow to be any slower. And I’d expect to have heard about it from others 
> if there were. There is a chance, though, that a particular processor that 
> you’re using is slower, due to a newer library perhaps, or code changes. Of 
> note, I don’t think it’s necessarily more CPU intensive, if you’re still 
> seeing a load average of only 3.5 - that’s quite low.
>
> My recommendation would be to run your test suite. Give it a good 15 minutes 
> or so to get into the thick of things. Then look at two things to determine 
> where the bottleneck is. You’ll want to look for any backpressure first (the 
> label on the Connection between processors would become red). That’s a dead 
> giveaway of where the bottleneck is, if that’s kicking in. The next thing to 
> check is going to the summary table (global menu, aka hamburger menu, and 
> then Summary). Go to the processors tab and sort by task time. This will tell 
> you which processors are taking the most time to run.
>
> In general, though, if backpressure is being applied, the destination of that 
> connection is the bottleneck. If multiple connections in sequence have 
> backpressure applied, look to the last one in the chain, as it’s causing the 
> backpressure to propagate back. If there is no backpressure applied, then 
> that means that your data flow is able to keep up with the rate of data 
> that’s coming in. So that would imply that the source processor is not able 
> to bring the data in as fast as you’d like. That could be due to NiFi (which 
> would imply your disk is probably not fast enough, since clearly your CPU has 
> more to give) or that the source of the data is not providing the data fast 
> enough, etc. You could also try increasing the number of Concurrent Tasks on 
> the source processor, and perhaps using a second thread will improve the 
> performance.
>
> Thanks
> -Mark
>
>
>> On Nov 24, 2020, at 5:40 PM, Eric Secules <esecu...@gmail.com> wrote:
>>
>> Hi Mark,
>>
>> Watching the video now, and I plan to watch more of the series. Thanks! As
>> for questions:
>>
>> I run NiFi in Docker on my MacBook Pro (the Docker VM gets 10 of my 12
>> cores) and in a smaller test environment, and I am seeing performance issues
>> in both places. The test environment is a Standard D4s v3 (4 vcpus, 16 GiB
>> memory) VM with a single 30GB Premium SSD (120 IOPS, 25 Mbps), where NiFi
>> also runs in a Docker container. Right now we use the default thread pool
>> size (10). Even if I increase the number of threads in the pool, I don't see
>> the number of active processors go above 10 at any given time, so I don't
>> think increasing the size of the pool will help. My test VM has 4 cores and
>> a load average of 3.5 over the past minute, and Azure monitoring shows that
>> the VM doesn't go above 50% average CPU usage while NiFi is under load. The
>> disk is currently 70% full. Up until last month a full test suite took about
>> 30-40 minutes; now it's pushing 4 hours. We started noticing tests taking
>> longer shortly after upgrading NiFi from 1.11.4 to 1.12.1.
>>
>> We don't configure an excessive number of concurrent tasks on processors.
>> Is it possible that NiFi became a lot more CPU intensive between 1.11.4 and
>> 1.12.1?
>>
>> Thanks,
>> Eric
>>
>>
>>
>> On Tue, Nov 24, 2020 at 6:55 AM Mark Payne <marka...@hotmail.com> wrote:
>> Eric,
>>
>> I don’t think there’s really any metric that exposes the specific numbers 
>> you’re looking for. Certainly you could run a Java profiler and look at the 
>> results to see where all of the time is being spent. But that may get into 
>> more details than you’re comfortable sorting through, depending on your 
>> knowledge of Java, profilers, and NiFi internals.
>>
>> The nifi.bored.yield.duration property is definitely an important one when
>> you've got lots of processors that aren't really doing anything. You can
>> increase it if you are okay with adding potential latency to your dataflow.
>> That said, 10 milliseconds is the default and generally works quite well,
>> even with many thousands of processors. Of course, it also depends on how
>> many CPU cores you have, etc.
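>>
>> For reference, the corresponding entry in conf/nifi.properties looks
>> something like this, with 10 millis being the shipped default:
>>
>> # How long a processor yields when it is triggered but finds no work to do.
>> # Raising this reduces idle CPU churn at the cost of some added latency.
>> nifi.bored.yield.duration=10 millis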
>>
>> As for whether or not increasing the number of timer-driven threads will 
>> help, that very much depends on several things. How many threads are being 
>> used? How many CPU cores do you have? How many are being used? There are a 
>> series of videos on YouTube where I’ve discussed nifi anti-patterns. One of 
>> those [1] discusses how to tune the Timer-Driven Thread Pool, which may be 
>> helpful to you.
>>
>> Thanks
>> -Mark
>>
>> [1] https://www.youtube.com/watch?v=pZq0EbfDBy4
>>
>>
>>> On Nov 23, 2020, at 7:55 PM, Eric Secules <esecu...@gmail.com> wrote:
>>>
>>> Hello everyone,
>>>
>>> I was wondering if there is a metric for the amount of time timer-driven
>>> processors spend in a queue ready and waiting to be run. I use NiFi in an 
>>> atypical way and my flow has over 2000 processors running on a single node, 
>>> but there are usually less than 10 connections that have one or more 
>>> flowfiles in them at any given time.
>>>
>>> I have a theory that the number of processors in use is slowing down the 
>>> system overall. But I would need to see some more metrics to know whether 
>>> that's the case and tell whether anything I am doing is helping. Are there 
>>> some logs that I could look for or internal stats I could poke at with a 
>>> debugger?
>>>
>>> Should I be able to see increased throughput by increasing the number of
>>> timer-driven threads, or is there a different mechanism responsible for
>>> going through all the runnable processors to see whether they have input to
>>> process? I also noticed "nifi.bored.yield.duration"; would it be good to
>>> increase the yield duration in this setting?
>>>
>>> Thanks,
>>> Eric
>>
>
