Hi Joe,

It's pretty fixed-size objects at a fixed interval: one ~5 MB file
that we break down into individual rows.

I went so far as to create a "stress test" where I have a
GenerateFlowFile (creating a fixed 100k file, in batches of 1000,
every 0.1 s) feeding right into a PutFile. I wanted to see the
sustained max. It was very stable and fast for over a week of running -
but now it's extremely slow. That was about as simple a data flow as I
could think of while still hitting the different resources (CPU,
memory, ...).
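For context on scale, a rough ceiling implied by that generator schedule (assuming "100k" means a ~100 KB FlowFile; treat the numbers as back-of-envelope):

```shell
# A batch of 1000 FlowFiles every 0.1 s = ten batches per second.
awk 'BEGIN {
  files_per_sec = 1000 * 10
  mb_per_sec    = files_per_sec * 100 / 1024   # ~100 KB per FlowFile
  printf "files/s=%d MB/s=%.0f\n", files_per_sec, mb_per_sec
}'
# -> files/s=10000 MB/s=977
```

That's far more than a PutFile on network-backed volumes will ever sustain, which is the point - the schedule was effectively unbounded, so the observed rate was whatever the node could actually do.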

I was thinking maybe it was memory too, but it's slow right from the
start when NiFi comes up. I would expect a memory problem to make it
slower over time, and the stress test showed it wasn't something that
was creeping in over time.
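For anyone who wants to double-check the GC theory the same way, this is roughly what I run inside the NiFi pod (a sketch - it assumes a JDK image so jstat is on the PATH, and that pgrep can see the org.apache.nifi.NiFi main class):

```shell
check_gc() {
  local pid
  pid=$(pgrep -f org.apache.nifi.NiFi | head -n 1)
  [ -n "$pid" ] || { echo "NiFi JVM not found"; return 1; }
  # O = old-gen occupancy %, FGC/FGCT = full-GC count/time. A flat O and a
  # static FGC count while throughput is bad would rule out memory pressure.
  jstat -gcutil "$pid" 5000 6   # sample every 5 s, six samples
}
command -v jstat >/dev/null && check_gc || echo "jstat not available (needs a JDK)"
```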

I'm happy to build other flows anyone can suggest to help troubleshoot
and diagnose the issue.

Lars,

We haven't changed it between when performance was good and now, when
it's slow. That is what is throwing me - nothing changed from a NiFi
configuration standpoint.
My guess is we are having some throttling/resource contention from our
provider but I can't determine what/where/how. The Grafana cluster
dashboards I have don't indicate issues. If there are suggestions for
specific cluster metrics to plot/dashboards to use, I'm happy to build them
and contribute them back (I do have a dashboard I need to figure out how to
share for creating the "status history" plots in Grafana).
The repos aren't full and I tried even blowing them away just to see if
that made a difference.
I'm not seeing anything new in the logs that indicates an issue... but
maybe I'm missing it, so I will look again.
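For reference, this is the quick scan I'm doing over nifi-app.log (the path is my deployment's default, and the extra patterns - repository checkpointing and back pressure messages - are just examples of slow-disk symptoms, not an exhaustive list):

```shell
LOG_DIR=${NIFI_LOG_DIR:-/opt/nifi/nifi-current/logs}
scan_nifi_logs() {
  # Pull warnings/errors plus common slow-repository symptoms from the app
  # log (including rolled files), keeping only the most recent matches.
  grep -hEi 'WARN|ERROR|checkpoint|back ?pressure' "$LOG_DIR"/nifi-app*.log* 2>/dev/null | tail -n 50
}
scan_nifi_logs
```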

By chance, are there any low-level debugging metrics/observability
hooks that would show how long operations like writing to the
repository disks are taking? Part of me feels this could be a disk I/O
resource issue, but I don't know how to verify whether that is or
isn't the problem.
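Two things I can run to poke at that theory, in case someone can sanity-check the approach: NiFi's built-in diagnostics dump, and raw device latency from iostat. The paths are my deployment's defaults (adjust NIFI_HOME), and iostat comes from the sysstat package:

```shell
NIFI_HOME=${NIFI_HOME:-/opt/nifi/nifi-current}

dump_diagnostics() {
  # Writes a bundle of JVM, GC, repository and storage stats to a file.
  "$NIFI_HOME/bin/nifi.sh" diagnostics /tmp/nifi-diagnostics.txt
}

watch_disk_latency() {
  # The await/w_await columns are average ms per I/O request; a big jump on
  # the repository volumes versus a healthy baseline points at throttled or
  # contended storage.
  iostat -dxm 5 3
}

command -v iostat >/dev/null && watch_disk_latency || echo "install sysstat for iostat"
```

The same storage and heap numbers are also exposed over REST at /nifi-api/system-diagnostics, which might be easier to scrape into the Grafana dashboards.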

Thank you all for the help and suggestions - please keep them coming as I'm
grasping at straws right now.

-Aaron


On Wed, Jan 10, 2024 at 10:10 AM Joe Witt <[email protected]> wrote:

> Aaron,
>
> The usual suspects are memory consumption leading to high GC leading to
> lower performance over time, or back pressure in the flow, etc.. But your
> description does not really fit either exactly.  Does your flow see a mix
> of large objects and smaller objects?
>
> Thanks
>
> On Wed, Jan 10, 2024 at 10:07 AM Aaron Rich <[email protected]> wrote:
>
>> Hi all,
>>
>>
>>
>> I’m running into an odd issue and hoping someone can point me in the
>> right direction.
>>
>>
>>
>> I have NiFi 1.19 deployed in a Kube cluster with all the repositories
>> volume mounted out. It was processing great, with processors like
>> UpdateAttribute sending through 15K/5m and PutFile sending through 3K/5m.
>>
>>
>>
>> With nothing changing in the deployment, the performance has dropped to
>> UpdateAttribute doing 350/5m and PutFile to 200/5m.
>>
>>
>>
>> I’m trying to determine what resource is suddenly dropping our
>> performance like this. I don’t see anything on the Kube monitoring that
>> stands out and I have restarted, cleaned repos, changed nodes but nothing
>> is helping.
>>
>>
>>
>> I was hoping there is something from the NiFi POV that can help identify
>> the limiting resource. I'm not sure if there is additional
>> diagnostic/debug/etc information available beyond the node status graphs.
>>
>>
>>
>> Any help would be greatly appreciated.
>>
>>
>>
>> Thanks.
>>
>>
>>
>> -Aaron
>>
>
