Christian, FWIW... if you weren't already aware, MergeContent has a
"Defragment" merge strategy that can be applied. It requires the files to
have been previously split within NiFi and the necessary fragment
attributes applied (listed below).
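If memory serves (double-check the MergeContent docs), Defragment expects
each fragment to carry the standard split attributes, which processors like
SplitText/SplitContent/UnpackContent set automatically:

    fragment.identifier - same unique value on every fragment of one original file
    fragment.index      - the fragment's position, used to restore ordering
    fragment.count      - the total number of fragments (same value on all of them)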
Might not meet your needs... but just wanted to throw it out there! (Re
Joe's point below about flowfiles held in an active session: I've pasted a
rough sketch of that failure mode at the bottom, under the quoted thread.)

Cheers,
-Phil-

On Wed, Dec 11, 2024 at 6:23 AM Christian Wahl <christian.w...@abusix.com> wrote:

> Thanks, Joe, for the advice. That sounds plausible. Time to go bug hunting!
>
> On 10. Dec 2024, at 16:32, Joe Witt <joe.w...@gmail.com> wrote:
>
> Thanks - I suspect the custom component is grabbing flowfiles from the
> queue but not processing/handling them. If you know there are 1,000
> flowfiles in the queue, for instance, but you cannot list them, that means
> the downstream processor has them in an active session.
>
> Thanks
>
> On Tue, Dec 10, 2024 at 3:52 PM Christian Wahl <christian.w...@abusix.com> wrote:
>
>> Hi Joe,
>>
>> It is a custom processor that reconstructs a fragmented message, so it's
>> kinda similar to MergeContent. It has the following annotations:
>>
>> @TriggerWhenEmpty
>> @SupportsBatching(defaultDuration = DefaultRunDuration.TWO_SECONDS)
>> @DefaultSchedule(strategy = SchedulingStrategy.TIMER_DRIVEN, period = "2 sec")
>> @DefaultSettings(yieldDuration = "2 sec")
>> @InputRequirement(InputRequirement.Requirement.INPUT_REQUIRED)
>>
>> Unfortunately, I don't have a screenshot showing the problem. Next time
>> it happens, I will capture some.
>>
>> <Screenshot 2024-12-10 at 15.50.23.png>
>>
>> On 10. Dec 2024, at 14:28, Joe Witt <joe.w...@gmail.com> wrote:
>>
>> Christian
>>
>> What processor is following this? (MergeContent/MergeRecord - something
>> else?)
>>
>> Please show an image of what you're seeing.
>>
>> Thanks
>>
>> On Tue, Dec 10, 2024 at 12:27 PM Christian Wahl <christian.w...@abusix.com> wrote:
>>
>>> Hello everyone,
>>>
>>> We're experiencing a peculiar issue with a load-balanced connection.
>>> It partitions files based on an attribute in a 6-node cluster with a
>>> dedicated ZooKeeper cluster. We're running version 1.27.0 and using
>>> volatile FlowFile and Provenance repositories.
>>>
>>> Occasionally, the connection breaks and the queue fills up with "ghost
>>> files": the queue displays 10k flowfiles per affected node (maximum
>>> back pressure), but the processor doesn't process any of them, and
>>> right-clicking the connection to list the queue shows an empty list.
>>>
>>> As a result, the affected nodes accumulate back pressure, causing the
>>> flow to fill up and stop processing.
>>>
>>> A restart resolves the issue, but it's not an ideal solution. So far, I
>>> haven't been able to reproduce the problem.
>>>
>>> Has anyone experienced similar issues, or has an idea about why this is
>>> happening?
>>>
>>> Thanks,
>>> Christian
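P.S. Here is the sketch mentioned above. This is entirely hypothetical code
(class name, bin handling, and batch size are all made up), just to
illustrate what "held in an active session" can look like in a merge-style
processor. A processor that bins fragments typically migrates them into a
long-lived per-bin session, roughly the way MergeContent/BinFiles does.
While such a session stays open, its flowfiles still count toward the
upstream queue size but no longer show up when you list the queue, which
matches the "ghost files" symptom if a bin can never complete:

import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

import org.apache.nifi.flowfile.FlowFile;
import org.apache.nifi.processor.AbstractSessionFactoryProcessor;
import org.apache.nifi.processor.ProcessContext;
import org.apache.nifi.processor.ProcessSession;
import org.apache.nifi.processor.ProcessSessionFactory;
import org.apache.nifi.processor.exception.ProcessException;

public class ReassembleSketch extends AbstractSessionFactoryProcessor {

    // One long-lived session per in-progress message. Flowfiles migrated
    // into it are "unacknowledged": still counted in the upstream queue,
    // but not visible in the queue listing.
    private final Map<String, ProcessSession> bins = new ConcurrentHashMap<>();

    @Override
    public void onTrigger(final ProcessContext context, final ProcessSessionFactory sessionFactory)
            throws ProcessException {
        final ProcessSession session = sessionFactory.createSession();
        final List<FlowFile> fragments = session.get(100);
        if (fragments.isEmpty()) {
            context.yield();
            return;
        }
        for (final FlowFile fragment : fragments) {
            final String id = fragment.getAttribute("fragment.identifier");
            final ProcessSession bin = bins.computeIfAbsent(id, k -> sessionFactory.createSession());
            // Hand the fragment over to the bin's session, which stays open
            // until the message is complete (merge logic omitted here).
            session.migrate(bin, Collections.singleton(fragment));
        }
        session.commit();
        // Failure mode: if a fragment never arrives, or the completion check
        // has a bug, the bin's session is never committed or rolled back and
        // its flowfiles sit in an active session indefinitely -- "ghost
        // files" until a restart clears them.
    }
}

Not saying your processor looks like this, of course -- but if it holds
sessions open across onTrigger calls, that's where I'd look first.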