Re: possible load balancing issue

Joe Witt Fri, 10 Jun 2022 11:53:14 -0700

Mark

It will be a few weeks before you can evaluate this?


thanks

On Fri, Jun 10, 2022 at 11:03 AM Joe Witt <joe.w...@gmail.com> wrote:

> MarkB
>
> That is why MarkP said it was a manifestation.  The point is that the issue
> you noted, specifically the behavior you saw here (and before), is believed
> to be addressed by that fix, which went into the release 6 months ago and is
> also in the 1.16.x line.  You'll want that fix, and of course the many other
> improvements, to get better behavior in this scenario.
>
> Thanks
>
> On Fri, Jun 10, 2022 at 10:59 AM Mark Bean <mark.o.b...@gmail.com> wrote:
>
>> This is not quite the same issue. It's possible the fix for NIFI-9433 may
>> be related. But the set of circumstances is definitely different. Also,
>> the observed behavior is different. For example, none of the nodes report
>> "Cannot create negative queue size".
>>
>> I'm trying to track specific FlowFile(s) from one node to another during
>> load balancing. And, I have been unsuccessful. In other words, I have not
>> been able to confirm whether a given FlowFile was successfully transferred
>> or not. Provenance is no longer available for this time period. I know,
>> not good answers for diagnosing the issue.
>>
>> My real question is: what is the expected behavior for FlowFiles that are
>> actively load balancing when the cluster is shut down?
>>
>> We have plans to upgrade as soon as possible, but unfortunately, that will
>> not be for at least a few more weeks due to the need to integrate custom
>> changes into 1.16.2.
>>
>>
>> On Fri, Jun 10, 2022 at 1:31 PM Mark Payne <marka...@hotmail.com> wrote:
>>
>> > Mark,
>> >
>> > This is a manifestation of NIFI-9433 [1] that we fixed a while back.
>> > Recommend you upgrade your installation.
>> >
>> > Thanks
>> > -Mark
>> >
>> >
>> > [1] https://issues.apache.org/jira/browse/NIFI-9433
>> >
>> >
>> > On Jun 10, 2022, at 1:16 PM, Mark Bean <mark.o.b...@gmail.com> wrote:
>> >
>> > We have a situation where several flowfiles have lost their content. They
>> > still appear on the graph, but any attempt by a processor to access content
>> > results in a NullPointerException. The identified content claim file is in
>> > fact missing from the file system.
>> >
>> > Also, there are ERROR log messages indicating the claimant count is a
>> > negative value.
>> >
>> > o.a.n.c.r.c.StandardResourceClaimManager Decremented claimant count for
>> > StandardResourceClaim[id=1234-567, container=default, section=890] to -1
>> >
>> > (There are also some with negative values as low as -4.)
>> >
>> > Anecdotally, we suspect this may have been caused by an incomplete
>> > connection load balance. If this is the case, it is not clear whether the
>> > content successfully reached another Node and the FlowFile simply didn't
>> > finish cleaning up, or whether content was prematurely dropped.
>> >
>> > It should be noted that the cluster was upgraded/restarted at or about the
>> > time the errors started. Could a shutdown of NiFi cause data loss if a load
>> > balance was currently in progress?
>> >
>> > NiFi 1.14.0
>> >
>> > Thanks,
>> > Mark
>> >
>> >
>>
>
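
For anyone trying to gauge how widespread the negative claimant counts are, a minimal sketch for scanning log lines is below. It assumes the message layout matches the ERROR line quoted in the original report; the function and type names (find_negative_claims, NegativeClaim) are hypothetical and not part of NiFi. The pattern accepts any spelling of the container field that begins with "contain".

```python
import re
from typing import Iterable, Iterator, NamedTuple


class NegativeClaim(NamedTuple):
    """One resource claim whose claimant count was decremented below zero."""
    claim_id: str
    container: str
    section: str
    count: int


# Assumption: the log message looks like the line quoted in the thread, e.g.
# "Decremented claimant count for StandardResourceClaim[id=..., container=..., section=...] to -1"
_PATTERN = re.compile(
    r"Decremented claimant count for\s+"
    r"StandardResourceClaim\[id=(?P<id>[^,\]]+),\s*"
    r"contain\w*=(?P<container>[^,\]]+),\s*"
    r"section=(?P<section>[^,\]]+)\]\s+to\s+(?P<count>-?\d+)"
)


def find_negative_claims(lines: Iterable[str]) -> Iterator[NegativeClaim]:
    """Yield every claim in `lines` whose reported claimant count is negative."""
    for line in lines:
        match = _PATTERN.search(line)
        if match and int(match["count"]) < 0:
            yield NegativeClaim(
                match["id"],
                match["container"],
                match["section"],
                int(match["count"]),
            )
```

To use it, pass any iterable of log lines, for example an open file handle for the application log, and collect the yielded claim ids to see which content claims are affected.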
