Yes, it will be a few weeks at least to get the upgrade into the
environment where we see this occurring and evaluate. Part of the problem
is reproducibility. We haven't yet created a scenario that reliably forces
this situation. That's on the list for Monday though. If we can reliably
reproduce, I'm sure we can test much sooner with 1.16.2 to confirm it's
been addressed - even if we don't yet have 1.16.2 in the target environment.

Will report back findings when available.

Thanks,
Mark


On Fri, Jun 10, 2022 at 2:53 PM Joe Witt <joe.w...@gmail.com> wrote:

> Mark
>
> It will be a few weeks before you can evaluate this?
>
> thanks
>
> On Fri, Jun 10, 2022 at 11:03 AM Joe Witt <joe.w...@gmail.com> wrote:
>
> > MarkB
> >
> > That is why MarkP said it was a manifestation. The point is that the issue
> > you noted, specifically the behavior you saw here (and before), is believed
> > to be addressed in that fix, which went into the release 6 months ago and is
> > also in the 1.16.x line. You'll want that, and of course the many other
> > improvements, for improved behavior in this scenario.
> >
> > Thanks
> >
> > On Fri, Jun 10, 2022 at 10:59 AM Mark Bean <mark.o.b...@gmail.com> wrote:
> >
> >> This is not quite the same issue. It's possible the fix for NIFI-9433 may
> >> be related, but the set of circumstances is definitely different. Also,
> >> the observed behavior is different. For example, none of the nodes report
> >> "Cannot create negative queue size".
> >>
> >> I'm trying to track specific FlowFiles from one node to another during
> >> load balancing, and I have been unsuccessful. In other words, I have not
> >> been able to confirm whether a given FlowFile was successfully transferred
> >> or not. Provenance is no longer available for this time period. I know,
> >> these are not good answers for diagnosing the issue.
> >>
> >> My real question is: what is the expected behavior for FlowFiles that are
> >> actively being load balanced when the cluster is shut down?
> >>
> >> We have plans to upgrade as soon as possible, but unfortunately that will
> >> not be for at least a few more weeks due to the need to integrate custom
> >> changes into 1.16.2.
> >>
> >>
> >> On Fri, Jun 10, 2022 at 1:31 PM Mark Payne <marka...@hotmail.com> wrote:
> >>
> >> > Mark,
> >> >
> >> > This is a manifestation of NIFI-9433 [1] that we fixed a while back.
> >> > Recommend you upgrade your installation.
> >> >
> >> > Thanks
> >> > -Mark
> >> >
> >> >
> >> > [1] https://issues.apache.org/jira/browse/NIFI-9433
> >> >
> >> >
> >> > On Jun 10, 2022, at 1:16 PM, Mark Bean <mark.o.b...@gmail.com> wrote:
> >> >
> >> > We have a situation where several FlowFiles have lost their content.
> >> > They still appear on the graph, but any attempt by a processor to access
> >> > the content results in a NullPointerException. The identified content
> >> > claim file is in fact missing from the file system.
> >> >
> >> > Also, there are ERROR log messages indicating the claimant count is a
> >> > negative value:
> >> >
> >> > o.a.n.c.r.c.StandardResourceClaimManager Decremented claimant count for
> >> > StandardResourceClaim[id=1234-567, container=default, section=890] to -1
> >> >
> >> > (There are also some with negative values as low as -4.)
> >> >
> >> > Anecdotally, we suspect this may have been caused by an incomplete
> >> > connection load balance. If this is the case, it is not clear whether the
> >> > content successfully reached another node and the FlowFile simply didn't
> >> > finish cleaning up, or whether the content was prematurely dropped.
> >> >
> >> > It should be noted that the cluster was upgraded/restarted at or about
> >> > the time the errors started. Could a shutdown of NiFi cause data loss if
> >> > a load balance was in progress?
> >> >
> >> > NiFi 1.14.0
> >> >
> >> > Thanks,
> >> > Mark
> >> >
> >> >
> >>
> >
>
