Re: Back pressure deadlock

Andrew Grande Mon, 23 Oct 2017 07:40:14 -0700

I wonder which JMS broker you are using. A JMS destination being full is
absurd; the whole point is to decouple publishers and consumers. I would
additionally look into which JMS broker settings are available to address
the situation.


Andrew

On Mon, Oct 23, 2017, 10:32 AM Arne Degenring <[email protected]>
wrote:

> Hi Mark,
>
> Don’t get me wrong, NiFi is great! Much appreciated that it is constantly
> being improved. It would be great if better support for looping
> connections were one of those improvements in the future :-) In the
> meantime, we can live with one of the solutions you suggested. Thanks for
> describing the options!
>
> Keep up the good work!
> Arne
>
>
> On 23. Oct 2017, at 16:05, Mark Payne <[email protected]> wrote:
>
> Arne,
>
> Fair enough. NiFi could perhaps be smarter about looping connections
> instead of stopping at self-loops.
>
> Another approach to this situation, which I have used, avoids a flow
> that loops like you laid out, with
> PublishJMS -> LogAttribute -> back to PublishJMS.
> You could instead connect the 'failure' relationship to PublishJMS as
> a self-loop and also connect it to the LogAttribute (or alerting
> processor or whatever you have), and then set an age-off on the
> connection to LogAttribute. In this setup, even if the log/alerting
> processor was having trouble, you would not cause back pressure to be
> applied to PublishJMS, because of the age-off. Typically, when sending
> data to some sort of alerting/status publishing processor, age-off is
> appropriate (though granted it may not be 100% of the time).
>
> Another useful approach to consider in such a case may actually be to have
> Reporting Tasks [1] that would monitor the flow for large queues,
> etc. While you can build such monitoring capabilities into the flow, I am
> a fan personally of 'pulling up' this logic out of the flow because it tends
> to result in much cleaner, easier-to-understand, and easier-to-implement
> flows.
>
> So I'm certainly not saying that what NiFi does is correct and perfect and
> can't be improved upon - any solution can probably be improved upon,
> and NiFi is certainly rapidly improving each day. But I wanted to point
> out some ways that you can think about attacking the concerns that you
> have with the current implementation.
>
> Thanks!
> -Mark
>
>
> [1]
> http://nifi.apache.org/docs/nifi-docs/html/user-guide.html#Reporting_Tasks
>
>
>
> On Oct 23, 2017, at 9:45 AM, Arne Degenring <[email protected]>
> wrote:
>
> Hi Mark,
>
> Thanks for clarifying that self-looping connections will still be
> processed in back pressure situations.
>
> For this specific case, we can probably live without the additional
> routing to the logging component and back.
>
> I think, however, that there are cases when such ping-pong routing in
> failure cases can be very useful. E.g. for alerting someone actively,
> publishing some information on a status page, ... etc.
>
> Therefore I feel it would be great if NiFi could be extended to avoid such
> back pressure deadlock situations. Maybe through some kind of automatic
> deadlock detection, or by marking certain incoming relations as not back
> pressure relevant (same as self-looping connections).
>
> Thanks,
> Arne
>
>
> On 23. Oct 2017, at 15:00, Mark Payne <[email protected]> wrote:
>
> Hi Arne,
>
> Generally, the approach that is used in such a situation would be to route
> failure back to the PublishJMS processor
> itself (without diverting first to a LogAttribute processor). The
> PublishJMS processor itself should be logging an error
> with the FlowFile's identity. Then, troubleshooting can be done by
> inspecting the queue (right-click, List Queue) or
> via Data Provenance [1]. When a processor encounters back pressure, it
> will still continue to process data that comes
> in on self-looping connections. So the failure relationship would still
> get processed.
>
> Does this help?
>
> Thanks
> -Mark
>
>
>
> [1]
> http://nifi.apache.org/docs/nifi-docs/html/user-guide.html#data_provenance
>
>
>
> On Oct 23, 2017, at 6:46 AM, Arne Degenring <[email protected]>
> wrote:
>
> Hi,
>
> We came across a situation where we experience a kind of “back pressure
> deadlock”.
>
> In our setup, this occurs around PublishJMS when the target JMS queue is
> full. Please find attached a screenshot of the relevant flow.
>
> We route the failure relation to a logging component, and then back to
> PublishJMS for retry. Sooner or later, the failure and retry queues
> become full and produce back pressure towards the main input (which is
> good). The problem is that the same back pressure is also applied to the
> retry queue.
>
> In this situation, PublishJMS will not be called at all any longer. Even
> when the JMS problem resolves, the whole thing stays deadlocked.
>
> Is there a recommended way to avoid such a situation?
>
> Obviously, an admin can temporarily increase the back pressure threshold
> of the failure connection, once the JMS problem is resolved. But it would
> be nicer if the problem could resolve automatically, i.e. PublishJMS should
> keep retrying somehow.
>
> Any hints?
>
> Thanks,
> Arne
>
>
>
> <backpressure-deadlock.png>
>
>
>
>
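For readers who want to see why the looped flow stays stuck while the
self-loop drains, here is a minimal toy model in Python. It is only a
sketch of the scheduling rule as described in this thread, not NiFi's
actual implementation: it assumes a processor runs only when it has
queued input and none of its outgoing connections is full, with
self-looping connections exempt from that check. The Connection class,
can_run, run_until_idle, and the threshold of 3 are all made up for
illustration. Both scenarios start after the JMS broker has recovered,
so every publish attempt succeeds.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Connection:
    source: str
    dest: str
    threshold: int  # hypothetical back pressure object threshold
    queue: deque = field(default_factory=deque)

    @property
    def full(self):
        return len(self.queue) >= self.threshold

    @property
    def self_loop(self):
        return self.source == self.dest


def can_run(proc, conns):
    """Assumed rule: needs queued input, and no full outgoing
    connection other than a self-loop."""
    has_input = any(c.dest == proc and c.queue for c in conns)
    blocked = any(
        c.source == proc and c.full and not c.self_loop for c in conns
    )
    return has_input and not blocked


def run_until_idle(procs, conns, max_ticks=100):
    """Give each runnable processor one FlowFile per tick.
    Returns how many FlowFiles PublishJMS delivered."""
    delivered = 0
    for _ in range(max_ticks):
        progressed = False
        for proc in procs:
            if not can_run(proc, conns):
                continue
            incoming = next(c for c in conns if c.dest == proc and c.queue)
            flowfile = incoming.queue.popleft()
            progressed = True
            if proc == "PublishJMS":
                delivered += 1  # broker is healthy again, publish succeeds
            else:
                # LogAttribute just forwards to its outgoing connection
                out = next(c for c in conns if c.source == proc)
                out.queue.append(flowfile)
        if not progressed:
            break  # nothing can run any more
    return delivered


# Looped flow: PublishJMS -> LogAttribute -> PublishJMS, both queues full.
looped = [
    Connection("PublishJMS", "LogAttribute", 3, deque(["a", "b", "c"])),
    Connection("LogAttribute", "PublishJMS", 3, deque(["d", "e", "f"])),
]
stuck = run_until_idle(["PublishJMS", "LogAttribute"], looped)

# Self-loop: 'failure' routed straight back to PublishJMS, queue full.
self_looped = [
    Connection("PublishJMS", "PublishJMS", 3, deque(["a", "b", "c"])),
]
drained = run_until_idle(["PublishJMS"], self_looped)

print(stuck, drained)
```

In the looped flow each processor is blocked by the other's full queue,
so nothing is delivered even though the broker is back. In the self-loop
case the full failure queue does not block PublishJMS, and the queue
drains.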
