Arne,

Fair enough. NiFi could perhaps be smarter about looping connections instead of stopping at self-loops.

Another approach to this situation, which I have used, would be this: rather than having a flow that loops as you laid out (PublishJMS -> LogAttribute -> back to PublishJMS), connect the 'failure' relationship to PublishJMS as a self-loop and also to the LogAttribute (or alerting processor, or whatever you have), and then set an age-off on that connection (the one to LogAttribute). In this setup, even if the log/alerting processor were having trouble, you would not cause back pressure to be applied to PublishJMS, because of the age-off. Typically, when sending data to some sort of alerting or status-publishing component, age-off is appropriate (though granted, maybe not 100% of the time).

Another useful approach to consider in such a case may be to have Reporting Tasks [1] that monitor the flow for large queues, etc. While you can build such monitoring capabilities into the flow, I am personally a fan of 'pulling up' this logic out of the flow, because it tends to result in much cleaner, easier-to-understand, and easier-to-implement flows.

So I'm certainly not saying that what NiFi does is correct and perfect and can't be improved upon. Any solution can probably be improved upon, and NiFi is certainly improving rapidly each day. But I wanted to point out some ways that you can think about attacking the concerns that you have with the current implementation.

Thanks!
-Mark

[1] http://nifi.apache.org/docs/nifi-docs/html/user-guide.html#Reporting_Tasks

On Oct 23, 2017, at 9:45 AM, Arne Degenring <[email protected]> wrote:

Hi Mark,

Thanks for clarifying that self-looping connections will still be processed in back pressure situations.

For this specific case, we can probably live without the additional routing to the logging component and back. I think, however, that there are cases when such ping-pong routing in failure cases can be very useful. E.g.
for alerting someone actively, publishing some information on a status page, etc. Therefore I feel it would be great if NiFi could be extended to avoid such back pressure deadlock situations, maybe through some kind of automatic deadlock detection, or by marking certain incoming relations as not relevant for back pressure (same as self-looping connections).

Thanks,
Arne

On 23. Oct 2017, at 15:00, Mark Payne <[email protected]> wrote:

Hi Arne,

Generally, the approach used in such a situation is to route failure back to the PublishJMS processor itself (without diverting first to a LogAttribute processor). The PublishJMS processor itself should be logging an error with the FlowFile's identity. Then, troubleshooting can be done by inspecting the queue (right-click, List Queue) or via Data Provenance [1].

When a processor encounters back pressure, it will still continue to process data that comes in on self-looping connections. So the failure relationship would still get processed.

Does this help?

Thanks
-Mark

[1] http://nifi.apache.org/docs/nifi-docs/html/user-guide.html#data_provenance

On Oct 23, 2017, at 6:46 AM, Arne Degenring <[email protected]> wrote:

Hi,

We came across a situation in which we experience a kind of “back pressure deadlock”. In our setup, this occurs around PublishJMS when the target JMS queue is full.

Please find attached a screenshot of the relevant flow. We route the failure relation to a logging component, and then back to PublishJMS for retry. Sooner or later, the failure and retry queues become full and produce back pressure towards the main input (which is good). The problem is that the same back pressure is also applied to the retry queue. In this situation, PublishJMS is no longer called at all. Even when the JMS problem resolves, the whole thing stays deadlocked.

Is there a recommended way to avoid such a situation? Obviously, an admin can temporarily increase the back pressure threshold of the failure connection once the JMS problem is resolved. But it would be nicer if the problem could resolve automatically, i.e. PublishJMS should keep retrying somehow.

Any hints?

Thanks,
Arne

<backpressure-deadlock.png>
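
The two layouts discussed in this thread can be compared with a small toy model. This is only a sketch and not NiFi's actual scheduler: the `simulate` function, the `THRESHOLD` value, the one-FlowFile-per-connection batch, and the tick-based scheduling are all assumptions made for illustration. The only rules taken from the thread are that a full outgoing connection stops a processor from running, and that a processor under back pressure still processes data arriving on a self-looping connection.

```python
from collections import deque

THRESHOLD = 3  # toy back pressure object threshold, applied to every connection


def is_full(queue):
    return len(queue) >= THRESHOLD


def simulate(self_loop, outage_ticks=30, total_ticks=100):
    """Run the toy flow; return how many FlowFiles reached JMS.

    self_loop=False: PublishJMS -failure-> LogAttribute -> PublishJMS (ping-pong).
    self_loop=True:  PublishJMS -failure-> PublishJMS (self-looping connection).
    """
    incoming = deque()  # upstream -> PublishJMS
    failure = deque()   # PublishJMS 'failure' relationship
    retry = deque()     # LogAttribute -> PublishJMS (ping-pong layout only)
    delivered = 0
    next_id = 0

    for tick in range(total_ticks):
        jms_up = tick >= outage_ticks  # the JMS target is unavailable at first

        # Upstream source: held back while its outgoing connection is full.
        if not is_full(incoming):
            incoming.append(next_id)
            next_id += 1

        # LogAttribute (ping-pong only): held back while 'retry' is full.
        if not self_loop and failure and not is_full(retry):
            retry.append(failure.popleft())

        # PublishJMS: pick the incoming connections it may read from this tick.
        if self_loop:
            # Assumed rule: under back pressure only the self-loop is served.
            sources = [failure] if is_full(failure) else [incoming, failure]
        else:
            # A full 'failure' connection stops PublishJMS entirely.
            sources = [] if is_full(failure) else [incoming, retry]

        # Take at most one FlowFile per readable connection, then publish.
        batch = [q.popleft() for q in sources if q]
        for flowfile in batch:
            if jms_up:
                delivered += 1
            else:
                failure.append(flowfile)

    return delivered
```

Under these assumptions, `simulate(self_loop=False)` ends with both the failure and retry queues at the threshold, so neither processor runs again and nothing is delivered even after the outage ends, while `simulate(self_loop=True)` resumes delivery once JMS is back. The age-off variant mentioned above is not modeled here.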
