Mark Payne created NIP-6:
----------------------------
Summary: Allow Processors to understand state of incoming
connections' backpressure
Key: NIP-6
URL: https://issues.apache.org/jira/browse/NIP-6
Project: NiFi Improvement Proposal
Issue Type: New Feature
Reporter: Mark Payne
*Motivation and description*
There are cases when we need to know whether or not backpressure is applied on
incoming connections. For example, in MergeContent, when a Correlation
Attribute is used, it is very easy to have a situation in which the Min Number
of Entries or Min Bin Size is not met, but in which backpressure is applied
upstream. In this case, we have to just wait for the Max Bin Age to elapse in
order for MergeContent to progress. This introduces significant artificial
delays into our dataflow. And while Max Bin Age is always recommended, it's not
required, so the flow could potentially block indefinitely it not properly
configured.
Instead, it would make sense for MergeContent to determine "all upstream
connections have backpressure applied" and as a result immediately merge the
"most full bin" in order to make progress. This information is not currently
available, though. We need to surface this information via the
{{{}ProcessContext{}}}.
*Scope*
Implementing this will require that the following method be added to
{{{}ProcessContext{}}}:
{code:java}
BackpressureEngagement getBackpressureEngagement(); {code}
where {{BackpressureEngagement}} is defined as:
{code:java}
public enum BackpressureEngagement {
// No upstream connections have backpressure engaged
BACKPRESSURE_NOT_ENGAGED,
// At least one upstream connection has backpressure engaged but not all
BACKPRESSURE_PARTIALLY_ENGAGED,
// All upstream connections have backpressure engaged
BACKPRESSURE_ENGAGED;
} {code}
This, naturally, means that the the method must also be implemented in the
{{StandardProcessContext}} as well as {{ConnectableProcessContext}} and
{{{}MockProcessContext{}}}.
In order to allow proper testing of components, we must also update the
{{TestRunner}} interface and {{StandardProcessorTestRunner}} implementation to
accommodate this with a setter method:
{code:java}
public void setBackpressureEngagement(BackpressureEngagement engagement); {code}
The default value in {{StandardProcessorTestRunner}} should be
{{{}BackpressureEngagement.BACKPRESSURE_NOT_ENGAGED{}}}.
*Alternatives Considered*
As outlined in NIFI-14366, in addition to the prescribed values for
{{BackpressureEngagement}} the following alternatives were considered:
{code:java}
- SINGLE
- IN_A_RELATIONSHIP
- ITS_COMPLICATED
- ENGAGED {code}
Ultimately, I think we should avoid this nomenclature for the following reasons:
* May lead to confusion. I fear we would also need to introduce a value of
"GHOSTED" which would be confusing because it means something very different
from when a Processor or Controller Service is Ghosted.
* Backpressure is tied to a Connection, and Connections are already often
confused with Processor Relationships. This may well further blur that line and
lead to more confusion.
* I'm concerned about the complexity of implementing `ITS_COMPLICATED`. Once
we start factoring in the emotional state of each connection, well... it gets
complicated.
* I'd be concerned about "API sprawl" - for instance, will someone then want
to introduce a `breakup()` method to set the status to SINGLE? The error
handling there could be a nightmare.
It was also suggested that we name the method {{amITheProblem}} with potential
return values of {{YES_YOU_ARE_SLOW}} or {{{}NO_ITS_ME{}}}.
My main concern here is that the value can potentially change rather quickly.
Imagine the logs, indicating why backpressure is engaged:
{code:java}
Backpressure engaged? YES_YOU_ARE_SLOW
Backpressure engaged? NO_ITS_ME
Backpressure engaged? YES_YOU_ARE_SLOW
Backpressure engaged? NO_ITS_ME
Backpressure engaged? YES_YOU_ARE_SLOW
Backpressure engaged? NO_ITS_ME
Backpressure engaged? YES_YOU_ARE_SLOW
Backpressure engaged? NO_ITS_ME{code}
We don't need people thinking of our api is trapped in a loop of self-doubt and
finger-pointing.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)