[
https://issues.apache.org/jira/browse/NIFI-90?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14241646#comment-14241646
]
Mark Payne commented on NIFI-90:
--------------------------------
Benefits and drawbacks to consider:
Benefit:
Developers often forget to call the penalize method, so FlowFiles frequently
are not penalized when they should be. Automatic penalization fixes this.
Drawback:
Take for example a PutSFTP processor. If we transfer a 1 GB file and then fail
to rename it, the current implementation penalizes the FlowFile immediately.
With the new implementation, penalization only kicks in after repeated visits
to the same connection, so we may have to transfer 3 GB of data before
penalizing. In the meantime, many other FlowFiles could have been processed in
the time it took to transfer those 3 GB.
Drawback:
A complex flow could consist of something like DistributeLoad ->
UpdateAttribute -> RouteOnAttribute -> PutSFTP -> back to DistributeLoad. In
this case, we may never penalize when we should, because the loop could be 6
hops long, which is more than the 5 or so connections the proposed map tracks.
Other possible solutions we should consider:
* Allow developers to mark Relationships as 'penalizable' rather than having to
call penalize each time.
* Allow penalization to be set on Connections instead, so that everything that
is put onto that connection is penalized. This way, if we have a flow that
pushes to DistributeLoad, all is okay, but if the FlowFile then fails to be
pushed to an external system, for instance, the Connection that carries the
'failure' relationship can be penalized.
* Allow for both explicit penalization and automatic penalization
> Replace explicit penalization with automatic penalization/back-off
> ------------------------------------------------------------------
>
> Key: NIFI-90
> URL: https://issues.apache.org/jira/browse/NIFI-90
> Project: Apache NiFi
> Issue Type: Improvement
> Components: Core Framework
> Reporter: Joseph Witt
> Priority: Minor
>
> Rather than having users configure explicit penalization periods and
> requiring developers to implement it in their processors we can automate
> this. Perhaps keep a LinkedHashMap<Connection ID, Counter> of size 5 or so
> in the FlowFileRecord construct. When a FlowFile is routed to a Connection,
> the counter is incremented. If the counter exceeds 3 visits to the same
> connection, the FlowFile will be automatically penalized. This protects us
> "5 hops out" so that if we have something like DistributeLoad -> PostHTTP
> with failure looping back to DistributeLoad, we will still penalize when
> appropriate.
> In addition, we will remove the configuration option from the UI, setting the
> penalization period to some default such as 5 seconds.
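
For discussion, here is a minimal sketch of the counter described in the
issue above. It is not NiFi code: the class name ConnectionVisitTracker, the
recordVisit method, and the use of plain String connection IDs are all
assumptions made for illustration. It keeps a bounded LinkedHashMap of
connection ID to visit count and reports when a count exceeds the threshold.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical per-FlowFile visit tracker; an illustration, not part of the NiFi API. */
final class ConnectionVisitTracker {
    private final int threshold;
    private final Map<String, Integer> visits;

    ConnectionVisitTracker(final int maxTracked, final int threshold) {
        this.threshold = threshold;
        // Insertion-ordered map that evicts its oldest entry once it holds
        // more than maxTracked connection IDs.
        this.visits = new LinkedHashMap<String, Integer>() {
            @Override
            protected boolean removeEldestEntry(final Map.Entry<String, Integer> eldest) {
                return size() > maxTracked;
            }
        };
    }

    /**
     * Records one routing of the FlowFile to the given connection.
     * Returns true when the count for that connection exceeds the threshold,
     * i.e. when the FlowFile would be automatically penalized.
     */
    boolean recordVisit(final String connectionId) {
        final int count = visits.getOrDefault(connectionId, 0) + 1;
        visits.put(connectionId, count);
        return count > threshold;
    }
}
```

With maxTracked = 5 and threshold = 3, a FlowFile bouncing between two
connections is flagged on its 7th routing (the 4th visit to the first
connection). A FlowFile cycling through six distinct connections is never
flagged, because each connection's entry is evicted before the FlowFile comes
back to it. That is the "6 hops out" drawback raised in the comment.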