[
https://issues.apache.org/jira/browse/NIFI-8195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17369232#comment-17369232
]
Mark Bean edited comment on NIFI-8195 at 6/25/21, 4:47 PM:
-----------------------------------------------------------
Thanks for catching the issue with the VersionedProcessGroup, [~markap14]. I
agree that it's a problem if the PG pulled from the Registry does not behave
like the one put in. I'll take a look at that and resubmit a PR.
As for the FlowFile Expiration, this ticket takes a different approach from the
discussion in [#4800](https://github.com/apache/nifi/pull/4800). Previously, it
was proposed to have the default expiration in nifi.properties. That was
problematic for several reasons, not the least of which is that it would have
been a global setting: the value would have applied throughout the entire flow,
and changes would have required restarting NiFi. Here, the setting can be
confined to specific portions of the overall flow. Additionally, it can be
modified dynamically without restarting the application.
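To illustrate the scoping described above, here is a minimal sketch in Python.
It is not NiFi code: ProcessGroup, Connection, create_connection and
default_flowfile_expiration are hypothetical names, and the sketch assumes the
group-level value is only a default picked up by newly created connections,
with an explicit per-connection value still taking precedence.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# "0 sec" is how a connection expresses "never expire" (the current default).
NO_EXPIRATION = "0 sec"


@dataclass
class Connection:
    name: str
    flowfile_expiration: str


@dataclass
class ProcessGroup:
    name: str
    # Hypothetical group-level default for newly created connections.
    default_flowfile_expiration: str = NO_EXPIRATION
    connections: List[Connection] = field(default_factory=list)

    def create_connection(
        self, name: str, flowfile_expiration: Optional[str] = None
    ) -> Connection:
        # An explicit per-connection value wins; otherwise use the group default.
        if flowfile_expiration is None:
            flowfile_expiration = self.default_flowfile_expiration
        conn = Connection(name, flowfile_expiration)
        self.connections.append(conn)
        return conn


# Two tenants in the same flow, each with its own policy.
tenant_a = ProcessGroup("tenant-a", default_flowfile_expiration="30 min")
tenant_b = ProcessGroup("tenant-b")  # keeps the conservative default

first = tenant_a.create_connection("ingest -> parse")
other = tenant_b.create_connection("ingest -> parse")

# The default can be changed at runtime; no restart is involved.
tenant_a.default_flowfile_expiration = "10 min"
later = tenant_a.create_connection("parse -> publish")
```

In this sketch only tenant-a is affected by its expiration policy, tenant-b
keeps the default of no expiration, and changing the group default does not
touch connections that already exist.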
Respectfully, I have a valid use case for allowing the expiration to be set at
the Process Group level: flows which have time-based compliance requirements
to consider. In some flows, the data is only allowed to exist on the system for
a specific period of time, either for compliance or for data relevance. For
example, to comply with service level agreements or legal requirements, data
may not be eligible to be processed for longer than X minutes/hours. Also, if
time-sensitive data is not fully processed within X minutes/hours, then its
ultimate result may no longer be relevant. It is far easier for the flow
designer to have an appropriate default value when creating - and operating -
the flow than to have to configure dozens or even hundreds of connections to
the same expiration value. In addition, the expiration value is both set by an
authorized user and visually indicated on the canvas. Therefore, the
possibility of "inadvertent data loss" is transparent, not hidden. Subsequent
data loss falls in the category of user negligence or flow design error - not a
faulty application.
Having said that, I fully recognize your concern, and agree with it. NiFi has
always been as conservative as possible in terms of removing data; this should
continue. I do not believe the proposed feature violates this core tenet of
NiFi. I believe conservative protection of data should not prohibit convenient
design and operation when required for compliance or time-sensitive relevance.
FlowFile expiration within a queue is destructive by its nature. However, its
existence reveals that sometimes expiration and removal are necessary. Most
importantly, that necessity is not always confined to a single queue. Rather,
it may exist within the context of an entire flow for a given tenant.
In summary, I think the expiration feature at the Process Group level enables
both convenient flow design and compliant operation with tenant policies or
service level agreements. In addition, it is purely optional and under the
control of authorized users; default values (and clear documentation) are
consistent with the current default of no expiration. I do not believe a useful
feature should be avoided solely out of concern that a user might be unaware of
the consequences of the expiration parameter.
> Migrate backpressure settings from nifi.properties to Process Group
> -------------------------------------------------------------------
>
> Key: NIFI-8195
> URL: https://issues.apache.org/jira/browse/NIFI-8195
> Project: Apache NiFi
> Issue Type: Improvement
> Components: Core Framework
> Reporter: Mark Bean
> Assignee: Mark Bean
> Priority: Minor
> Time Spent: 1.5h
> Remaining Estimate: 0h
>
> Currently, there are properties in the nifi.properties file that provide
> defaults for backpressure limits of newly created connections:
> nifi.queue.backpressure.count=
> nifi.queue.backpressure.size=
> It is a bit heavy-handed to set these at the application level. It would be
> more appropriate to be configurable at a Process Group level.
> Add configuration properties to the Process Group for these settings. To
> maintain backward compatibility, the properties may remain in
> nifi.properties. If provided in nifi.properties, these values would be
> provided as the default values for newly created Process Groups.