[
https://issues.apache.org/jira/browse/NIFI-12386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
endzeit updated NIFI-12386:
---------------------------
Description:
Flows in Apache NiFi can get quite sophisticated, consisting of long chains
of both {{ProcessGroup}} and {{Processor}} components.
Oftentimes {{Processor}} components, including those in the NiFi standard
bundle, enrich an incoming {{FlowFile}} with additional FlowFile attributes.
This can lead to a fair amount of different FlowFile attributes accumulating
over the FlowFile's lifecycle.
To prevent subsequent {{ProcessGroup}} / {{Processor}} components from
accidentally relying on implementation details of preceding components, a good
practice is to:
# define which FlowFile attributes should exist at selected points in the
{{Flow}}
# reduce the attributes of the FlowFile at the selected point to those defined
This can be achieved by using the
[UpdateAttribute|https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi/nifi-update-attribute-nar/1.23.2/org.apache.nifi.processors.attributes.UpdateAttribute/index.html]
processor of the standard processor bundle.
However, the {{UpdateAttribute}} processor only accepts a regular expression
that defines the set of attributes to remove. The practice outlined above
instead calls for explicitly stating the set of attributes to keep. This can be
expressed as a regular expression as well, but writing the inverted match (for
example with a negative lookahead) is not the easiest endeavor, to put it mildly.
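To illustrate why the inverted expression is awkward, here is a minimal sketch in plain Java of a "remove everything except these names" pattern. The class and method names are hypothetical, and the sketch assumes the expression is matched against the whole attribute name; it does not use or reproduce any NiFi API.

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical helper, not part of NiFi.
final class KeepListRegex {

    // Builds a pattern that matches every attribute name EXCEPT the ones to keep.
    // Each name is quoted so that characters such as '.' are taken literally.
    static Pattern deleteAllExcept(List<String> namesToKeep) {
        String alternatives = namesToKeep.stream()
                .map(Pattern::quote)
                .collect(Collectors.joining("|"));
        // Negative lookahead: fail if the whole name equals one of the alternatives.
        return Pattern.compile("^(?!(?:" + alternatives + ")$).*$");
    }

    public static void main(String[] args) {
        Pattern p = deleteAllExcept(List.of("filename", "mime.type"));
        for (String name : List.of("filename", "mime.type", "filename2", "s3.bucket")) {
            String verdict = p.matcher(name).matches() ? "remove" : "keep";
            System.out.println(name + " -> " + verdict);
        }
    }
}
```

Even for two names, the expression needs quoting, anchoring, and a lookahead, which is the maintenance burden the proposal aims to avoid.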
This issue proposes a new processor {{FilterAttribute}} to be added to the
library of {{{}nifi-standard-processors{}}}, which can be configured with a set
of attributes and removes all attributes of an incoming FlowFile other than the
ones configured.
The processor should
* have a required, non-blank property "Attributes to keep", which takes a list
of attribute names separated by a delimiter, e.g. comma (,).
** leading and trailing whitespace around attribute names should be ignored
** leading or trailing delimiters should be ignored
* have a required, non-blank property "Delimiter", which is used to delimit
the individual attribute names, with a default value of "," (comma).
* have a single relationship "success" to which all FlowFiles are routed,
similar to {{UpdateAttribute}}
* have an
[InputRequirement|https://www.javadoc.io/doc/org.apache.nifi/nifi-api/latest/org/apache/nifi/annotation/behavior/InputRequirement.html]
of
[INPUT_REQUIRED|https://www.javadoc.io/doc/org.apache.nifi/nifi-api/latest/org/apache/nifi/annotation/behavior/InputRequirement.Requirement.html]
* be annotated with
[@SupportsBatching|https://www.javadoc.io/doc/org.apache.nifi/nifi-api/latest/org/apache/nifi/annotation/behavior/SupportsBatching.html]
* be
[@SideEffectFree|https://www.javadoc.io/doc/org.apache.nifi/nifi-api/latest/org/apache/nifi/annotation/behavior/SideEffectFree.html]
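The parsing and filtering rules listed above can be sketched in plain Java as follows. This is only an illustration under assumptions: the class and method names are hypothetical, FlowFile attributes are modelled as an ordinary {{Map}}, and none of the NiFi processor API (properties, relationships, annotations) is shown.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical helper, not part of NiFi.
final class AttributeKeepFilter {

    // Splits the "Attributes to keep" value on the delimiter, trims each name,
    // and drops empty entries, which also covers leading or trailing delimiters.
    static Set<String> parseNames(String attributesToKeep, String delimiter) {
        return Arrays.stream(attributesToKeep.split(Pattern.quote(delimiter)))
                .map(String::trim)
                .filter(name -> !name.isEmpty())
                .collect(Collectors.toSet());
    }

    // Returns a copy of the attributes that contains only the names to keep.
    static Map<String, String> retain(Map<String, String> attributes, Set<String> namesToKeep) {
        Map<String, String> kept = new LinkedHashMap<>(attributes);
        kept.keySet().retainAll(namesToKeep);
        return kept;
    }

    public static void main(String[] args) {
        Set<String> names = parseNames(", filename ,mime.type,,", ",");
        Map<String, String> attributes = new LinkedHashMap<>();
        attributes.put("filename", "report.csv");
        attributes.put("mime.type", "text/csv");
        attributes.put("s3.bucket", "example-bucket");
        System.out.println(retain(attributes, names));
    }
}
```

The delimiter is quoted before splitting so that a delimiter such as "|" or "." is treated literally rather than as a regular expression.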
Some possible extensions might be:
* have a required property "Core attributes", with allowable values of "Keep
UUID only" and "Keep all", with a default of "Keep UUID only"
** an additional allowable value e.g. "Specify behaviour" may be added, which
allows for more customization
* have a required property "Mode", with allowable values of "Retain" and
"Remove", with a default of "Retain"
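One way the two extensions could interact is sketched below in plain Java. All names here are hypothetical, including the {{Mode}} enum and the {{alwaysKept}} parameter; the sketch assumes that "Keep UUID only" translates to an always-kept set containing an attribute named {{uuid}}, which is an assumption and not something this issue specifies.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch, not part of NiFi.
final class FilterModeSketch {

    enum Mode { RETAIN, REMOVE }

    // RETAIN keeps only the listed names; REMOVE drops only the listed names.
    // Names in alwaysKept survive in either mode (assumed "Core attributes" behaviour).
    static Map<String, String> apply(Map<String, String> attributes, Set<String> names,
                                     Mode mode, Set<String> alwaysKept) {
        Map<String, String> result = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : attributes.entrySet()) {
            boolean listed = names.contains(entry.getKey());
            boolean keep = alwaysKept.contains(entry.getKey())
                    || (mode == Mode.RETAIN ? listed : !listed);
            if (keep) {
                result.put(entry.getKey(), entry.getValue());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> attributes = new LinkedHashMap<>();
        attributes.put("uuid", "1234");
        attributes.put("filename", "report.csv");
        attributes.put("s3.bucket", "example-bucket");
        Set<String> names = Set.of("filename");
        Set<String> uuidOnly = Set.of("uuid"); // assumed meaning of "Keep UUID only"
        System.out.println(apply(attributes, names, Mode.RETAIN, uuidOnly));
        System.out.println(apply(attributes, names, Mode.REMOVE, uuidOnly));
    }
}
```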
was:
Flows in Apache NiFi can get quite sophisticated, consisting of long chains
of both {{ProcessGroup}} and {{Processor}} components.
Oftentimes {{Processor}} components, including those in the NiFi standard
bundle, enrich an incoming {{FlowFile}} with additional FlowFile attributes.
This can lead to a fair amount of different FlowFile attributes accumulating
over the FlowFile's lifecycle.
To prevent subsequent {{ProcessGroup}} / {{Processor}} components from
accidentally relying on implementation details of preceding components, a good
practice is to:
# define which FlowFile attributes should exist at selected points in the
{{Flow}}
# reduce the attributes of the FlowFile at the selected point to those defined
This can be achieved by using the
[UpdateAttribute|https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi/nifi-update-attribute-nar/1.23.2/org.apache.nifi.processors.attributes.UpdateAttribute/index.html]
processor of the standard processor bundle.
However, the {{UpdateAttribute}} processor only accepts a regular expression
that defines the set of attributes to remove. The practice outlined above
instead calls for explicitly stating the set of attributes to keep. This can be
expressed as a regular expression as well, but writing the inverted match (for
example with a negative lookahead) is not the easiest endeavor, to put it mildly.
This issue proposes a new processor {{FilterAttributes}} to be added to the
library of {{{}nifi-standard-processors{}}}, which can be configured with a set
of attributes and removes all attributes of an incoming FlowFile other than the
ones configured.
The processor should
* have a required, non-blank property "Attributes to keep", which takes a list
of attribute names separated by a delimiter, e.g. comma (,).
** leading and trailing whitespace around attribute names should be ignored
** leading or trailing delimiters should be ignored
* have a required, non-blank property "Delimiter", which is used to delimit
the individual attribute names, with a default value of "," (comma).
* have a single relationship "success" to which all FlowFiles are routed,
similar to {{UpdateAttribute}}
* have an
[InputRequirement|https://www.javadoc.io/doc/org.apache.nifi/nifi-api/latest/org/apache/nifi/annotation/behavior/InputRequirement.html]
of
[INPUT_REQUIRED|https://www.javadoc.io/doc/org.apache.nifi/nifi-api/latest/org/apache/nifi/annotation/behavior/InputRequirement.Requirement.html]
* be annotated with
[@SupportsBatching|https://www.javadoc.io/doc/org.apache.nifi/nifi-api/latest/org/apache/nifi/annotation/behavior/SupportsBatching.html]
* be
[@SideEffectFree|https://www.javadoc.io/doc/org.apache.nifi/nifi-api/latest/org/apache/nifi/annotation/behavior/SideEffectFree.html]
Some possible extensions might be:
* have a required property "Core attributes", with allowable values of "Keep
UUID only" and "Keep all", with a default of "Keep UUID only"
** an additional allowable value e.g. "Specify behaviour" may be added, which
allows for more customization
* have a required property "Mode", with allowable values of "Retain" and
"Remove", with a default of "Retain"
> Add a FilterAttribute processor
> -------------------------------
>
> Key: NIFI-12386
> URL: https://issues.apache.org/jira/browse/NIFI-12386
> Project: Apache NiFi
> Issue Type: New Feature
> Components: Core Framework
> Reporter: endzeit
> Assignee: endzeit
> Priority: Major
> Time Spent: 10.5h
> Remaining Estimate: 0h
--
This message was sent by Atlassian Jira
(v8.20.10#820010)