[
https://issues.apache.org/jira/browse/NIFI-3497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15895425#comment-15895425
]
ASF GitHub Bot commented on NIFI-3497:
--------------------------------------
Github user alopresto commented on a diff in the pull request:
https://github.com/apache/nifi/pull/1536#discussion_r104275670
--- Diff:
nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/ScanAttribute.java
---
@@ -97,13 +107,23 @@
.addValidator(StandardValidators.createRegexValidator(0, 1, false))
.defaultValue(null)
.build();
+ public static final PropertyDescriptor DICTIONARY_ENTRY_METADATA_DEMARCATOR = new PropertyDescriptor.Builder()
+ .name("Dictionary Entry Metadata Demarcator")
--- End diff --
Yes please. That recommendation was put in place after a lot of the code
already existed, so what we are trying to enforce is that any new properties
follow that model, and that people go back and update existing properties as
necessary. There have been many discussions about it, but one serious issue is
that if a property name is changed (to make it more readable, rename it, fix a
typo, etc.), the value stored in the `flow.xml.gz` file will not be picked up
when the flow is read by NiFi, and that data can be lost. So the combination of
a static `name` and a malleable `displayName` solves this issue.
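To make the failure mode concrete, here is a minimal sketch of the idea. It is a simplified model, not NiFi's actual flow-loading code: `Descriptor`, `load`, and the `savedFlow` map are hypothetical stand-ins, assuming only that saved values are keyed by the property's `name`.

```java
import java.util.HashMap;
import java.util.Map;

public class PropertyNameSketch {

    /** Hypothetical stand-in for a property descriptor: a stable key plus a UI label. */
    static final class Descriptor {
        final String name;         // identifier used as the key in the saved flow
        final String displayName;  // label shown to the user, safe to change

        Descriptor(String name, String displayName) {
            this.name = name;
            this.displayName = displayName;
        }
    }

    /** Looks up a saved value by the descriptor's name (assumed keying scheme). */
    static String load(Map<String, String> savedFlow, Descriptor descriptor) {
        return savedFlow.get(descriptor.name);
    }

    public static void main(String[] args) {
        // Value saved while the property was still called by its original name.
        Map<String, String> savedFlow = new HashMap<>();
        savedFlow.put("Dictionary Entry Metadata Demarcator", ":");

        // Only the label changed: the saved value is still found.
        Descriptor relabeled = new Descriptor(
                "Dictionary Entry Metadata Demarcator", "Metadata Demarcator");
        // The name itself changed: the lookup misses and the value is effectively lost.
        Descriptor renamed = new Descriptor("Metadata Demarcator", "Metadata Demarcator");

        System.out.println("relabeled -> " + load(savedFlow, relabeled));
        System.out.println("renamed   -> " + load(savedFlow, renamed));
    }
}
```

In the processor itself this corresponds to setting both `.name(...)` and `.displayName(...)` on the `PropertyDescriptor.Builder`, and only ever editing the latter.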
> ScanAttribute should support tagging a flowfile with metadata value from the
> supplied dictionary
> ------------------------------------------------------------------------------------------------
>
> Key: NIFI-3497
> URL: https://issues.apache.org/jira/browse/NIFI-3497
> Project: Apache NiFi
> Issue Type: Improvement
> Reporter: Joseph Witt
> Assignee: Joseph Witt
>
> Today ScanAttribute just looks through the supplied dictionary and given
> object for a matching string. If it finds one, it is a match; otherwise it
> is a 'not found'. However, when a hit occurs it can often be quite useful to
> gather additional metadata about that hit. This makes cases like
> enrichment/tagging much easier.
> So, the plan is to have ScanAttribute support a dictionary value demarcator
> which would separate the dictionary term from some string response that will
> be added to the flowfile. For instance, a dictionary might have
> apples:These are red or green
> bananas:These are yellow unless you should toss them or make bread
> Then if a hit occurs on 'apples', the flowfile that contained such an
> attribute would have new attributes such as 'dictionary.hit.term' = 'apples'
> and 'dictionary.hit.metadata' = 'These are red or green'.
> This means downstream processors could extract that metadata and do
> interesting things with it.
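
The behaviour proposed in the issue description can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from the pull request: the class and method names are hypothetical, the split is assumed to happen at the first occurrence of a non-empty demarcator, and the attribute names are the ones the description offers as examples.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class DictionaryMetadataSketch {

    /**
     * Parses dictionary lines of the form term + demarcator + metadata.
     * Assumption: split at the FIRST occurrence of a non-empty demarcator;
     * lines without a demarcator are terms with empty metadata.
     */
    static Map<String, String> parseDictionary(List<String> lines, String demarcator) {
        Map<String, String> dictionary = new LinkedHashMap<>();
        for (String line : lines) {
            int idx = line.indexOf(demarcator);
            if (idx < 0) {
                dictionary.put(line, "");
            } else {
                String term = line.substring(0, idx);
                String metadata = line.substring(idx + demarcator.length());
                dictionary.put(term, metadata);
            }
        }
        return dictionary;
    }

    /** Returns the attributes to add on a hit, or an empty map when the value is not found. */
    static Map<String, String> hitAttributes(Map<String, String> dictionary, String attributeValue) {
        Map<String, String> added = new LinkedHashMap<>();
        if (dictionary.containsKey(attributeValue)) {
            added.put("dictionary.hit.term", attributeValue);
            added.put("dictionary.hit.metadata", dictionary.get(attributeValue));
        }
        return added;
    }

    public static void main(String[] args) {
        Map<String, String> dictionary = parseDictionary(Arrays.asList(
                "apples:These are red or green",
                "bananas:These are yellow unless you should toss them or make bread"), ":");

        // A flowfile attribute whose value is 'apples' produces both hit attributes.
        System.out.println(hitAttributes(dictionary, "apples"));
        // A value that is not in the dictionary produces no new attributes.
        System.out.println(hitAttributes(dictionary, "pears"));
    }
}
```

Splitting at the first demarcator keeps any later occurrences inside the metadata string, which seems the least surprising choice for free-text descriptions; the actual processor may decide differently.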
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)