[
https://issues.apache.org/jira/browse/NIFI-3497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15896339#comment-15896339
]
ASF GitHub Bot commented on NIFI-3497:
--------------------------------------
Github user joewitt commented on a diff in the pull request:
https://github.com/apache/nifi/pull/1564#discussion_r104315717
--- Diff:
nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/ScanAttribute.java
---
@@ -99,12 +114,26 @@
.defaultValue(null)
.build();
+ private static final Validator characterValidator = new StandardValidators.StringLengthValidator(1, 1);
+
+ public static final PropertyDescriptor DICTIONARY_ENTRY_METADATA_DEMARCATOR = new PropertyDescriptor.Builder()
+ .name("dictionary-entry-metadata-demarcator")
+ .displayName("Dictionary Entry Metadata Demarcator")
+ .description("A single character used to demarcate the dictionary entry string between dictionary value and metadata.")
--- End diff --
Consider adding to the description what happens if no demarcator is
provided. For example, "If no demarcator is specified then there will be
no hit metadata and the entire dictionary term will be matched."
> ScanAttribute should support tagging a flowfile with metadata value from the
> supplied dictionary
> ------------------------------------------------------------------------------------------------
>
> Key: NIFI-3497
> URL: https://issues.apache.org/jira/browse/NIFI-3497
> Project: Apache NiFi
> Issue Type: Improvement
> Reporter: Joseph Witt
> Assignee: Joseph Witt
>
> Today ScanAttribute just looks through the supplied dictionary and the given
> object for a matching string. If one is found it is a match; otherwise it
> is 'not found'. However, when a hit occurs it is often useful to
> gather additional metadata about that hit. This makes cases like
> enrichment/tagging much easier.
> So, the plan is to have ScanAttribute support a dictionary value demarcator which
> would separate the dictionary term from some string response that will be
> added to the flowfile. For instance a dictionary might have
> apples:These are red or green
> bananas:These are yellow unless you should toss them or make bread
> Then if a hit occurs on 'apples' the flowfile that contained such an
> attribute would have a new attribute such as 'dictionary.hit.term' = 'apples'
> and 'dictionary.hit.metadata' = 'These are red or green'.
> This means downstream processors could extract that metadata and do
> interesting things with it.
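
To make the proposal concrete, here is a minimal standalone sketch of the behaviour described above. It is not the code from the pull request: the class `DictionarySketch` and the methods `parse` and `scan` are hypothetical names chosen for illustration, and splitting on the first occurrence of the demarcator is an assumption. The attribute names are the ones floated in the issue description. When no demarcator is supplied, the whole line is treated as the term and no metadata attribute is produced, as suggested in the review comment.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not the ScanAttribute implementation.
class DictionarySketch {

    // Parses dictionary lines into term -> metadata.
    // With a null demarcator (or a line that lacks it), the whole line is the
    // term and the metadata is null.
    static Map<String, String> parse(List<String> lines, String demarcator) {
        Map<String, String> entries = new LinkedHashMap<>();
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue; // skip blank lines
            }
            int idx = (demarcator == null) ? -1 : trimmed.indexOf(demarcator);
            if (idx < 0) {
                entries.put(trimmed, null);
            } else {
                // Assumption: split on the first occurrence of the demarcator.
                entries.put(trimmed.substring(0, idx),
                            trimmed.substring(idx + demarcator.length()));
            }
        }
        return entries;
    }

    // Returns the attributes that would be added to a flowfile on a hit,
    // or an empty map when the value is not in the dictionary.
    static Map<String, String> scan(Map<String, String> dictionary, String attributeValue) {
        Map<String, String> added = new LinkedHashMap<>();
        if (dictionary.containsKey(attributeValue)) {
            added.put("dictionary.hit.term", attributeValue);
            String metadata = dictionary.get(attributeValue);
            if (metadata != null) {
                added.put("dictionary.hit.metadata", metadata);
            }
        }
        return added;
    }

    public static void main(String[] args) {
        Map<String, String> dictionary = parse(List.of(
                "apples:These are red or green",
                "bananas:These are yellow unless you should toss them or make bread"), ":");
        System.out.println(scan(dictionary, "apples"));
        System.out.println(scan(dictionary, "cherries"));
    }
}
```

A hit on 'apples' yields both attributes, while a value that is not in the dictionary yields none. Parsing the same lines with a null demarcator would require the attribute value to equal the full line, and only 'dictionary.hit.term' would be set.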