[ https://issues.apache.org/jira/browse/NIFI-3497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15896333#comment-15896333 ]
ASF GitHub Bot commented on NIFI-3497:
--------------------------------------
Github user joewitt commented on a diff in the pull request:
https://github.com/apache/nifi/pull/1564#discussion_r104315645
--- Diff: nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/ScanAttribute.java ---
@@ -60,13 +64,21 @@
@Tags({"scan", "attributes", "search", "lookup"})
@CapabilityDescription("Scans the specified attributes of FlowFiles, checking to see if any of their values are "
+ "present within the specified dictionary of terms")
+@WritesAttributes({
+ @WritesAttribute(attribute = "dictionary.hit.{n}.attribute", description = "The attribute name that had a value hit on the dictionary file."),
+ @WritesAttribute(attribute = "dictionary.hit.{n}.term", description = "The term that had a hit on the dictionary file."),
+ @WritesAttribute(attribute = "dictionary.hit.{n}.metadata", description = "The metadata returned from the dictionary file associated with the term hit.")
+})
+
+
public class ScanAttribute extends AbstractProcessor {
public static final String MATCH_CRITERIA_ALL = "All Must Match";
public static final String MATCH_CRITERIA_ANY = "At Least 1 Must Match";
public static final PropertyDescriptor MATCHING_CRITERIA = new PropertyDescriptor.Builder()
- .name("Match Criteria")
--- End diff --
Must retain the previously existing name so as not to disturb existing
configurations. Andy's advice for the new property is good to follow going
forward but it is wise to avoid changing old names.
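
The concern above can be illustrated with a small, self-contained sketch. It is a simplified model and not NiFi code: the `Descriptor` class, the `resolve` helper, and the replacement name "match-criteria" are all hypothetical, and the sketch assumes that saved configurations are keyed by the property's name while a separate display name serves only as a label.

```java
import java.util.HashMap;
import java.util.Map;

public class PropertyNameSketch {

    /** Hypothetical stand-in for a property descriptor; not the NiFi class. */
    public static final class Descriptor {
        public final String name;        // key under which a configured value is stored (assumption)
        public final String displayName; // label shown to users; can change without affecting lookups

        public Descriptor(String name, String displayName) {
            this.name = name;
            this.displayName = displayName;
        }
    }

    /** Looks up a saved value by the descriptor's name, as this model assumes a saved flow would. */
    public static String resolve(Map<String, String> savedConfig, Descriptor descriptor) {
        return savedConfig.get(descriptor.name);
    }

    public static void main(String[] args) {
        // A configuration saved before the change, keyed by the old property name.
        Map<String, String> savedConfig = new HashMap<>();
        savedConfig.put("Match Criteria", "All Must Match");

        // Retaining the old name keeps the saved value reachable.
        Descriptor retained = new Descriptor("Match Criteria", "Match Criteria");
        // A hypothetical rename: the label is unchanged, but the lookup key no longer matches.
        Descriptor renamed = new Descriptor("match-criteria", "Match Criteria");

        System.out.println("retained -> " + resolve(savedConfig, retained));
        System.out.println("renamed  -> " + resolve(savedConfig, renamed));
    }
}
```

In this model the renamed descriptor finds no saved value, which is the kind of disturbance to existing configurations the comment warns about.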
> ScanAttribute should support tagging a flowfile with metadata value from the supplied dictionary
> ------------------------------------------------------------------------------------------------
>
> Key: NIFI-3497
> URL: https://issues.apache.org/jira/browse/NIFI-3497
> Project: Apache NiFi
> Issue Type: Improvement
> Reporter: Joseph Witt
> Assignee: Joseph Witt
>
> Today ScanAttribute just looks through the supplied dictionary for a string
> that matches the given attribute value. A hit is a match; anything else is
> 'not found'. However, when a hit occurs it is often quite useful to gather
> additional metadata about that hit. This makes cases like
> enrichment/tagging much easier.
> So, the plan is to have ScanAttribute support a dictionary value demarcator
> that separates the dictionary term from a string response that will be
> added to the flowfile. For instance, a dictionary might have:
> apples:These are red or green
> bananas:These are yellow unless you should toss them or make bread
> Then if a hit occurs on 'apples', the flowfile that contained such an
> attribute would have new attributes such as 'dictionary.hit.term' = 'apples'
> and 'dictionary.hit.metadata' = 'These are red or green'.
> This means downstream processors could extract that metadata and do
> interesting things with it.
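
As a rough illustration of the plan described in the issue, the following minimal Java sketch parses dictionary lines with a demarcator and tags matching attributes. It is not the ScanAttribute implementation: the class and method names are hypothetical, the hit counter is assumed to start at 1, and the attribute keys follow the `dictionary.hit.{n}.*` pattern from the diff rather than anything confirmed about the final code.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class DictionaryHitSketch {

    /**
     * Parses "term<demarcator>metadata" lines into a term-to-metadata map.
     * Lines without the demarcator are kept as terms with empty metadata (an assumption).
     */
    public static Map<String, String> parseDictionary(List<String> lines, String demarcator) {
        Map<String, String> dictionary = new LinkedHashMap<>();
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue; // skip blank lines
            }
            int idx = trimmed.indexOf(demarcator);
            if (idx < 0) {
                dictionary.put(trimmed, "");
            } else {
                // Split on the first demarcator only, so metadata may contain it.
                dictionary.put(trimmed.substring(0, idx),
                        trimmed.substring(idx + demarcator.length()));
            }
        }
        return dictionary;
    }

    /**
     * Returns the new attributes to add for every attribute whose value is a dictionary term.
     * Hits are numbered from 1 in iteration order (an assumption).
     */
    public static Map<String, String> scan(Map<String, String> attributes,
                                           Map<String, String> dictionary) {
        Map<String, String> hits = new LinkedHashMap<>();
        int n = 0;
        for (Map.Entry<String, String> entry : attributes.entrySet()) {
            String metadata = dictionary.get(entry.getValue());
            if (metadata == null) {
                continue; // value is not a dictionary term
            }
            n++;
            hits.put("dictionary.hit." + n + ".attribute", entry.getKey());
            hits.put("dictionary.hit." + n + ".term", entry.getValue());
            hits.put("dictionary.hit." + n + ".metadata", metadata);
        }
        return hits;
    }

    public static void main(String[] args) {
        Map<String, String> dictionary = parseDictionary(Arrays.asList(
                "apples:These are red or green",
                "bananas:These are yellow unless you should toss them or make bread"), ":");

        // Hypothetical flowfile attributes; only "fruit" holds a dictionary term.
        Map<String, String> attributes = new LinkedHashMap<>();
        attributes.put("fruit", "apples");
        attributes.put("color", "blue");

        scan(attributes, dictionary).forEach((k, v) -> System.out.println(k + " = " + v));
    }
}
```

A downstream step could then read the `.metadata` attribute for enrichment, which is the use case the issue describes.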