[ https://issues.apache.org/jira/browse/NIFI-3497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15896379#comment-15896379 ]
ASF GitHub Bot commented on NIFI-3497:
--------------------------------------

Github user joewitt commented on a diff in the pull request:

    https://github.com/apache/nifi/pull/1564#discussion_r104316712

    --- Diff: nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/ScanAttribute.java ---
    @@ -60,13 +64,21 @@
     @Tags({"scan", "attributes", "search", "lookup"})
     @CapabilityDescription("Scans the specified attributes of FlowFiles, checking to see if any of their values are "
             + "present within the specified dictionary of terms")
    +@WritesAttributes({
    +    @WritesAttribute(attribute = "dictionary.hit.{n}.attribute", description = "The attribute name that had a value hit on the dictionary file."),
    --- End diff --

    Also should specify whether the counter starts at 0 or 1 for the first hit.


> ScanAttribute should support tagging a flowfile with metadata value from the supplied dictionary
> ------------------------------------------------------------------------------------------------
>
>                 Key: NIFI-3497
>                 URL: https://issues.apache.org/jira/browse/NIFI-3497
>             Project: Apache NiFi
>          Issue Type: Improvement
>            Reporter: Joseph Witt
>            Assignee: Joseph Witt
>
> Today ScanAttribute just looks through the supplied dictionary and given
> object for a string matching hit. If it hits then it is a match, otherwise it
> is a 'not found'. However, when a hit occurs it can often be quite useful to
> gather additional metadata about that hit. This makes cases like
> enrichment/tagging much easier.
> So, the plan is to have ScanAttribute support a dictionary value demarcator
> which would separate the dictionary term from some string response that will
> be added to the flowfile.
> For instance a dictionary might have
>   apples:These are red or green
>   bananas:These are yellow unless you should toss them or make bread
> Then if a hit occurs on 'apples' the flowfile that contained such an
> attribute would have a new attribute such as 'dictionary.hit.term' = 'apples'
> and 'dictionary.hit.metadata' = 'These are red or green'.
> This means downstream processors could extract that metadata and do
> interesting things with it.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
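
For illustration, the behaviour proposed in the issue description can be sketched in plain Java. This is not the code from the pull request and does not use the NiFi API. The class name DictionaryHitSketch, the methods parseDictionary and scan, and the ':' demarcator are hypothetical. The 'dictionary.hit.{n}.term' and 'dictionary.hit.{n}.metadata' names are an assumed combination of the '{n}' pattern in the diff with the names in the description, and the counter is assumed to start at 1, which is exactly the point the review comment asks to have documented.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-alone sketch; none of these names come from NiFi itself.
final class DictionaryHitSketch {

    // Parses "term<demarcator>metadata" lines into a term -> metadata map.
    // A line without the demarcator is treated as a term with empty metadata.
    static Map<String, String> parseDictionary(List<String> lines, String demarcator) {
        Map<String, String> dictionary = new LinkedHashMap<>();
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue; // skip blank lines
            }
            int idx = trimmed.indexOf(demarcator);
            if (idx < 0) {
                dictionary.put(trimmed, "");
            } else {
                // Split on the first demarcator only, so metadata may contain it.
                dictionary.put(trimmed.substring(0, idx),
                        trimmed.substring(idx + demarcator.length()));
            }
        }
        return dictionary;
    }

    // Returns the attributes that would be added for each attribute value
    // that exactly matches a dictionary term.
    static Map<String, String> scan(Map<String, String> flowFileAttributes,
                                    Map<String, String> dictionary) {
        Map<String, String> added = new LinkedHashMap<>();
        int hit = 1; // assumption: the first hit is numbered 1, not 0
        for (Map.Entry<String, String> attribute : flowFileAttributes.entrySet()) {
            String metadata = dictionary.get(attribute.getValue());
            if (metadata == null) {
                continue; // value is not a dictionary term
            }
            String prefix = "dictionary.hit." + hit + ".";
            added.put(prefix + "attribute", attribute.getKey());
            added.put(prefix + "term", attribute.getValue());
            added.put(prefix + "metadata", metadata);
            hit++;
        }
        return added;
    }

    public static void main(String[] args) {
        Map<String, String> dictionary = parseDictionary(Arrays.asList(
                "apples:These are red or green",
                "bananas:These are yellow unless you should toss them or make bread"), ":");

        Map<String, String> attributes = new LinkedHashMap<>();
        attributes.put("fruit", "apples");
        attributes.put("color", "purple");

        // Print every attribute the sketch would add to the flowfile.
        scan(attributes, dictionary).forEach((k, v) -> System.out.println(k + " = " + v));
    }
}
```

Splitting on the first occurrence of the demarcator keeps metadata strings that themselves contain ':' intact. Whether the real processor does the same is not stated in the issue.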