[ https://issues.apache.org/jira/browse/NIFI-3497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15896376#comment-15896376 ]

ASF GitHub Bot commented on NIFI-3497:
--------------------------------------

Github user joewitt commented on a diff in the pull request:

    https://github.com/apache/nifi/pull/1564#discussion_r104316533
  
    --- Diff: nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/ScanAttribute.java ---
    @@ -60,13 +64,21 @@
     @Tags({"scan", "attributes", "search", "lookup"})
     @CapabilityDescription("Scans the specified attributes of FlowFiles, checking to see if any of their values are "
             + "present within the specified dictionary of terms")
    +@WritesAttributes({
    +        @WritesAttribute(attribute = "dictionary.hit.{n}.attribute", description = "The attribute name that had a value hit on the dictionary file."),
    --- End diff --
    
    Sorry for not thinking of this earlier, but I think we should use the
    'dictionary.hit.attribute.{n}' form for the three attributes we add. Putting
    the index last makes it easier for downstream users to select the
    attributes of interest, because they can say things like 'if any attribute
    whose name starts with dictionary.hit.attribute has a value of foo'.
    
    If you agree, please change that.
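
The prefix-based selection described in this comment can be illustrated with a small standalone sketch. It is plain Java, not NiFi code: the class name DictionaryHitNaming, the method anyWithPrefixHasValue, and the sample attribute values are hypothetical, and only the 'dictionary.hit.attribute.{n}' naming comes from the comment.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only; not part of the NiFi codebase.
final class DictionaryHitNaming {

    // Returns true if any attribute whose name starts with the given prefix
    // carries exactly the given value.
    static boolean anyWithPrefixHasValue(Map<String, String> attributes, String prefix, String value) {
        return attributes.entrySet().stream()
                .anyMatch(e -> e.getKey().startsWith(prefix) && value.equals(e.getValue()));
    }

    public static void main(String[] args) {
        // Assumed sample attributes using the index-last naming from the comment.
        Map<String, String> attributes = new LinkedHashMap<>();
        attributes.put("dictionary.hit.attribute.1", "foo");
        attributes.put("dictionary.hit.attribute.2", "bar");
        attributes.put("filename", "foo"); // unrelated attribute, must not match

        // With the index at the end, one fixed prefix selects every hit attribute.
        System.out.println(anyWithPrefixHasValue(attributes, "dictionary.hit.attribute.", "foo"));
    }
}
```

With the index in the middle ('dictionary.hit.{n}.attribute'), the same selection would need a pattern match rather than a simple prefix test.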


> ScanAttribute should support tagging a flowfile with metadata value from the 
> supplied dictionary
> ------------------------------------------------------------------------------------------------
>
>                 Key: NIFI-3497
>                 URL: https://issues.apache.org/jira/browse/NIFI-3497
>             Project: Apache NiFi
>          Issue Type: Improvement
>            Reporter: Joseph Witt
>            Assignee: Joseph Witt
>
> Today ScanAttribute just checks the given object against the supplied 
> dictionary for a matching string.  If it hits then it is a match; otherwise 
> it is a 'not found'.  However, when a hit occurs it is often quite useful to 
> gather additional metadata about that hit.  This makes cases like 
> enrichment/tagging much easier.
> So, the plan is to have ScanAttribute support a dictionary value demarcator, 
> which would separate the dictionary term from a string response that will 
> be added to the flowfile.  For instance, a dictionary might have
> apples:These are red or green
> bananas:These are yellow unless you should toss them or make bread
> Then if a hit occurs on 'apples', the flowfile that contained such an 
> attribute would have new attributes such as 'dictionary.hit.term' = 'apples' 
> and 'dictionary.hit.metadata' = 'These are red or green'.
> This means downstream processors could extract that metadata and do 
> interesting things with it.
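
A minimal sketch of the dictionary format proposed in the issue above, in standalone Java. It is not the ScanAttribute implementation: the class DictionarySketch and its methods are hypothetical, and splitting on the first occurrence of the demarcator is an assumption. The attribute names 'dictionary.hit.term' and 'dictionary.hit.metadata' are the ones given in the issue text.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not the actual ScanAttribute code.
final class DictionarySketch {

    // Parses "term<demarcator>metadata" lines into a term -> metadata map.
    // Assumption: the line is split on the first occurrence of the demarcator;
    // a line without a demarcator is a term with empty metadata.
    static Map<String, String> parse(List<String> lines, String demarcator) {
        Map<String, String> dictionary = new LinkedHashMap<>();
        for (String line : lines) {
            if (line.trim().isEmpty()) {
                continue; // skip blank lines
            }
            int idx = line.indexOf(demarcator);
            if (idx < 0) {
                dictionary.put(line.trim(), "");
            } else {
                String term = line.substring(0, idx).trim();
                String metadata = line.substring(idx + demarcator.length()).trim();
                dictionary.put(term, metadata);
            }
        }
        return dictionary;
    }

    // Returns the attributes to add when the value hits the dictionary,
    // or an empty map when there is no hit.
    static Map<String, String> hitAttributes(Map<String, String> dictionary, String value) {
        Map<String, String> added = new LinkedHashMap<>();
        if (dictionary.containsKey(value)) {
            added.put("dictionary.hit.term", value);
            added.put("dictionary.hit.metadata", dictionary.get(value));
        }
        return added;
    }

    public static void main(String[] args) {
        Map<String, String> dictionary = parse(Arrays.asList(
                "apples:These are red or green",
                "bananas:These are yellow unless you should toss them or make bread"), ":");
        System.out.println(hitAttributes(dictionary, "apples"));
    }
}
```

A downstream consumer could then read 'dictionary.hit.metadata' from the flowfile's attributes without consulting the dictionary again.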



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
