[ 
https://issues.apache.org/jira/browse/NIFI-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14579336#comment-14579336
 ] 

Ricky Saltzer commented on NIFI-665:
------------------------------------

[[email protected]] - 

When building custom processors, it's often useful to store derived data in a 
flow file's attributes, as that value could prove useful (or even necessary) 
downstream. You might use a derived attribute to write the file to a specific 
location, or to route the file down a different downstream path. Attributes 
give you more granular control over a flow file's ingestion path.

That being said, if you want to perform some simple analysis of a file in order 
to derive these attributes, you either have to hope you can do it with 
ExtractText (which does not fit all use cases), or you have to create a custom 
processor. Creating a custom processor adds a lot of development and 
administrative overhead when you need to do something trivial that could be 
done with a Python or shell script. 
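
To make "something trivial" concrete, here is a minimal sketch of the kind of 
analysis I have in mind. It is only an illustration: the function name, the 
attribute names, and the assumption that the customer file is CSV with a header 
row are all hypothetical and not part of the patch.

```python
import csv
import io


def derive_attributes(content: str) -> dict:
    """Derive a few string-valued attributes from CSV text.

    The attribute names below are made up for illustration only.
    """
    rows = list(csv.reader(io.StringIO(content)))
    header = rows[0] if rows else []
    records = rows[1:]
    return {
        "customer.record.count": str(len(records)),
        "customer.column.count": str(len(header)),
        "customer.has.email": str("email" in header).lower(),
    }


# Hypothetical sample input standing in for a flow file's content.
sample = "id,name,email\n1,Ann,ann@example.com\n2,Bob,bob@example.com\n"
for key, value in derive_attributes(sample).items():
    print(f"{key}={value}")
```

A dozen lines like this are much cheaper to write and maintain than a full Java 
processor, which is the overhead I'd like to avoid.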

This patch introduces the ability for ExecuteProcess and ExecuteStreamCommand 
to create and update attributes on the flow file without the need to create 
custom processors in Java. 
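
As a rough illustration of how a file-based handoff could work, the sketch 
below has the script write key=value lines to a file and the caller read them 
back. The key=value format, the function names, and the temporary file location 
are assumptions made for this example; they are not a description of what the 
attached patch actually does.

```python
import os
import tempfile


def write_attributes(path, attributes):
    """Script side: write one key=value pair per line (assumed format)."""
    with open(path, "w", encoding="utf-8") as f:
        for key, value in attributes.items():
            f.write(f"{key}={value}\n")


def read_attributes(path):
    """Caller side: parse the key=value lines back into a dict."""
    attributes = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.rstrip("\n")
            if "=" not in line:
                continue  # skip blank or malformed lines
            # Split on the first '=' only, so values may contain '='.
            key, value = line.split("=", 1)
            attributes[key] = value
    return attributes


# Round trip through a temporary file (hypothetical attribute names).
fd, path = tempfile.mkstemp(suffix=".attrs")
os.close(fd)
try:
    write_attributes(path, {"customer.region": "emea", "query": "a=b"})
    print(read_attributes(path))
finally:
    os.remove(path)
```

The point is only that one script run can hand back any number of attributes 
at once.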

I think your idea and this patch would complement each other. Your script 
might sometimes want to set more than one attribute, which would then require 
an unnecessary amount of processor chaining (e.g. derive 20 values from a 
customer file). On the other hand, some use cases would benefit from chaining: 
for example, an attribute you derive from processor C could be required to 
derive an attribute from processor E. 

Hope this helps make things clearer.

Ricky 

> Design method for attaching attributes using 
> ExecuteProcess/ExecuteStreamCommand
> --------------------------------------------------------------------------------
>
>                 Key: NIFI-665
>                 URL: https://issues.apache.org/jira/browse/NIFI-665
>             Project: Apache NiFi
>          Issue Type: Improvement
>            Reporter: Ricky Saltzer
>            Assignee: Ricky Saltzer
>         Attachments: NIFI-665.file-based.patch, sample-script.sh
>
>
> Currently, the ExecuteProcess and ExecuteStreamCommand processors can only 
> consume and produce new flow files. Although they can access attributes 
> attached to the flow files through the use of environment variables, they 
> have no way to update or attach new attributes. 
> I think it would be really useful if there were a generic way for us to 
> attach flow file attributes using these two processors. Since environment 
> variables created or updated by the executed process are only visible to 
> that process, we unfortunately cannot use them for this. 
> The purpose of this JIRA is to provide a forum for how we could potentially 
> solve this. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
