[ 
https://issues.apache.org/jira/browse/NIFI-583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14527045#comment-14527045
 ] 

Mark Payne commented on NIFI-583:
---------------------------------

[~rickysaltzer], sorry, I made some typos in my previous answer, so it probably 
made no sense :)

You are correct that ExecuteProcess does not look for incoming FlowFiles. 
The framework doesn't even tell the processor whether or not it has incoming 
connections. Generally, this is the pattern that we follow for a Source 
Processor.

The intent is for ExecuteProcess to be used as a source processor and 
ExecuteStreamCommand to be used as a non-source processor. My suggestion is to 
modify the ExecuteStreamCommand processor so that we can tell it not to stream 
the FlowFile content to the process's STDIN.

This approach is backward compatible, and I think running a process without 
needing the content of the FlowFile is a pretty reasonable thing to do. Maybe 
you just need the attributes, or just want to trigger some command every time 
a FlowFile comes in. It also avoids adding a new relationship that only makes 
sense in a few use cases.
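
To make the idea concrete, here is a rough sketch in plain Java. It is not the 
actual ExecuteStreamCommand code: the class name, the run method, and the 
ignoreStdin flag are all made up for illustration, and it assumes small inputs 
and outputs so that writing everything before reading cannot fill a pipe.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.List;

// Hypothetical helper, not part of NiFi: runs a command and optionally
// skips streaming the incoming content to the process's STDIN.
final class StdinOptionalCommand {

    static String run(List<String> command, byte[] content, boolean ignoreStdin)
            throws IOException, InterruptedException {
        ProcessBuilder builder = new ProcessBuilder(command);
        builder.redirectErrorStream(true); // merge STDERR into STDOUT
        Process process = builder.start();

        // getOutputStream() is the pipe connected to the child's STDIN.
        try (OutputStream stdin = process.getOutputStream()) {
            if (!ignoreStdin) {
                stdin.write(content);
            }
        } // closing the stream signals EOF to the child either way

        // Assumes the output is small enough to read in one go.
        byte[] output = process.getInputStream().readAllBytes();
        process.waitFor();
        return new String(output, StandardCharsets.UTF_8);
    }
}
```

With the flag off, the behavior is what you get today (content goes to STDIN). 
With it on, the incoming content only acts as a trigger, which is the case 
where the command cannot or should not read STDIN.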

Would that suggestion meet your use case?

> Allow ExecuteProcess to consume an incoming flowfile
> ----------------------------------------------------
>
>                 Key: NIFI-583
>                 URL: https://issues.apache.org/jira/browse/NIFI-583
>             Project: Apache NiFi
>          Issue Type: Improvement
>            Reporter: Ricky Saltzer
>
> In some cases it would be really nice to allow a FlowFile to trigger an OS 
> action. For instance, after a daily dump of data is written to an Impala 
> table in HDFS, I would like to execute a refresh on the table via the shell. 
> As it stands, the ExecuteProcess processor will allow a FlowFile in a 
> connection to trigger execution, but unless your connection has an expiration 
> set, the FlowFile will stay there indefinitely. The main issue here is that 
> it will continue to re-execute your ExecuteProcess processor over and over. 
> As far as I know, there are only two clear ways around this. (1) - you can use 
> the ExecuteStreamCommand, instead, but *only* if that command can properly 
> handle STDIN. (2) - you can set your ExecuteProcess processor to execute on a 
> schedule (e.g. 1 per minute) and expire the FlowFile before it can re-execute 
> (e.g. 10 seconds). 
> It would be useful if the ExecuteProcess processor consumed the FlowFile, and 
> passed it through a "passthrough" relationship of some kind. A second option 
> would be to make it configurable (false by default) to drop the FlowFile, or 
> to pass it through a second relationship, that way it doesn't break anyone's 
> current pipelines. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
