[
https://issues.apache.org/jira/browse/NIFI-1856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15770528#comment-15770528
]
ASF GitHub Bot commented on NIFI-1856:
--------------------------------------
Github user rkarthik29 commented on the issue:
https://github.com/apache/nifi/pull/1355
Added a property to ExecuteStreamCommand, "Redirect Error Output", with
allowable values [log, output stream, error stream]; log is the default.
When set to log, the error stream is captured and its output is logged to
the NiFi app log at the WARN level. The errors are also available on the
original FlowFile as an attribute, which matches the current behavior.
When set to output stream, redirectErrorStream is set to true, so all
errors go to the output stream and are available via the output stream
relationship.
When set to error stream, the error output is forwarded to an error
FlowFile, which is available via the new error stream relationship.
Also added a Process.destroyForcibly() call on the processor's
unscheduled event.
Checked ExecuteProcess: it already captures the error stream, so no
change was needed.
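The three modes above can be illustrated with a short sketch. NiFi itself uses Java's ProcessBuilder, where redirectErrorStream(true) merges stderr into stdout; this is not NiFi code, just a Python analogue of the same idea, with stderr=subprocess.STDOUT playing the role of the "output stream" mode and separate capture playing the role of the "log" and "error stream" modes.

```python
import subprocess
import sys

# A tiny child process that writes one line to each stream.
script = 'import sys; sys.stdout.write("out\\n"); sys.stderr.write("err\\n")'

# "output stream" mode: stderr is redirected into stdout, so both lines
# arrive on the single output stream (like redirectErrorStream(true)).
merged = subprocess.run(
    [sys.executable, "-c", script],
    stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True,
)

# "log" / "error stream" modes: stderr is captured separately, so it can be
# logged at WARN level or routed to its own relationship.
split = subprocess.run(
    [sys.executable, "-c", script],
    capture_output=True, text=True,
)
```

Either way the child's stderr is consumed, which is what prevents the hang described in the issue below.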
> ExecuteStreamCommand Needs to Consume Standard Error
> ----------------------------------------------------
>
> Key: NIFI-1856
> URL: https://issues.apache.org/jira/browse/NIFI-1856
> Project: Apache NiFi
> Issue Type: Bug
> Reporter: Alan Jackoway
> Assignee: Karthik Narayanan
>
> I was using ExecuteStreamCommand to run certain HDFS commands that are tricky
> to write in NiFi but easy in bash (e.g. {{hadoop fs -rm -r
> /data/*/2014/05/05}}).
> However, my larger commands kept hanging even though when I run them from the
> command line they finish quickly.
> Based on
> http://www.javaworld.com/article/2071275/core-java/when-runtime-exec---won-t.html
> I believe that ExecuteStreamCommand and possibly other processors need to
> consume the standard error stream to prevent the processes from blocking when
> standard error gets filled.
> To reproduce. Create this as ~/write.py
> {code:python}
> import sys
> count = int(sys.argv[1])
> for x in range(count):
>     sys.stderr.write("ERROR %d\n" % x)
>     sys.stdout.write("OUTPUT %d\n" % x)
> {code}
> Create a flow that goes:
> # GenerateFlowFile - 5 minute schedule, 0 bytes file size
> # ExecuteStreamCommand - Command Arguments: /Users/alanj/write.py;100,
> Command Path: python
> # PutFile - /tmp/write/
> routing the output stream relationship of ExecuteStreamCommand to PutFile.
> When you turn everything on, you get 100 lines (not 200) of just the standard
> output in /tmp/write.
> Next, change the command arguments to /Users/alanj/write.py;100000 and turn
> everything on again. The command will hang.
> I believe that whenever you execute a process the way ExecuteStreamCommand
> does, you need to consume the standard error stream to keep it from
> blocking. This may also affect processors such as ExecuteProcess and
> ExecuteScript.
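The blocking mechanism described above can be demonstrated directly. A child process that writes more to stderr than the OS pipe buffer holds (commonly around 64 KB) blocks as soon as the buffer fills if the parent never reads that pipe. Draining stderr on a separate thread, the usual "stream gobbler" fix for Runtime.exec in Java, keeps the child moving. This is a minimal Python sketch of that pattern, not NiFi code; the 50000-line count is just large enough to overflow a typical pipe buffer.

```python
import subprocess
import sys
import threading

# Child that interleaves writes to stderr and stdout, like write.py above.
child = '''
import sys
for i in range(50000):
    sys.stderr.write("ERROR %d\\n" % i)
    sys.stdout.write("OUTPUT %d\\n" % i)
'''

proc = subprocess.Popen(
    [sys.executable, "-c", child],
    stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True,
)

# Drain stderr on a background thread so the child never blocks on a full
# stderr pipe while the parent is busy reading stdout.
stderr_lines = []

def drain():
    for line in proc.stderr:
        stderr_lines.append(line)

t = threading.Thread(target=drain)
t.start()
stdout_lines = proc.stdout.readlines()  # safe: stderr is drained concurrently
t.join()
proc.wait()
```

Reading stdout to completion without the drain thread is exactly the hang seen in the reproduction: the child stalls writing stderr, so it never finishes writing stdout either.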
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)