Hello everyone,

I have a case of running a third-party CLI (Linux) with the following behaviour:

- It should be executed upon a FlowFile whose attributes/content contain the parameters for the CLI.
- It accepts parameters via flags or environment variables.
- It writes its output to stdout as a stream of JSON objects.
- The output might be huge (millions and millions of objects), which means caching stdout is not an option.
- Each line/object should be sent as a separate FlowFile.
- Errors/logs are written to stderr (which might be very chatty).
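
For illustration, the streaming behaviour described above can be sketched in plain Python, outside NiFi. This is only a minimal sketch under assumptions: `stream_lines` is a hypothetical helper name, not a NiFi API, and the actual CLI is replaced by a tiny stand-in command. It reads stdout one line at a time without caching the whole output, and drains stderr on a separate thread so a chatty tool cannot block on a full pipe. In a NiFi component, each yielded line would become its own FlowFile.

```python
import subprocess
import sys
import threading


def stream_lines(cmd, env=None):
    """Yield each non-empty stdout line of cmd as soon as it is produced.

    Hypothetical helper for illustration only; not part of NiFi.
    Parameters can be passed as flags in cmd or as variables in env.
    """
    proc = subprocess.Popen(
        cmd,
        env=env,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        text=True,
        bufsize=1,  # line-buffered on our side of the pipe
    )

    # Drain stderr concurrently; otherwise a chatty process could fill
    # the stderr pipe and stall while we are still reading stdout.
    def drain_stderr():
        for err_line in proc.stderr:
            sys.stderr.write(err_line)

    drainer = threading.Thread(target=drain_stderr, daemon=True)
    drainer.start()

    # Iterating over the pipe returns lines as they arrive, so the full
    # output is never held in memory.
    for line in proc.stdout:
        line = line.rstrip("\n")
        if line:  # skip blank lines
            yield line

    drainer.join()
    returncode = proc.wait()
    if returncode != 0:
        raise subprocess.CalledProcessError(returncode, cmd)


if __name__ == "__main__":
    # Stand-in for the real CLI: prints three JSON objects, one per line.
    demo = [
        sys.executable,
        "-c",
        "import json\nfor i in range(3):\n    print(json.dumps({'n': i}))",
    ]
    for obj in stream_lines(demo):
        print(obj)  # in NiFi, each of these would be one FlowFile
```

Note that the exit status is only known after the last line has been emitted, so a consumer of such a stream has to decide what to do with lines already sent if the process later fails.
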

Using ExecuteProcess is not an option (it cannot be triggered by an incoming FlowFile), but the way it treats stdout is what is desired. Using ExecuteStreamCommand is not an option either, as it buffers the output until the binary exits with status code 0.

Does anybody know if there’s a hybrid component somewhere out there? ;-)

Thank you in advance!

P.S. I’ve tried to write a wrapper script in Python using the ExecuteScript processor, but:

- it looks like rather an overkill (JVM -> Jython -> Python -> system process -> …)
- scripting for NiFi does not provide a pleasant debugging experience
- I get weird random errors when moving the flow from machine to machine, even between exact copies of VMs (like the example below).

> Caused by: javax.script.ScriptException: AttributeError: type object 'java.lang.Thread' has no attribute 'State' in <script> at line number 1
>     at org.python.jsr223.PyScriptEngine.scriptException(PyScriptEngine.java:222)
>     at org.python.jsr223.PyScriptEngine.eval(PyScriptEngine.java:59)
>     at org.python.jsr223.PyScriptEngine.eval(PyScriptEngine.java:31)
>     at javax.script.AbstractScriptEngine.eval(AbstractScriptEngine.java:264)
>     at org.apache.nifi.script.impl.JythonScriptEngineConfigurator.eval(JythonScriptEngineConfigurator.java:59)
>     at org.apache.nifi.processors.script.ExecuteScript.onTrigger(ExecuteScript.java:220)

Kind regards,
Alexander