Paul,
The first thing I would mention (which helps explain the rest) is that
NiFi uses the Jython [1] engine rather than standard Python (CPython).
So although you cannot use native CPython libraries like Pandas, you
can import and use Java classes, and Java is what the NiFi API is
written in. So when you say "Python functions available for the
ExecuteScript API", what that really means is that your Jython code
can call Java methods on Java objects, such as outputStream.write() or
session.transfer().
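
To make that concrete, here is a minimal sketch (not a definitive
implementation) of a script body in which every call on the session
and the flow file is an ordinary Java method call made from Jython. It
assumes the variables ExecuteScript binds into the script (session,
REL_SUCCESS, REL_FAILURE). The function name tag_and_route, the
attribute name 'greeting', and the "empty content goes to failure"
rule are made up for illustration. REL_FAILURE is the complement of
REL_SUCCESS that you asked about.

```python
def tag_and_route(session, rel_success, rel_failure):
    """Hypothetical helper: fetch one flow file, tag it, and route it.

    Returns the relationship used, or None if no flow file was queued.
    """
    flow_file = session.get()              # ProcessSession.get()
    if flow_file is None:
        return None

    # Illustrative rule only: treat empty content as a failure.
    if flow_file.getSize() == 0:           # FlowFile.getSize()
        session.transfer(flow_file, rel_failure)
        return rel_failure

    # putAttribute() returns the updated FlowFile reference, so keep it.
    flow_file = session.putAttribute(flow_file, 'greeting', 'hello')
    session.transfer(flow_file, rel_success)
    return rel_success


# ExecuteScript binds these names as script variables. Outside NiFi they
# are undefined, so the call is skipped.
try:
    bound = (session, REL_SUCCESS, REL_FAILURE)
except NameError:
    bound = None

if bound is not None:
    tag_and_route(*bound)
```

Because the helper only calls methods on whatever objects it is given,
you can also exercise it outside NiFi with small stand-in objects.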
A good place to start to understand the NiFi API is the Developer
Guide [2]. It covers the concepts you may run into in your script,
and far more than you are likely to need. To narrow it down, start
with the ProcessSession API [3]. This shows the methods you can call
on the session, such as write(), which writes to a flow file using a
StreamCallback [4]. The StreamCallback interface [5] in turn declares
the process() method, which the session calls for you during write().
ProcessSession also has the transfer() and putAttribute() methods,
which are very often used in scripts. There are other places where
you might interact with the API, such as FlowFile [6].
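
As a rough sketch of how write() and a StreamCallback fit together
(under the assumptions noted here, not the only way to do it): the
class below implements process(), and NiFi invokes it while
session.write() runs. The imports are the same Java classes from your
own script. The shout() function is a hypothetical transformation I
made up for the example, and the try/except around the imports exists
only so the file still loads outside NiFi.

```python
try:
    # Java classes: these imports only succeed under NiFi's Jython engine.
    from org.apache.commons.io import IOUtils
    from java.nio.charset import StandardCharsets
    from org.apache.nifi.processor.io import StreamCallback
except ImportError:
    IOUtils = StandardCharsets = None
    StreamCallback = object    # stand-in so the sketch loads outside NiFi


def shout(text):
    """Hypothetical transformation: upper-case the flow file content."""
    return text.upper()


class PyStreamCallback(StreamCallback):
    """session.write() calls process() with the content streams."""

    def process(self, inputStream, outputStream):
        # Read the Java InputStream into a string.
        text = IOUtils.toString(inputStream, StandardCharsets.UTF_8)
        # The bytearray is handed to Java's OutputStream.write(byte[]).
        outputStream.write(bytearray(shout(text).encode('utf-8')))


# ExecuteScript binds these names as script variables. Outside NiFi they
# are undefined, so this part is skipped.
try:
    bound = (session, REL_SUCCESS)
except NameError:
    bound = None

if bound is not None:
    nifi_session, rel_success = bound
    flow_file = nifi_session.get()
    if flow_file is not None:
        # write() returns the updated FlowFile reference.
        flow_file = nifi_session.write(flow_file, PyStreamCallback())
        nifi_session.transfer(flow_file, rel_success)
```

Keeping the transformation in its own function means the text handling
can be checked on its own, separately from the NiFi plumbing.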
I have a few examples on my blog [7] of using Jython in ExecuteScript,
such as converting JSON into a different JSON structure [8]. I tried
to explain the usage of the NiFi API there, though I admit it could be
more in-depth. In addition, I plan to write a good amount of
documentation (including language-specific examples) under NIFI-1954
[9]. I am also the author of the ExecuteScript Cookbook series you
mentioned. Please feel free to respond with any additional questions
you may have; I am happy to help.
Regards,
Matt
[1] https://www.jython.org/
[2] https://nifi.apache.org/docs/nifi-docs/html/developer-guide.html
[3] https://javadoc.io/static/org.apache.nifi/nifi-api/1.16.1/org/apache/nifi/processor/ProcessSession.html
[4] https://javadoc.io/static/org.apache.nifi/nifi-api/1.16.1/org/apache/nifi/processor/ProcessSession.html#write-org.apache.nifi.flowfile.FlowFile-org.apache.nifi.processor.io.StreamCallback-
[5] https://javadoc.io/static/org.apache.nifi/nifi-api/1.16.1/org/apache/nifi/processor/io/StreamCallback.html
[6] https://javadoc.io/static/org.apache.nifi/nifi-api/1.16.1/org/apache/nifi/flowfile/FlowFile.html
[7] https://funnifi.blogspot.com/
[8] https://funnifi.blogspot.com/2016/03/executescript-json-to-json-revisited_14.html
[9] https://issues.apache.org/jira/browse/NIFI-1954
On Wed, May 4, 2022 at 1:07 PM Markiewicz, Paul wrote:
>
> Hello, my name is PaulM.
>
> I am very new at both NiFi and relatively new at Python.
>
>
> I have successfully cobbled together a NiFi ExecuteScript processor
> (complete with embedded Python Code).
> That python code CAN/DOES read the original Flow-File content, sets/updates
> a NiFi attribute of my choosing,
> repopulates the Flow-File content with some junk (io.environ) data (and some
> other Hello-World lines..).
> Then successfully, returns (from the Python execution), the new attribute
> and new Flow-File content to
> the NiFi ExecuteScript.
>
>
> I had this success by reading a PART-1/PART-2 example (of Python) found
> within a cloudera link..
> (((
> https://community.cloudera.com/t5/Community-Articles/ExecuteScript-Cookbook-part-1/ta-p/248922
> ))).
>
>
> What I am hoping to get from this developer mailing list eMail request (((
> dev@nifi.apache.org )))
> is where/how to locate DOCUMENTATION for what is available within the
> **PYTHON SIDE** of these library calls.
>
>
> For example, my Python uses these IMPORT STATEMENTS:
>
>
> import os
> import datetime
>
> from org.python.core.util.FileUtil import wrap
> from org.apache.nifi.processors.script import ExecuteScript
> from org.apache.commons.io import IOUtils
> from java.nio.charset import StandardCharsets
> from org.apache.nifi.processor.io import StreamCallback
>
>
> ...and from the example code (which does work...) I am especially
> interested in
> the PYTHON documentation for "ExecuteScript" so that I can also know things
> like the complement
> FAILURE variable name, to this SUCCESS:
>
> session.transfer(flow_file, ExecuteScript.REL_SUCCESS)
>
>
> ... and learn all the other Python functions available for the ExecuteScript
> API.
>
>
> Hopefully that documentation would ALSO point out important concepts like
> inside my "class PyStreamCallback(StreamCallback):" the execution flow will
> automagically call "process"
> within that class... and all the various (correct) data types that are
> expected within the Python execution.
>
> ... and that the
> "outputStream.write(bytearray(newFlowContentStr.encode('utf-8')))"
> expects a parameter of type bytearray, to the write() function.
>
>
>
> I thank you in advance.
>
> Sincerely,
> PaulM
>
>