Mark Payne created NIFI-11240:
---------------------------------

             Summary: Introduce Python API for building Processors
                 Key: NIFI-11240
                 URL: https://issues.apache.org/jira/browse/NIFI-11240
             Project: Apache NiFi
          Issue Type: Epic
          Components: Core Framework, Documentation & Website, Extensions
            Reporter: Mark Payne
            Assignee: Mark Payne


The scripting processors are very commonly used for data transformation in 
NiFi. In particular, the Jython-based scripts are quite heavily used. However, 
Jython runs on the JVM and does not support CPython libraries. As a result, it 
is syntax-compatible with Python but cannot make use of the wealth of Python 
libraries, and that wealth of libraries is what makes Python popular to begin 
with.

Additionally, heavy use of script-based processors hurts the UX. They are 
cumbersome to configure, with script files and/or script bodies. They result in 
a dataflow that's difficult to understand because, instead of nicely named 
processors like CompressContent, the type and default name are "ExecuteScript". 
They're also difficult to share.

I have been experimenting with Py4J to introduce a true Python-based API for 
developing Processors. This will require new APIs, framework changes, and 
documentation, and it will likely take a while to stabilize. However, the 
sooner we are able to get it into the hands of users, the better. 
Therefore, I propose that we introduce it in multiple milestones. We can create 
sub-tickets for the different milestones, but in general it should follow:

Milestone 1: Initial implementation. Provides the capability and an API for 
building processors. Includes sample code and some documentation. Includes 
tests to ensure proper operation. Should not be used in production. API will 
not be stable and may change frequently. Performance may be subpar. Get into 
the hands of developers to begin exploring and providing feedback / submitting 
PRs.

Milestone 2: Bug fixes. API refinement. Improve performance.

Milestone 3: Additional bug fixes and API refinement. API should become more 
stable.

Milestone 4: Additional bug fixes. API becomes stable. Documentation is clear 
and sufficient. Recommend production use.
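
To make the goal concrete, here is a rough sketch of what a named processor 
written in plain Python might look like. This is purely illustrative and is not 
the API this epic will deliver: the names TransformResult, UppercaseContent, 
and transform, and the "success"/"failure" relationships, are all hypothetical 
stand-ins, and nothing here touches NiFi or Py4J.

```python
from dataclasses import dataclass, field


@dataclass
class TransformResult:
    """Hypothetical result type: the relationship to route to, plus the
    FlowFile's new contents and attributes."""
    relationship: str
    contents: bytes
    attributes: dict = field(default_factory=dict)


class UppercaseContent:
    """Hypothetical processor. The class name stands in for the processor
    type a user would see, rather than a generic "ExecuteScript"."""

    def transform(self, contents: bytes, attributes: dict) -> TransformResult:
        try:
            text = contents.decode("utf-8")
        except UnicodeDecodeError:
            # Content is not valid UTF-8: route it unchanged to "failure".
            return TransformResult("failure", contents, dict(attributes))
        # Copy the attributes so the caller's dict is not mutated.
        new_attributes = dict(attributes)
        new_attributes["uppercased"] = "true"
        return TransformResult("success", text.upper().encode("utf-8"), new_attributes)


# Stand-alone usage, with a bytes payload standing in for FlowFile content.
result = UppercaseContent().transform(b"hello nifi", {"filename": "a.txt"})
print(result.relationship, result.contents)
```

The point of the sketch is only that the transformation logic lives in an 
ordinary, descriptively named Python class that could be shared as a file, 
rather than in a script body pasted into a generic processor.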

--
This message was sent by Atlassian Jira
(v8.20.10#820010)