[jira] [Commented] (NIFI-11240) Introduce Python API for building Processors

2023-11-15 Thread Denis Jakupovic (Jira)


[ 
https://issues.apache.org/jira/browse/NIFI-11240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17786236#comment-17786236
 ] 

Denis Jakupovic commented on NIFI-11240:


Has Milestone 4 been reached, making the Python API production ready, or are 
we now at Milestone 1 with the NiFi 2.0.0-M1 release?

> Introduce Python API for building Processors
> 
>
> Key: NIFI-11240
> URL: https://issues.apache.org/jira/browse/NIFI-11240
> Project: Apache NiFi
> Issue Type: Epic
> Components: Core Framework, Documentation & Website, Extensions
> Reporter: Mark Payne
> Assignee: Mark Payne
> Priority: Major
> Fix For: 2.0.0-M1
>
>
> The scripting processors are very common for data transformation in NiFi. In 
> particular, the Jython-based scripts are quite heavily used. However, Jython 
> runs on the JVM and does not support CPython libraries. As a result, it is 
> syntax compatible but cannot make use of the wealth of Python libraries, and 
> that wealth of libraries is what makes Python popular to begin with.
>
> Additionally, use of many script-based processors hurts the UX. They are 
> cumbersome to configure, with script files and/or script bodies. They result 
> in a dataflow that's difficult to understand because, instead of nicely named 
> processors like CompressContent, the type and default name are 
> "ExecuteScript." They're also difficult to share.
>
> I have been playing with Py4J to introduce a true Python-based API for 
> developing Processors. This will introduce new APIs, new framework changes, 
> and documentation. And this will likely take a while to stabilize. However, 
> the sooner that we are able to land it into the hands of users, the better. 
> Therefore, I propose that we introduce it in multiple milestones. We can create 
> sub-tickets for different milestones, but in general it should follow:
>
> Milestone 1: Initial implementation. Provides the capability and an API for 
> building processors. Includes sample code and some documentation. Includes 
> tests to ensure proper operation. Should not be used in production. API will 
> not be stable and may change frequently. Performance may be subpar. Get into 
> the hands of developers to begin exploring and providing feedback / 
> submitting PRs.
>
> Milestone 2: Bug fixes. API refinement. Improve performance.
>
> Milestone 3: Additional bug fixes and API refinement. API should become more 
> stable.
>
> Milestone 4: Additional bug fixes. API becomes stable. Documentation is clear 
> and sufficient. Recommend production use.
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (NIFI-11240) Introduce Python API for building Processors

2023-11-07 Thread Clement Law (Jira)


[ 
https://issues.apache.org/jira/browse/NIFI-11240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17783567#comment-17783567
 ] 

Clement Law commented on NIFI-11240:


Question: do we have any thoughts about how to monitor the resource usage of 
the Python servers? As it stands, I believe we can easily monitor things in the 
JVM, but these "headless" Python servers will be outside that world. If the 
resource usage of the Python processors can be bubbled up to the UI in the 
conventional way, that would make this option a lot more appetising.
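
As a rough illustration of the kind of figures that would need to be surfaced, the sketch below has a Python process report its own CPU time and peak memory using only the standard library's `resource` module (Unix only). The `usage_snapshot` helper and its dictionary keys are hypothetical names chosen for this example; how NiFi would actually collect or display such numbers is not specified in this thread.

```python
import os
import resource
import sys


def usage_snapshot():
    """Return CPU time and peak resident memory for the current process.

    Hypothetical helper for illustration only; not part of any NiFi API.
    """
    usage = resource.getrusage(resource.RUSAGE_SELF)
    # ru_maxrss is reported in kilobytes on Linux and in bytes on macOS.
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return {
        "pid": os.getpid(),
        "cpu_user_seconds": usage.ru_utime,
        "cpu_system_seconds": usage.ru_stime,
        "peak_rss_mib": usage.ru_maxrss / divisor,
    }


if __name__ == "__main__":
    for key, value in usage_snapshot().items():
        print(f"{key}: {value}")
```

A process outside the JVM could expose a snapshot like this over whatever channel it already uses to talk to the framework; that transport is deliberately left out here.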

> Introduce Python API for building Processors
> 
>
> Key: NIFI-11240
> URL: https://issues.apache.org/jira/browse/NIFI-11240
> Project: Apache NiFi
> Issue Type: Epic
> Components: Core Framework, Documentation & Website, Extensions
> Reporter: Mark Payne
> Assignee: Mark Payne
> Priority: Major
> Fix For: 2.0.0
>
>





[jira] [Commented] (NIFI-11240) Introduce Python API for building Processors

2023-11-05 Thread Clement Law (Jira)


[ 
https://issues.apache.org/jira/browse/NIFI-11240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17783125#comment-17783125
 ] 

Clement Law commented on NIFI-11240:


Please bear with me as an outside observer looking in: will these changes make 
it into the main branch for release at some point? I see that the PRs have been 
accepted, but closed with merge. From reading the ticket description, it seems 
we're missing a few milestones before this ticket should be closed (there is no 
recommendation for release to prod, at least).

Is the idea that people test this initial implementation out for a while first?

> Introduce Python API for building Processors
> 
>
> Key: NIFI-11240
> URL: https://issues.apache.org/jira/browse/NIFI-11240
> Project: Apache NiFi
> Issue Type: Epic
> Components: Core Framework, Documentation & Website, Extensions
> Reporter: Mark Payne
> Assignee: Mark Payne
> Priority: Major
> Fix For: 2.0.0
>
>





[jira] [Commented] (NIFI-11240) Introduce Python API for building Processors

2023-10-04 Thread Janis Ax (Jira)


[ 
https://issues.apache.org/jira/browse/NIFI-11240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17771832#comment-17771832
 ] 

Janis Ax commented on NIFI-11240:

[~markap14] I suggest adding some examples to the documentation so that new users 
can get some ideas and orientation. You already provided some 
[examples|https://drive.google.com/drive/folders/1VCtNQmThAHL44-t2ORdav9YPIHMvCk_b]; 
maybe we can reuse them.

I think the following topics are interesting:
 * Logging
 * Working with content
 * Working with attributes
 * Working with record-oriented data
 * Processor as package / module
 * Working with properties
 * Relationships
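
To make several of these topics concrete (logging, content, attributes, properties, relationships), here is a minimal, self-contained sketch of what such a documentation example might look like. Everything in it is an assumption made for illustration: `FlowFile`, `Result`, and `UppercaseContent` are hypothetical stand-ins written for this example and are not the classes of the NiFi Python API, whose actual names and signatures are not given in this thread.

```python
import logging
from dataclasses import dataclass, field

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("UppercaseContent")


@dataclass
class FlowFile:
    """Hypothetical stand-in for a flow file: raw content plus attributes."""
    content: bytes
    attributes: dict = field(default_factory=dict)


@dataclass
class Result:
    """Hypothetical stand-in for a processor result."""
    relationship: str
    content: bytes
    attributes: dict


class UppercaseContent:
    """Toy processor that upper-cases text content (illustrative only)."""

    # Relationships the processor can route to.
    relationships = ("success", "failure")

    def __init__(self, encoding="utf-8"):
        # A single configurable property.
        self.encoding = encoding

    def transform(self, flowfile):
        try:
            text = flowfile.content.decode(self.encoding)
        except UnicodeDecodeError:
            logger.warning("Could not decode content as %s", self.encoding)
            return Result("failure", flowfile.content, dict(flowfile.attributes))
        # Copy the incoming attributes and add one describing the input.
        attributes = {**flowfile.attributes, "original.length": str(len(text))}
        logger.info("Transformed %d characters", len(text))
        return Result("success", text.upper().encode(self.encoding), attributes)


if __name__ == "__main__":
    print(UppercaseContent().transform(FlowFile(b"hello", {"filename": "a.txt"})))
```

Record-oriented data and packaging are left out to keep the sketch short; each would likely merit its own example.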

 

> Introduce Python API for building Processors
> 
>
> Key: NIFI-11240
> URL: https://issues.apache.org/jira/browse/NIFI-11240
> Project: Apache NiFi
> Issue Type: Epic
> Components: Core Framework, Documentation & Website, Extensions
> Reporter: Mark Payne
> Assignee: Mark Payne
> Priority: Major
> Fix For: 2.0.0
>
>





[jira] [Commented] (NIFI-11240) Introduce Python API for building Processors

2023-03-02 Thread Otto Fowler (Jira)


[ 
https://issues.apache.org/jira/browse/NIFI-11240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17695866#comment-17695866
 ] 

Otto Fowler commented on NIFI-11240:


We should include a cookie cutter template for creating projects.
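
One way to picture such a template, as a sketch only: the snippet below renders a processor skeleton into a new project directory using the standard library's `string.Template`. The `create_project` function, the directory layout, and the generated class body are all hypothetical choices for this example, not an existing NiFi tool or the layout the project would necessarily adopt.

```python
import tempfile
from pathlib import Path
from string import Template

# Hypothetical skeleton; the class layout is illustrative, not the NiFi API.
PROCESSOR_TEMPLATE = Template('''\
class $class_name:
    """$description"""

    def transform(self, flowfile):
        raise NotImplementedError
''')


def create_project(root, class_name, description):
    """Create <root>/<class_name lowercased>/<class_name>.py from the template."""
    project_dir = Path(root) / class_name.lower()
    # Fail rather than overwrite an existing project.
    project_dir.mkdir(parents=True, exist_ok=False)
    source = PROCESSOR_TEMPLATE.substitute(
        class_name=class_name, description=description
    )
    target = project_dir / f"{class_name}.py"
    target.write_text(source, encoding="utf-8")
    return target


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = create_project(tmp, "MyProcessor", "Example processor.")
        print(path.read_text(encoding="utf-8"))
```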


> Introduce Python API for building Processors
> 
>
> Key: NIFI-11240
> URL: https://issues.apache.org/jira/browse/NIFI-11240
> Project: Apache NiFi
> Issue Type: Epic
> Components: Core Framework, Documentation & Website, Extensions
> Reporter: Mark Payne
> Assignee: Mark Payne
> Priority: Major
> Fix For: 2.0.0
>
>


