[GitHub] [nifi] markap14 commented on pull request #7003: NIFI-11241: Initial implementation of Python-based Processor API with…

2023-04-13 Thread via GitHub


markap14 commented on PR #7003:
URL: https://github.com/apache/nifi/pull/7003#issuecomment-1507012164

   Thanks for the thorough review @exceptionfactory . I've pushed a couple new 
commits that should address the concerns raised (except for the few that I 
responded to directly). Or I created Jiras to address them. For example, 
https://issues.apache.org/jira/browse/NIFI-11448 to speed up the adding of a 
Python Processor to the canvas and 
https://issues.apache.org/jira/browse/NIFI-11446 for better handling of the 
case when a Python process dies.
   
   As for the dependency management - I am ok with a follow-on activity that 
allows inclusion of a `requirements.txt` but I definitely don't think we should 
eliminate the ability to define the requirements within the Processor itself. 
This is actually not an unheard-of practice. I actually implemented it that way 
based on what Apache Airflow allows for: 
https://airflow.apache.org/docs/apache-airflow/stable/howto/operator/python.html
 - this is an example of using their @task.virtualenv decorator:
   ```
   @task.virtualenv(
   task_id="virtualenv_python", requirements=["colorama==0.4.0"], 
system_site_packages=False
   )
   ```
   I think this makes great sense. While in some cases a `requirements.txt` 
could perhaps be more appropriate, providing this inline makes great sense for 
a large number of use cases. For example, for a user who develops a custom 
Processor. If they want to share that Processor with others in their 
organization, they should able to just send their python file. It should not 
require some external requirements.txt file - doing so means they'd have to zip 
up a directory. And then the person receiving it would have to unzip the file, 
would have to know where to unzip it, what directories are included in the zip, 
etc. It's very doable but makes everything far more complicated.
   
   Otherwise, I think the newest commits and noted Jiras cover all of your 
feedback.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [nifi] markap14 commented on pull request #7003: NIFI-11241: Initial implementation of Python-based Processor API with…

2023-03-22 Thread via GitHub


markap14 commented on PR #7003:
URL: https://github.com/apache/nifi/pull/7003#issuecomment-1479640307

   Pushed a commit that I think addresses all of your mentioned concerns thus 
far  @exceptionfactory  and rebased against main in order to resolve conflict 
in StandardProcessScheduler. Force-pushed due to the rebase.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [nifi] markap14 commented on pull request #7003: NIFI-11241: Initial implementation of Python-based Processor API with…

2023-03-20 Thread via GitHub


markap14 commented on PR #7003:
URL: https://github.com/apache/nifi/pull/7003#issuecomment-1476971571

   @dam4rus regarding the creation of the environment / importing of the 
libraries - the environment is created before importing the libraries. The 
Environment creation happens in `PythonProcess.java` in the 
`setupEnvironment()` method. It then runs `Controller.py` from the 
`launchPythonProcess` method.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [nifi] markap14 commented on pull request #7003: NIFI-11241: Initial implementation of Python-based Processor API with…

2023-03-20 Thread via GitHub


markap14 commented on PR #7003:
URL: https://github.com/apache/nifi/pull/7003#issuecomment-1476969983

   Thanks @dam4rus for the thorough review. As you might have guessed, I am not 
a Python expert. A lot of the suggestions that you make probably make a lot of 
sense. But they are largely not something that we want to do in this PR. 
Specifically, the idea of this PR is to make something available so that others 
who are more familiar with Python can start to iterate on it, and users can 
begin to experiment with the API.
   I aimed to make it very clear that this is not production ready through the 
documentation, etc. But we don't want to iterate on all the minor findings that 
occur before introducing this. By bringing it into the 2.0 codebase early I'm 
trying to get others who are more experienced in the Python side excited and 
iterate quickly to get to a production-ready capability.
   So I think most of your comments make sense but should be done in subsequent 
follow-on PRs.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org