[GitHub] [nifi] markap14 commented on pull request #7003: NIFI-11241: Initial implementation of Python-based Processor API with…
markap14 commented on PR #7003: URL: https://github.com/apache/nifi/pull/7003#issuecomment-1507012164 Thanks for the thorough review @exceptionfactory . I've pushed a couple new commits that should address the concerns raised (except for the few that I responded to directly). Or I created Jiras to address them. For example, https://issues.apache.org/jira/browse/NIFI-11448 to speed up the adding of a Python Processor to the canvas and https://issues.apache.org/jira/browse/NIFI-11446 for better handling of the case when a Python process dies. As for the dependency management - I am ok with a follow-on activity that allows inclusion of a `requirements.txt` but I definitely don't think we should eliminate the ability to define the requirements within the Processor itself. This is actually not an unheard-of practice. I actually implemented it that way based on what Apache Airflow allows for: https://airflow.apache.org/docs/apache-airflow/stable/howto/operator/python.html - this is an example of using their @task.virtualenv decorator: ``` @task.virtualenv( task_id="virtualenv_python", requirements=["colorama==0.4.0"], system_site_packages=False ) ``` I think this makes great sense. While in some cases a `requirements.txt` could perhaps be more appropriate, providing this inline makes great sense for a large number of use cases. For example, for a user who develops a custom Processor. If they want to share that Processor with others in their organization, they should able to just send their python file. It should not require some external requirements.txt file - doing so means they'd have to zip up a directory. And then the person receiving it would have to unzip the file, would have to know where to unzip it, what directories are included in the zip, etc. It's very doable but makes everything far more complicated. Otherwise, I think the newest commits and noted Jiras cover all of your feedback. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [nifi] markap14 commented on pull request #7003: NIFI-11241: Initial implementation of Python-based Processor API with…
markap14 commented on PR #7003: URL: https://github.com/apache/nifi/pull/7003#issuecomment-1479640307 Pushed a commit that I think addresses all of your mentioned concerns thus far @exceptionfactory and rebased against main in order to resolve conflict in StandardProcessScheduler. Force-pushed due to the rebase. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [nifi] markap14 commented on pull request #7003: NIFI-11241: Initial implementation of Python-based Processor API with…
markap14 commented on PR #7003: URL: https://github.com/apache/nifi/pull/7003#issuecomment-1476971571 @dam4rus regarding the creation of the environment / importing of the libraries - the environment is created before importing the libraries. The Environment creation happens in `PythonProcess.java` in the `setupEnvironment()` method. It then runs `Controller.py` from the `launchPythonProcess` method. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [nifi] markap14 commented on pull request #7003: NIFI-11241: Initial implementation of Python-based Processor API with…
markap14 commented on PR #7003: URL: https://github.com/apache/nifi/pull/7003#issuecomment-1476969983 Thanks @dam4rus for the thorough review. As you might have guessed, I am not a Python expert. A lot of the suggestions that you make probably make a lot of sense. But they are largely not something that we want to do in this PR. Specifically, the idea of this PR is to make something available so that others who are more familiar with Python can start to iterate on it, and users can begin to experiment with the API. I aimed to make it very clear that this is not production ready through the documentation, etc. But we don't want to iterate on all the minor findings that occur before introducing this. By bringing it into the 2.0 codebase early I'm trying to get others who are more experienced in the Python side excited and iterate quickly to get to a production-ready capability. So I think most of your comments make sense but should be done in subsequent follow-on PRs. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org