Re: [PR] MINIFICPP-2276 Support FlowFileTransform NiFi Python processors [nifi-minifi-cpp]
fgerlits closed pull request #1712: MINIFICPP-2276 Support FlowFileTransform NiFi Python processors URL: https://github.com/apache/nifi-minifi-cpp/pull/1712 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] MINIFICPP-2276 Support FlowFileTransform NiFi Python processors [nifi-minifi-cpp]
lordgamez commented on code in PR #1712: URL: https://github.com/apache/nifi-minifi-cpp/pull/1712#discussion_r1507614295 ## extensions/python/ExecutePythonProcessor.cpp: ## @@ -44,8 +45,7 @@ void ExecutePythonProcessor::initialize() { try { loadScript(); - } catch(const std::runtime_error&) { -logger_->log_warn("Could not load python script while initializing. In case of non-native python processor this is normal and will be done in the schedule phase."); + } catch(const std::runtime_error& err) { Review Comment: Updated in 6d382e80a43d2c4de48091dff2e91e7b951edfb1 ## extensions/python/PythonScriptEngine.cpp: ## @@ -180,4 +198,66 @@ void PythonScriptEngine::evaluateModuleImports() { } } +void PythonScriptEngine::initializeProcessorObject(const std::string& python_class_name) { + GlobalInterpreterLock gil; + if (auto python_class = bindings_[python_class_name]) { +auto num_args = [&]() -> size_t { + auto class_init = OwnedObject(PyObject_GetAttrString(python_class->get(), "__init__")); + if (!class_init.get()) { +return 0; + } + + auto inspect_module = OwnedObject(PyImport_ImportModule("inspect")); + if (!inspect_module.get()) { +return 0; + } + + auto inspect_args = OwnedObject(PyObject_CallMethod(inspect_module.get(), "getfullargspec", "O", class_init.get())); + if (!inspect_args.get()) { +return 0; + } + + auto arg_list = OwnedObject(PyObject_GetAttrString(inspect_args.get(), "args")); + if (!arg_list.get()) { +return 0; + } + + return PyList_Size(arg_list.get()); +}(); + +if (num_args > 1) { + auto kwargs = OwnedDict::create(); + auto value = OwnedObject(Py_None); + kwargs.put("jvm", value); + auto args = OwnedObject(PyTuple_New(0)); + processor_instance_ = OwnedObject(PyObject_Call(python_class->get(), args.get(), kwargs.get())); +} else { + processor_instance_ = OwnedObject(PyObject_CallObject(python_class->get(), nullptr)); +} + +if (processor_instance_.get() == nullptr) { + throw PythonScriptException(PyException().what()); +} + +auto result = PyObject_SetAttrString(processor_instance_.get(), "logger", bindings_[std::string("log")]->get()); Review Comment: Good point about the interface, it should definitely be changed. Currently the issue is due to this ambiguity build error: ``` /home/ggyimesi/projects/nifi-minifi-cpp-fork/extensions/python/PythonScriptEngine.cpp:242:88: error: ambiguous overload for ‘operator[]’ (operand types are ‘org::apache::nifi::minifi::extensions::python::OwnedDict’ {aka ‘org::apache::nifi::minifi::extensions::python::Dict’} and ‘const char [4]’) 242 | auto result = PyObject_SetAttrString(processor_instance_.get(), "logger", bindings_["log"]->get()); | ^ /home/ggyimesi/projects/nifi-minifi-cpp-fork/extensions/python/PythonScriptEngine.cpp:242:88: note: candidate: ‘operator[](long int, const char*)’ (built-in) In file included from /home/ggyimesi/projects/nifi-minifi-cpp-fork/extensions/python/PythonBindings.h:24, from /home/ggyimesi/projects/nifi-minifi-cpp-fork/extensions/python/PythonScriptEngine.h:20, from /home/ggyimesi/projects/nifi-minifi-cpp-fork/extensions/python/PythonScriptEngine.cpp:21: /home/ggyimesi/projects/nifi-minifi-cpp-fork/extensions/python/types/Types.h:290:36: note: candidate: ‘std::optional > org::apache::nifi::minifi::extensions::python::Dict::operator[](std::string_view) [with org::apache::nifi::minifi::extensions::python::ReferenceType reference_type = org::apache::nifi::minifi::extensions::python::ReferenceType::OWNED; std::string_view = std::basic_string_view]’ 290 | std::optional operator[](std::string_view key) { |^~~~ /home/ggyimesi/projects/nifi-minifi-cpp-fork/extensions/python/PythonScriptEngine.cpp:246:88: error: ambiguous overload for ‘operator[]’ (operand types are ‘org::apache::nifi::minifi::extensions::python::OwnedDict’ {aka ‘org::apache::nifi::minifi::extensions::python::Dict’} and ‘const char [12]’) ``` It doesn't need to be `std::string` but it still needs to explicitly define the parameter type due to this. I changed it to std::string_view which is the expected type currently in 6d382e80a43d2c4de48091dff2e91e7b951edfb1 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] MINIFICPP-2276 Support FlowFileTransform NiFi Python processors [nifi-minifi-cpp]
szaszm commented on code in PR #1712: URL: https://github.com/apache/nifi-minifi-cpp/pull/1712#discussion_r1505967478 ## extensions/python/ExecutePythonProcessor.cpp: ## @@ -44,8 +45,7 @@ void ExecutePythonProcessor::initialize() { try { loadScript(); - } catch(const std::runtime_error&) { -logger_->log_warn("Could not load python script while initializing. In case of non-native python processor this is normal and will be done in the schedule phase."); + } catch(const std::runtime_error& err) { Review Comment: I'd keep this unnamed as long as it's not referenced. ```suggestion } catch(const std::runtime_error&) { ``` ## extensions/python/PythonScriptEngine.cpp: ## @@ -180,4 +198,66 @@ void PythonScriptEngine::evaluateModuleImports() { } } +void PythonScriptEngine::initializeProcessorObject(const std::string& python_class_name) { + GlobalInterpreterLock gil; + if (auto python_class = bindings_[python_class_name]) { +auto num_args = [&]() -> size_t { + auto class_init = OwnedObject(PyObject_GetAttrString(python_class->get(), "__init__")); + if (!class_init.get()) { +return 0; + } + + auto inspect_module = OwnedObject(PyImport_ImportModule("inspect")); + if (!inspect_module.get()) { +return 0; + } + + auto inspect_args = OwnedObject(PyObject_CallMethod(inspect_module.get(), "getfullargspec", "O", class_init.get())); + if (!inspect_args.get()) { +return 0; + } + + auto arg_list = OwnedObject(PyObject_GetAttrString(inspect_args.get(), "args")); + if (!arg_list.get()) { +return 0; + } + + return PyList_Size(arg_list.get()); +}(); + +if (num_args > 1) { + auto kwargs = OwnedDict::create(); + auto value = OwnedObject(Py_None); + kwargs.put("jvm", value); + auto args = OwnedObject(PyTuple_New(0)); + processor_instance_ = OwnedObject(PyObject_Call(python_class->get(), args.get(), kwargs.get())); +} else { + processor_instance_ = OwnedObject(PyObject_CallObject(python_class->get(), nullptr)); +} + +if (processor_instance_.get() == nullptr) { + throw PythonScriptException(PyException().what()); +} + +auto result = PyObject_SetAttrString(processor_instance_.get(), "logger", bindings_[std::string("log")]->get()); Review Comment: Why are the keys wrapped in `std::string`? String literals are `const char*` and are already null-terminated, so if that's the only reason, I think they should be just raw literals. The interface is confusing and dangerous, taking `std::string_view` when it needs a null-terminated string internally, and we should change that in my opinion, but not in this PR. Created Jira for this, but let me know if I'm missing something about this: [MINIFICPP-2308](https://issues.apache.org/jira/browse/MINIFICPP-2308) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] MINIFICPP-2276 Support FlowFileTransform NiFi Python processors [nifi-minifi-cpp]
lordgamez commented on code in PR #1712: URL: https://github.com/apache/nifi-minifi-cpp/pull/1712#discussion_r1500838899 ## extensions/python/ExecutePythonProcessor.h: ## @@ -97,10 +99,8 @@ class ExecutePythonProcessor : public core::Processor { python_dynamic_ = true; } - void addProperty(const std::string , const std::string , const std::string , bool required, bool el) { -python_properties_.emplace_back( - core::PropertyDefinitionBuilder<>::createProperty(name).withDefaultValue(defaultvalue).withDescription(description).isRequired(required).supportsExpressionLanguage(el).build()); - } + void addProperty(const std::string , const std::string , const std::optional , bool required, bool el, + bool sensitive, const std::optional& property_type_code); const std::vector () const { return python_properties_; Review Comment: Good point, updated in d3f47c57d21829d1715bd794f8c08a462f4fa361 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] MINIFICPP-2276 Support FlowFileTransform NiFi Python processors [nifi-minifi-cpp]
fgerlits commented on code in PR #1712: URL: https://github.com/apache/nifi-minifi-cpp/pull/1712#discussion_r1500803117 ## extensions/python/ExecutePythonProcessor.h: ## @@ -97,10 +99,8 @@ class ExecutePythonProcessor : public core::Processor { python_dynamic_ = true; } - void addProperty(const std::string , const std::string , const std::string , bool required, bool el) { -python_properties_.emplace_back( - core::PropertyDefinitionBuilder<>::createProperty(name).withDefaultValue(defaultvalue).withDescription(description).isRequired(required).supportsExpressionLanguage(el).build()); - } + void addProperty(const std::string , const std::string , const std::optional , bool required, bool el, + bool sensitive, const std::optional& property_type_code); const std::vector () const { return python_properties_; Review Comment: I think we should acquire the mutex here, too, and return a copy. The copy will happen anyway at the calling site in `PythonCreator::registerScriptDescription`; locking the mutex adds some overhead, but it is necessary. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] MINIFICPP-2276 Support FlowFileTransform NiFi Python processors [nifi-minifi-cpp]
lordgamez commented on code in PR #1712: URL: https://github.com/apache/nifi-minifi-cpp/pull/1712#discussion_r1500753007 ## extensions/python/ExecutePythonProcessor.cpp: ## @@ -44,8 +45,7 @@ void ExecutePythonProcessor::initialize() { try { loadScript(); - } catch(const std::runtime_error&) { -logger_->log_warn("Could not load python script while initializing. In case of non-native python processor this is normal and will be done in the schedule phase."); Review Comment: Sure, added jira ticket: https://issues.apache.org/jira/browse/MINIFICPP-2304 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] MINIFICPP-2276 Support FlowFileTransform NiFi Python processors [nifi-minifi-cpp]
fgerlits commented on code in PR #1712: URL: https://github.com/apache/nifi-minifi-cpp/pull/1712#discussion_r1500714476 ## extensions/python/ExecutePythonProcessor.cpp: ## @@ -44,8 +45,7 @@ void ExecutePythonProcessor::initialize() { try { loadScript(); - } catch(const std::runtime_error&) { -logger_->log_warn("Could not load python script while initializing. In case of non-native python processor this is normal and will be done in the schedule phase."); Review Comment: I see, thanks! Can you create a Jira for clearing this up later? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] MINIFICPP-2276 Support FlowFileTransform NiFi Python processors [nifi-minifi-cpp]
lordgamez commented on code in PR #1712: URL: https://github.com/apache/nifi-minifi-cpp/pull/1712#discussion_r1500709305 ## extensions/python/types/Types.h: ## @@ -160,7 +160,18 @@ class Long : public ReferenceHolder { } int64_t asInt64() { -return static_cast(PyLong_AsLongLong(this->ref_.get())); +auto long_value = PyLong_AsLongLong(this->ref_.get()); +if (long_value == -1 && PyErr_Occurred()) { + throw PyException(); +} +return static_cast(long_value); Review Comment: Updated in 6593d9b63514dd2a3b2060b747680f37691abecd ## libminifi/src/core/ConfigurableComponent.cpp: ## @@ -36,19 +36,28 @@ ConfigurableComponent::ConfigurableComponent() ConfigurableComponent::~ConfigurableComponent() = default; +Property* ConfigurableComponent::findProperty(const std::string& name) const { Review Comment: Updated in 6593d9b63514dd2a3b2060b747680f37691abecd -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] MINIFICPP-2276 Support FlowFileTransform NiFi Python processors [nifi-minifi-cpp]
lordgamez commented on code in PR #1712: URL: https://github.com/apache/nifi-minifi-cpp/pull/1712#discussion_r1500708936 ## extensions/python/ExecutePythonProcessor.cpp: ## @@ -146,11 +151,38 @@ std::unique_ptr ExecutePythonProcessor::createScriptEngine() auto engine = std::make_unique(); python_logger_ = core::logging::LoggerFactory::getAliasedLogger(getName()); - engine->initialize(Success, Failure, python_logger_); + engine->initialize(Success, Failure, Original, python_logger_); return engine; } +core::Property* ExecutePythonProcessor::findProperty(const std::string& name) const { + if (auto prop_ptr = core::ConfigurableComponent::findProperty(name)) { +return prop_ptr; + } + + auto it = ranges::find_if(python_properties_, [](const auto& item){ +return item.getName() == name; + }); + if (it != python_properties_.end()) { +return const_cast(&*it); + } + + return nullptr; +} + +std::map ExecutePythonProcessor::getProperties() const { + auto result = ConfigurableComponent::getProperties(); + + std::lock_guard lock(configuration_mutex_); Review Comment: You are right, it should be a separate mutex for the `python_properties_`, it didn't really make sense this way. Updated in 6593d9b63514dd2a3b2060b747680f37691abecd ## extensions/python/PYTHON.md: ## @@ -106,20 +127,39 @@ class VaderSentiment(object): To enable python Processor capabilities, the following options need to be provided in minifi.properties. The directory specified can contain processors. Note that the processor name will be the reference in your flow. Directories are treated like package names. Therefore if the nifi.python.processor.dir is /tmp/ and you have a subdirectory named packagedir with the file name file.py, it will -produce a processor with the name org.apache.nifi.minifi.processors.packagedir.file. Note that each subdirectory will append a package -to the reference class name. +produce a processor with the name org.apache.nifi.minifi.processors.packagedir.file. Note that each subdirectory will append a package +to the reference class name. in minifi.properties #directory where processors exist nifi.python.processor.dir= - - + + ## Processors The python directory (extensions/pythonprocessors) contains implementations that will be available for flows if the required dependencies exist. - -## Sentiment Analysis + +### Sentiment Analysis The SentimentAnalysis processor will perform a Vader Sentiment Analysis. This requires that you install nltk and VaderSentiment pip install nltk pip install VaderSentiment + +## Using NiFi Python Processors + +MiNiFi C++ supports the use of NiFi Python processors, that are inherited from the FlowFileTransform base class. To use these processors, you must copy the Python processor module under the nifi_python_processors directory located under the python directory (by default it can be found at ${minifi_root}/minifi-python/nifi_python_processors). To see how to write NiFi Python processors, please refer to the Python Developer Guide under the [Apache NiFi documentation](https://nifi.apache.org/documentation/v2/). + +In the flow configuration these Python processors can be referenced by their fully qualified class name, which looks like this: org.apache.nifi.minifi.processors.nifi_python_processors... For example, the fully qualified class name of the PromptChatGPT processor implemented in the file nifi_python_processors/PromptChatGPT.py is org.apache.nifi.minifi.processors.nifi_python_processors.PromptChatGPT. If a processor is copied under a subdirectory, because it is part of a python submodule, the submodule name will be appended to the fully qualified class name. For example, if the QueryPinecone processor is implemented in the QueryPinecone.py file that is copied to nifi_python_processors/vectorstores/QueryPinecone.py, the fully qualified class name will be org.apache.nifi.minifi.processors.nifi_python_processors.vectorstores.QueryPinecone in the configuration file. + +**NOTE:** The name of the NiFi Python processor file should match the class name in the file, otherwise the processor will not be found. + +Due to some differences between the NiFi and MiNiFi C++ processors and implementation, there are some limitations using the NiFi Python processors: +- Record based processors are not yet supported in MiNiFi C++, so the NiFi Python processors inherited from RecordTransform are not supported. +- Virtualenv support is not yet available in MiNiFi C++, so all required packaged must be installed on the system. Review Comment: Updated in 6593d9b63514dd2a3b2060b747680f37691abecd ## extensions/python/PYTHON.md: ## @@ -106,20 +127,39 @@ class VaderSentiment(object): To enable python Processor capabilities, the following options need to be provided in minifi.properties. The
Re: [PR] MINIFICPP-2276 Support FlowFileTransform NiFi Python processors [nifi-minifi-cpp]
lordgamez commented on code in PR #1712: URL: https://github.com/apache/nifi-minifi-cpp/pull/1712#discussion_r1500664052 ## extensions/python/types/PyProcessContext.cpp: ## @@ -65,12 +66,31 @@ PyObject* PyProcessContext::getProperty(PyProcessContext* self, PyObject* args) return nullptr; } - const char* property; - if (!PyArg_ParseTuple(args, "s", )) { + const char* property_name = nullptr; + PyObject* script_flow_file = nullptr; + if (!PyArg_ParseTuple(args, "s|O", _name, _flow_file)) { throw PyException(); } + std::string value; - context->getProperty(property, value); + if (!script_flow_file) { +if (!context->getProperty(property_name, value)) { + Py_RETURN_NONE; +} + } else { +auto py_flow = reinterpret_cast(script_flow_file); +const auto flow_file = py_flow->script_flow_file_.lock(); +if (!flow_file) { + PyErr_SetString(PyExc_AttributeError, "tried reading FlowFile outside 'on_trigger'"); + return nullptr; Review Comment: Py_RETURN_NONE returns a Python object with the Python `None` value which can be a valid return value in some cases, while the returning `nullptr` indicates an error, and python throws an exception if nullptr is returned from the C API with the error set in the `PyErr_SetString`. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] MINIFICPP-2276 Support FlowFileTransform NiFi Python processors [nifi-minifi-cpp]
lordgamez commented on code in PR #1712: URL: https://github.com/apache/nifi-minifi-cpp/pull/1712#discussion_r1500637729 ## extensions/python/PythonScriptEngine.cpp: ## @@ -180,4 +198,66 @@ void PythonScriptEngine::evaluateModuleImports() { } } +void PythonScriptEngine::initializeProcessorObject(const std::string& python_class_name) { + GlobalInterpreterLock gil; + if (auto python_class = bindings_[python_class_name]) { +auto num_args = [&]() -> size_t { + auto class_init = OwnedObject(PyObject_GetAttrString(python_class->get(), "__init__")); + if (!class_init.get()) { +return 0; + } + + auto inspect_module = OwnedObject(PyImport_ImportModule("inspect")); + if (!inspect_module.get()) { +return 0; + } + + auto inspect_args = OwnedObject(PyObject_CallMethod(inspect_module.get(), "getfullargspec", "O", class_init.get())); + if (!inspect_args.get()) { +return 0; + } + + auto arg_list = OwnedObject(PyObject_GetAttrString(inspect_args.get(), "args")); + if (!arg_list.get()) { +return 0; + } + + return PyList_Size(arg_list.get()); +}(); + +if (num_args > 1) { + auto kwargs = OwnedDict::create(); + auto value = OwnedObject(Py_None); + kwargs.put("jvm", value); + auto args = OwnedObject(PyTuple_New(0)); Review Comment: In NiFi there are scenarios when the `jvm` object is passed to the constructor, either as an arg or as a named kwarg and we only expect this one optional argument. We only want to pass this as part of the kwargs argument of the constructor call of the Python processor object, but before that we need to pass the positional args as well. As we do not want to pass any positional args, the python C API expects us to pass a zero length tuple which is a `PyTuple_New(0)` object to the `PyObject_Call` function in this case. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] MINIFICPP-2276 Support FlowFileTransform NiFi Python processors [nifi-minifi-cpp]
lordgamez commented on code in PR #1712: URL: https://github.com/apache/nifi-minifi-cpp/pull/1712#discussion_r1500602870 ## extensions/python/ExecutePythonProcessor.cpp: ## @@ -44,8 +45,7 @@ void ExecutePythonProcessor::initialize() { try { loadScript(); - } catch(const std::runtime_error&) { -logger_->log_warn("Could not load python script while initializing. In case of non-native python processor this is normal and will be done in the schedule phase."); Review Comment: Unfortunately the python processors are instantiated twice. We have to instantiate the processors defined in the flow, but also we instantiate all python processors when building the manifest to get the description and the processor properties. In the latter scenario the initialize is called twice, once before setting the script file in the `PythonObjectFactory.h` (to initialize the supported properties), and in that case the loadScript() will always fail, because the ScriptFile property value is not set yet. As this will always happen the log is useless here. This double initialization is a bit wonky and should be worked out sometime in the future. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] MINIFICPP-2276 Support FlowFileTransform NiFi Python processors [nifi-minifi-cpp]
fgerlits commented on code in PR #1712: URL: https://github.com/apache/nifi-minifi-cpp/pull/1712#discussion_r1497594578 ## extensions/python/PYTHON.md: ## @@ -106,20 +127,39 @@ class VaderSentiment(object): To enable python Processor capabilities, the following options need to be provided in minifi.properties. The directory specified can contain processors. Note that the processor name will be the reference in your flow. Directories are treated like package names. Therefore if the nifi.python.processor.dir is /tmp/ and you have a subdirectory named packagedir with the file name file.py, it will -produce a processor with the name org.apache.nifi.minifi.processors.packagedir.file. Note that each subdirectory will append a package -to the reference class name. +produce a processor with the name org.apache.nifi.minifi.processors.packagedir.file. Note that each subdirectory will append a package +to the reference class name. in minifi.properties #directory where processors exist nifi.python.processor.dir= - - + + ## Processors The python directory (extensions/pythonprocessors) contains implementations that will be available for flows if the required dependencies exist. - -## Sentiment Analysis + +### Sentiment Analysis The SentimentAnalysis processor will perform a Vader Sentiment Analysis. This requires that you install nltk and VaderSentiment pip install nltk pip install VaderSentiment + +## Using NiFi Python Processors + +MiNiFi C++ supports the use of NiFi Python processors, that are inherited from the FlowFileTransform base class. To use these processors, you must copy the Python processor module under the nifi_python_processors directory located under the python directory (by default it can be found at ${minifi_root}/minifi-python/nifi_python_processors). To see how to write NiFi Python processors, please refer to the Python Developer Guide under the [Apache NiFi documentation](https://nifi.apache.org/documentation/v2/). + +In the flow configuration these Python processors can be referenced by their fully qualified class name, which looks like this: org.apache.nifi.minifi.processors.nifi_python_processors... For example, the fully qualified class name of the PromptChatGPT processor implemented in the file nifi_python_processors/PromptChatGPT.py is org.apache.nifi.minifi.processors.nifi_python_processors.PromptChatGPT. If a processor is copied under a subdirectory, because it is part of a python submodule, the submodule name will be appended to the fully qualified class name. For example, if the QueryPinecone processor is implemented in the QueryPinecone.py file that is copied to nifi_python_processors/vectorstores/QueryPinecone.py, the fully qualified class name will be org.apache.nifi.minifi.processors.nifi_python_processors.vectorstores.QueryPinecone in the configuration file. + +**NOTE:** The name of the NiFi Python processor file should match the class name in the file, otherwise the processor will not be found. + +Due to some differences between the NiFi and MiNiFi C++ processors and implementation, there are some limitations using the NiFi Python processors: +- Record based processors are not yet supported in MiNiFi C++, so the NiFi Python processors inherited from RecordTransform are not supported. +- Virtualenv support is not yet available in MiNiFi C++, so all required packaged must be installed on the system. Review Comment: ```suggestion - Virtualenv support is not yet available in MiNiFi C++, so all required packages must be installed on the system. ``` ## extensions/python/ExecutePythonProcessor.cpp: ## @@ -44,8 +45,7 @@ void ExecutePythonProcessor::initialize() { try { loadScript(); - } catch(const std::runtime_error&) { -logger_->log_warn("Could not load python script while initializing. In case of non-native python processor this is normal and will be done in the schedule phase."); Review Comment: Is this log not useful at all? Maybe we could change it to a lower level (info or debug) instead of removing it. ## extensions/python/types/Types.h: ## @@ -160,7 +160,18 @@ class Long : public ReferenceHolder { } int64_t asInt64() { -return static_cast(PyLong_AsLongLong(this->ref_.get())); +auto long_value = PyLong_AsLongLong(this->ref_.get()); +if (long_value == -1 && PyErr_Occurred()) { + throw PyException(); +} +return static_cast(long_value); Review Comment: I may be overly paranoid, but I'd prefer `gsl::narrow` instead of `static_cast`. ## extensions/python/PYTHON.md: ## @@ -106,20 +127,39 @@ class VaderSentiment(object): To enable python Processor capabilities, the following options need to be provided in minifi.properties. The directory specified can contain processors. Note that the processor name will be the reference in
Re: [PR] MINIFICPP-2276 Support FlowFileTransform NiFi Python processors [nifi-minifi-cpp]
szaszm commented on code in PR #1712: URL: https://github.com/apache/nifi-minifi-cpp/pull/1712#discussion_r1486436629 ## extensions/python/types/PyProcessContext.cpp: ## @@ -65,12 +66,31 @@ PyObject* PyProcessContext::getProperty(PyProcessContext* self, PyObject* args) return nullptr; } - const char* property; - if (!PyArg_ParseTuple(args, "s", )) { + const char* property_name = nullptr; + PyObject* script_flow_file = nullptr; + if (!PyArg_ParseTuple(args, "s|O", _name, _flow_file)) { throw PyException(); } + std::string value; - context->getProperty(property, value); + if (!script_flow_file) { +if (!context->getProperty(property_name, value)) { + Py_RETURN_NONE; +} + } else { +auto py_flow = reinterpret_cast(script_flow_file); +const auto flow_file = py_flow->script_flow_file_.lock(); +if (!flow_file) { + PyErr_SetString(PyExc_AttributeError, "tried reading FlowFile outside 'on_trigger'"); + return nullptr; +} Review Comment: No, I was only thinking about raw non-owning pointers instead of weak_ptr, but it looks like that's not feasible for now. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] MINIFICPP-2276 Support FlowFileTransform NiFi Python processors [nifi-minifi-cpp]
lordgamez commented on code in PR #1712: URL: https://github.com/apache/nifi-minifi-cpp/pull/1712#discussion_r1483114670 ## extensions/python/types/PyProcessContext.cpp: ## @@ -65,12 +66,31 @@ PyObject* PyProcessContext::getProperty(PyProcessContext* self, PyObject* args) return nullptr; } - const char* property; - if (!PyArg_ParseTuple(args, "s", )) { + const char* property_name = nullptr; + PyObject* script_flow_file = nullptr; + if (!PyArg_ParseTuple(args, "s|O", _name, _flow_file)) { throw PyException(); } + std::string value; - context->getProperty(property, value); + if (!script_flow_file) { +if (!context->getProperty(property_name, value)) { + Py_RETURN_NONE; +} + } else { +auto py_flow = reinterpret_cast(script_flow_file); +const auto flow_file = py_flow->script_flow_file_.lock(); +if (!flow_file) { + PyErr_SetString(PyExc_AttributeError, "tried reading FlowFile outside 'on_trigger'"); + return nullptr; +} Review Comment: Currently the held type of is `std::weak_ptr` in both `PyScriptFlowFile` and `PyScriptFlowFile`, where this is the way the held objects are handled in both files. We can refactor this in the future, but for the time being I think we should stick with using the same handling here as well, or is there anything else you meant by leaving the door open? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] MINIFICPP-2276 Support FlowFileTransform NiFi Python processors [nifi-minifi-cpp]
lordgamez commented on code in PR #1712: URL: https://github.com/apache/nifi-minifi-cpp/pull/1712#discussion_r1483114670 ## extensions/python/types/PyProcessContext.cpp: ## @@ -65,12 +66,31 @@ PyObject* PyProcessContext::getProperty(PyProcessContext* self, PyObject* args) return nullptr; } - const char* property; - if (!PyArg_ParseTuple(args, "s", )) { + const char* property_name = nullptr; + PyObject* script_flow_file = nullptr; + if (!PyArg_ParseTuple(args, "s|O", _name, _flow_file)) { throw PyException(); } + std::string value; - context->getProperty(property, value); + if (!script_flow_file) { +if (!context->getProperty(property_name, value)) { + Py_RETURN_NONE; +} + } else { +auto py_flow = reinterpret_cast(script_flow_file); +const auto flow_file = py_flow->script_flow_file_.lock(); +if (!flow_file) { + PyErr_SetString(PyExc_AttributeError, "tried reading FlowFile outside 'on_trigger'"); + return nullptr; +} Review Comment: Currently the held type of is `std::weak_ptr` in both `PyScriptFlowFile` and `PyScriptFlowFile`, where this is the way the held objects are handled in both files. We can refactor this in the future, but for the time being I think we should stick with using the same handling here as well. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] MINIFICPP-2276 Support FlowFileTransform NiFi Python processors [nifi-minifi-cpp]
lordgamez commented on code in PR #1712: URL: https://github.com/apache/nifi-minifi-cpp/pull/1712#discussion_r1483106461 ## docker/test/integration/resources/minifi-c2-server/config-ssl.yml: ## @@ -0,0 +1,43 @@ +MiNiFi Config Version: 3 +Flow Controller: + name: MiNiFi Flow +Processors: +- name: Get files from /tmp/input + id: 2f2a3b47-f5ba-49f6-82b5-bc1c86b96e27 + class: org.apache.nifi.minifi.processors.GetFile + scheduling strategy: TIMER_DRIVEN + scheduling period: 1000 ms + Properties: +Input Directory: /tmp/input +- name: Put files to /tmp/output + id: e143601d-de4f-44ba-a6ec-d1f97d77ec94 + class: org.apache.nifi.minifi.processors.PutFile + scheduling strategy: EVENT_DRIVEN + auto-terminated relationships list: + - failure + - success + Properties: +Conflict Resolution Strategy: fail +Create Missing Directories: 'true' +Directory: /tmp/output +Connections: +- name: GetFile/success/PutFile + id: 098a56ba-f4bf-4323-a3f3-6f8a5e3586bf + source id: 2f2a3b47-f5ba-49f6-82b5-bc1c86b96e27 + source relationship names: + - success + destination id: e143601d-de4f-44ba-a6ec-d1f97d77ec94 +Controller Services: + - name: SSLContextService +id: 2438e3c8-015a-1000-79ca-83af40ec1994 +class: SSLContextService Review Comment: This is only used for C2, and this is only the config that is sent from the C2 server after the update. I actually just reverted the json files to the previously used yamls files for these tests and didn't check them. There are separate tests for using the controller service and using the ssl properties in the minifi.properties file. But we can actually remove this file, as we only check if the flow update succeeds and we don't care about the C2 communication afterwards so we can use the same config without the SSL context. Updated in 3d207dca7c2cdf1f86f4b6308a8c02addb2a852d -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] MINIFICPP-2276 Support FlowFileTransform NiFi Python processors [nifi-minifi-cpp]
szaszm commented on code in PR #1712: URL: https://github.com/apache/nifi-minifi-cpp/pull/1712#discussion_r1479706920 ## docker/test/integration/resources/minifi-c2-server/config-ssl.yml: ## @@ -0,0 +1,43 @@ +MiNiFi Config Version: 3 +Flow Controller: + name: MiNiFi Flow +Processors: +- name: Get files from /tmp/input + id: 2f2a3b47-f5ba-49f6-82b5-bc1c86b96e27 + class: org.apache.nifi.minifi.processors.GetFile + scheduling strategy: TIMER_DRIVEN + scheduling period: 1000 ms + Properties: +Input Directory: /tmp/input +- name: Put files to /tmp/output + id: e143601d-de4f-44ba-a6ec-d1f97d77ec94 + class: org.apache.nifi.minifi.processors.PutFile + scheduling strategy: EVENT_DRIVEN + auto-terminated relationships list: + - failure + - success + Properties: +Conflict Resolution Strategy: fail +Create Missing Directories: 'true' +Directory: /tmp/output +Connections: +- name: GetFile/success/PutFile + id: 098a56ba-f4bf-4323-a3f3-6f8a5e3586bf + source id: 2f2a3b47-f5ba-49f6-82b5-bc1c86b96e27 + source relationship names: + - success + destination id: e143601d-de4f-44ba-a6ec-d1f97d77ec94 +Controller Services: + - name: SSLContextService +id: 2438e3c8-015a-1000-79ca-83af40ec1994 +class: SSLContextService Review Comment: Is this used for C2? Because the flow doesn't use it. If it's C2, I think it would be nicer to include a reference to C2 in its name, or configure SSL in the config properties in minifi.properties. ## extensions/python/types/PyProcessContext.cpp: ## @@ -65,12 +66,31 @@ PyObject* PyProcessContext::getProperty(PyProcessContext* self, PyObject* args) return nullptr; } - const char* property; - if (!PyArg_ParseTuple(args, "s", )) { + const char* property_name = nullptr; + PyObject* script_flow_file = nullptr; + if (!PyArg_ParseTuple(args, "s|O", _name, _flow_file)) { throw PyException(); } + std::string value; - context->getProperty(property, value); + if (!script_flow_file) { +if (!context->getProperty(property_name, value)) { + Py_RETURN_NONE; +} + } else { +auto py_flow = reinterpret_cast(script_flow_file); +const auto flow_file = py_flow->script_flow_file_.lock(); +if (!flow_file) { + PyErr_SetString(PyExc_AttributeError, "tried reading FlowFile outside 'on_trigger'"); + return nullptr; +} Review Comment: We should leave the door open for a future refactoring of flow file ownership, where onTrigger calls have non-owning references to flow files, without shared_ptr usage. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] MINIFICPP-2276 Support FlowFileTransform NiFi Python processors [nifi-minifi-cpp]
lordgamez commented on code in PR #1712: URL: https://github.com/apache/nifi-minifi-cpp/pull/1712#discussion_r1482851080 ## extensions/python/PYTHON.md: ## @@ -106,20 +127,39 @@ class VaderSentiment(object): To enable python Processor capabilities, the following options need to be provided in minifi.properties. The directory specified can contain processors. Note that the processor name will be the reference in your flow. Directories are treated like package names. Therefore if the nifi.python.processor.dir is /tmp/ and you have a subdirectory named packagedir with the file name file.py, it will -produce a processor with the name org.apache.nifi.minifi.processors.packagedir.file. Note that each subdirectory will append a package -to the reference class name. +produce a processor with the name org.apache.nifi.minifi.processors.packagedir.file. Note that each subdirectory will append a package +to the reference class name. in minifi.properties #directory where processors exist nifi.python.processor.dir= - - + + ## Processors The python directory (extensions/pythonprocessors) contains implementations that will be available for flows if the required dependencies exist. - -## Sentiment Analysis + +### Sentiment Analysis The SentimentAnalysis processor will perform a Vader Sentiment Analysis. This requires that you install nltk and VaderSentiment pip install nltk pip install VaderSentiment + +## Using NiFi Python Processors + +MiNiFi C++ supports the use of NiFi Python processors, that are inherited from the FlowFileTransform base class. To use these processors, you must copy the Python processor module under the nifi_python_processors directory located under the python directory (by default it can be found at ${minifi_root}/minifi-python/nifi_python_processors). To see how to to write NiFi Python processors, please refer to the Python Developer Guide under the [Apache NiFi documentation](https://nifi.apache.org/documentation/v2/). Review Comment: Fixed in fc069e05f898ca1b70414d26b325b3e6af1456c0 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] MINIFICPP-2276 Support FlowFileTransform NiFi Python processors [nifi-minifi-cpp]
adamdebreceni commented on code in PR #1712: URL: https://github.com/apache/nifi-minifi-cpp/pull/1712#discussion_r1481497293 ## extensions/python/PYTHON.md: ## @@ -106,20 +127,39 @@ class VaderSentiment(object): To enable python Processor capabilities, the following options need to be provided in minifi.properties. The directory specified can contain processors. Note that the processor name will be the reference in your flow. Directories are treated like package names. Therefore if the nifi.python.processor.dir is /tmp/ and you have a subdirectory named packagedir with the file name file.py, it will -produce a processor with the name org.apache.nifi.minifi.processors.packagedir.file. Note that each subdirectory will append a package -to the reference class name. +produce a processor with the name org.apache.nifi.minifi.processors.packagedir.file. Note that each subdirectory will append a package +to the reference class name. in minifi.properties #directory where processors exist nifi.python.processor.dir= - - + + ## Processors The python directory (extensions/pythonprocessors) contains implementations that will be available for flows if the required dependencies exist. - -## Sentiment Analysis + +### Sentiment Analysis The SentimentAnalysis processor will perform a Vader Sentiment Analysis. This requires that you install nltk and VaderSentiment pip install nltk pip install VaderSentiment + +## Using NiFi Python Processors + +MiNiFi C++ supports the use of NiFi Python processors, that are inherited from the FlowFileTransform base class. To use these processors, you must copy the Python processor module under the nifi_python_processors directory located under the python directory (by default it can be found at ${minifi_root}/minifi-python/nifi_python_processors). To see how to to write NiFi Python processors, please refer to the Python Developer Guide under the [Apache NiFi documentation](https://nifi.apache.org/documentation/v2/). Review Comment: "how to to write NiFi" -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] MINIFICPP-2276 Support FlowFileTransform NiFi Python processors [nifi-minifi-cpp]
lordgamez commented on code in PR #1712: URL: https://github.com/apache/nifi-minifi-cpp/pull/1712#discussion_r1467516253 ## extensions/python/ExecutePythonProcessor.h: ## @@ -118,7 +126,30 @@ class ExecutePythonProcessor : public core::Processor { return description_; } + void setPythonClassName(const std::string& python_class_name) { +python_class_name_ = python_class_name; + } + + void setPythonPaths(const std::vector& python_paths) { +python_paths_ = python_paths; + } + + protected: + core::Property* findProperty(const std::string& name) const override; + private: + enum class PropertyTypeCode : int64_t { +INTEGER_TYPE = 0, +LONG_TYPE = 1, +BOOLEAN_TYPE = 2, +DATA_SIZE_TYPE = 3, +TIME_PERIOD_TYPE = 4, +NON_BLANK_TYPE = 5, +PORT_TYPE = 6 + }; Review Comment: Updated in ee23065b91cd8ba88fb7db2dfeaa5c1cb31dae55 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] MINIFICPP-2276 Support FlowFileTransform NiFi Python processors [nifi-minifi-cpp]
lordgamez commented on code in PR #1712: URL: https://github.com/apache/nifi-minifi-cpp/pull/1712#discussion_r1465177492 ## extensions/python/ExecutePythonProcessor.h: ## @@ -118,7 +126,30 @@ class ExecutePythonProcessor : public core::Processor { return description_; } + void setPythonClassName(const std::string& python_class_name) { +python_class_name_ = python_class_name; + } + + void setPythonPaths(const std::vector& python_paths) { +python_paths_ = python_paths; + } + + protected: + core::Property* findProperty(const std::string& name) const override; + private: + enum class PropertyTypeCode : int64_t { +INTEGER_TYPE = 0, +LONG_TYPE = 1, +BOOLEAN_TYPE = 2, +DATA_SIZE_TYPE = 3, +TIME_PERIOD_TYPE = 4, +NON_BLANK_TYPE = 5, +PORT_TYPE = 6 + }; Review Comment: This is actually a reference to our `StandardPropertyTypes` found in MiNiFi defined in the `libminifi/include/core/PropertyType.h` header, but we can remove the suffix and I suppose it's better to have it as part of the `core::PropertyType` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] MINIFICPP-2276 Support FlowFileTransform NiFi Python processors [nifi-minifi-cpp]
lordgamez commented on code in PR #1712: URL: https://github.com/apache/nifi-minifi-cpp/pull/1712#discussion_r1465177492 ## extensions/python/ExecutePythonProcessor.h: ## @@ -118,7 +126,30 @@ class ExecutePythonProcessor : public core::Processor { return description_; } + void setPythonClassName(const std::string& python_class_name) { +python_class_name_ = python_class_name; + } + + void setPythonPaths(const std::vector& python_paths) { +python_paths_ = python_paths; + } + + protected: + core::Property* findProperty(const std::string& name) const override; + private: + enum class PropertyTypeCode : int64_t { +INTEGER_TYPE = 0, +LONG_TYPE = 1, +BOOLEAN_TYPE = 2, +DATA_SIZE_TYPE = 3, +TIME_PERIOD_TYPE = 4, +NON_BLANK_TYPE = 5, +PORT_TYPE = 6 + }; Review Comment: This is actually a reference to our `StandardPropertyTypes` found in MiNiFi defined in the `libminifi/include/core/PropertyType.h`, but we can remove the suffix and I suppose it's better to have it as part of the `core::PropertyType` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] MINIFICPP-2276 Support FlowFileTransform NiFi Python processors [nifi-minifi-cpp]
szaszm commented on code in PR #1712: URL: https://github.com/apache/nifi-minifi-cpp/pull/1712#discussion_r1463510235 ## extensions/python/ExecutePythonProcessor.h: ## @@ -118,7 +126,30 @@ class ExecutePythonProcessor : public core::Processor { return description_; } + void setPythonClassName(const std::string& python_class_name) { +python_class_name_ = python_class_name; + } + + void setPythonPaths(const std::vector& python_paths) { +python_paths_ = python_paths; + } + + protected: + core::Property* findProperty(const std::string& name) const override; + private: + enum class PropertyTypeCode : int64_t { +INTEGER_TYPE = 0, +LONG_TYPE = 1, +BOOLEAN_TYPE = 2, +DATA_SIZE_TYPE = 3, +TIME_PERIOD_TYPE = 4, +NON_BLANK_TYPE = 5, +PORT_TYPE = 6 + }; Review Comment: Is the naming directly coming from NiFi? If not, I'd drop the `_TYPE` suffix of the alternatives. I also think we should consider moving this type and the translate function near `core::PropertyType`, since it's related, general, and not dependent on anything. A reference to NiFi code or docs where these codes are defined would also be nice, as a code comment. ## libminifi/include/core/ConfigurableComponent.h: ## @@ -212,6 +212,8 @@ class ConfigurableComponent { // Dynamic properties std::map dynamic_properties_; + virtual Property* findProperty(const std::string& name) const; + Review Comment: I was under the impression that `getProperty` was already `virtual`, but apparently that's not the case, so this new customization point makes sense knowing this. Thanks! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] MINIFICPP-2276 Support FlowFileTransform NiFi Python processors [nifi-minifi-cpp]
lordgamez commented on code in PR #1712: URL: https://github.com/apache/nifi-minifi-cpp/pull/1712#discussion_r1463029693 ## docker/test/integration/features/environment.py: ## @@ -47,6 +47,13 @@ def before_scenario(context, scenario): logging.info("Integration test setup at {time:%H:%M:%S.%f}".format(time=datetime.datetime.now())) context.test = MiNiFi_integration_test(context=context, feature_id=context.feature_id) + +if "USE_NIFI_PYTHON_PROCESSORS" in scenario.effective_tags: +if not context.image_store.is_conda_available_in_minifi_image() and context.image_store.get_minifi_image_python_version() < (3, 8, 1): +scenario.skip("NiFi Python processor tests use langchain library therefore Python 3.8.1 or later is needed in the MiNiFi container.") Review Comment: Updated in e8757f9c370b2ec92866a3731af1e6d9ae0edb74 ## libminifi/include/core/ProcessContext.h: ## @@ -135,6 +135,10 @@ class ProcessContext : public controller::ControllerServiceLookup, public core:: return getProperty(property.name, value); } + virtual bool getProperty(bool /*supports_expression_language*/, std::string_view property_name, std::string& value, const std::shared_ptr& /*flow_file*/) { +return getProperty(property_name, value); + } Review Comment: Updated in e8757f9c370b2ec92866a3731af1e6d9ae0edb74 ## extensions/python/PythonProcessor.h: ## @@ -37,7 +38,7 @@ class PythonProcessor { void setDescription(const std::string& desc); - void addProperty(const std::string& name, const std::string& description, const std::string& defaultvalue, bool required, bool el); + void addProperty(const std::string& name, const std::string& description, const std::optional& defaultvalue, bool required, bool el, const std::optional& validator_value); Review Comment: Updated in e8757f9c370b2ec92866a3731af1e6d9ae0edb74 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] MINIFICPP-2276 Support FlowFileTransform NiFi Python processors [nifi-minifi-cpp]
lordgamez commented on code in PR #1712: URL: https://github.com/apache/nifi-minifi-cpp/pull/1712#discussion_r1459202019 ## libminifi/include/core/ProcessContext.h: ## @@ -135,6 +135,10 @@ class ProcessContext : public controller::ControllerServiceLookup, public core:: return getProperty(property.name, value); } + virtual bool getProperty(bool /*supports_expression_language*/, std::string_view property_name, std::string& value, const std::shared_ptr& /*flow_file*/) { +return getProperty(property_name, value); + } Review Comment: We need an overload with the `flow_file` parameter that only needs the `property_name` as that is the only one available and this version was already available as a private function. It is probably better to create a new `Property` constructor overload with these parameters, wrap these and pass it to the already available `getProperty` overload. ## libminifi/include/core/ConfigurableComponent.h: ## @@ -212,6 +212,8 @@ class ConfigurableComponent { // Dynamic properties std::map dynamic_properties_; + virtual Property* findProperty(const std::string& name) const; + Review Comment: In the python processors' properties are stored separately in the `python_properties_` vector. Because of this we need to override how to search for a property in getProperty and setProperty functions in the `ExecutePythonProcessor` class. Overriding this `findProperty` provides this, all other `getProperty` methods are either template functions or have an output reference parameter, which we cannot use for this. ## extensions/python/tests/PythonManifestTests.cpp: ## @@ -101,7 +101,7 @@ TEST_CASE("Python processor's description is part of the manifest") { REQUIRE(getNode(MyPyProc->children, "type").value == "org.apache.nifi.minifi.processors.MyPyProc"); auto& rels = getNode(MyPyProc->children, "supportedRelationships").children; -REQUIRE(rels.size() == 2); +REQUIRE(rels.size() == 3); Review Comment: I added the "original" default relationship to the python processors as the NiFi python processors have it. ## extensions/python/PythonProcessor.h: ## @@ -37,7 +38,7 @@ class PythonProcessor { void setDescription(const std::string& desc); - void addProperty(const std::string& name, const std::string& description, const std::string& defaultvalue, bool required, bool el); + void addProperty(const std::string& name, const std::string& description, const std::optional& defaultvalue, bool required, bool el, const std::optional& validator_value); Review Comment: I added a mapping of the NiFi validators to corresponding MiNiFi Property types and represented them with an int value, with the options listed in the PropertyTypeCode enum. On the python side it's called a validator, but maybe it would be better to change it property_type_code here and there as well. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] MINIFICPP-2276 Support FlowFileTransform NiFi Python processors [nifi-minifi-cpp]
szaszm commented on code in PR #1712: URL: https://github.com/apache/nifi-minifi-cpp/pull/1712#discussion_r1459074581 ## libminifi/include/core/ProcessContext.h: ## @@ -135,6 +135,10 @@ class ProcessContext : public controller::ControllerServiceLookup, public core:: return getProperty(property.name, value); } + virtual bool getProperty(bool /*supports_expression_language*/, std::string_view property_name, std::string& value, const std::shared_ptr& /*flow_file*/) { +return getProperty(property_name, value); + } Review Comment: Why do we need a new virtual overload? I'd prefer a single virtual function as an customization point, and possibly other non-static overloads to handle different use cases, to keep things simple. ## docker/test/integration/features/environment.py: ## @@ -47,6 +47,13 @@ def before_scenario(context, scenario): logging.info("Integration test setup at {time:%H:%M:%S.%f}".format(time=datetime.datetime.now())) context.test = MiNiFi_integration_test(context=context, feature_id=context.feature_id) + +if "USE_NIFI_PYTHON_PROCESSORS" in scenario.effective_tags: +if not context.image_store.is_conda_available_in_minifi_image() and context.image_store.get_minifi_image_python_version() < (3, 8, 1): +scenario.skip("NiFi Python processor tests use langchain library therefore Python 3.8.1 or later is needed in the MiNiFi container.") Review Comment: I'd rephrase this. ```suggestion scenario.skip("NiFi Python processor tests use langchain library which requires Python 3.8.1 or later.") ``` ## libminifi/include/core/ConfigurableComponent.h: ## @@ -212,6 +212,8 @@ class ConfigurableComponent { // Dynamic properties std::map dynamic_properties_; + virtual Property* findProperty(const std::string& name) const; + Review Comment: Doesn't getProperty already handle this use case? ## extensions/python/PythonProcessor.h: ## @@ -37,7 +38,7 @@ class PythonProcessor { void setDescription(const std::string& desc); - void addProperty(const std::string& name, const std::string& description, const std::string& defaultvalue, bool required, bool el); + void addProperty(const std::string& name, const std::string& description, const std::optional& defaultvalue, bool required, bool el, const std::optional& validator_value); Review Comment: What's a validator value? ## extensions/python/tests/PythonManifestTests.cpp: ## @@ -101,7 +101,7 @@ TEST_CASE("Python processor's description is part of the manifest") { REQUIRE(getNode(MyPyProc->children, "type").value == "org.apache.nifi.minifi.processors.MyPyProc"); auto& rels = getNode(MyPyProc->children, "supportedRelationships").children; -REQUIRE(rels.size() == 2); +REQUIRE(rels.size() == 3); Review Comment: What changed here? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
Re: [PR] MINIFICPP-2276 Support FlowFileTransform NiFi Python processors [nifi-minifi-cpp]
lordgamez commented on PR #1712: URL: https://github.com/apache/nifi-minifi-cpp/pull/1712#issuecomment-1900083750 I think we should wait for the upcoming NiFi 2.0-M2 release with this PR as that release comes with some changes to the python processors: https://github.com/apache/nifi/pull/8253 Although we do not have support for use cases in the MiNiFi C++ manifest, we should adjust the NiFi python processor support to not break with these added elements. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[PR] MINIFICPP-2276 Support FlowFileTransform NiFi Python processors [nifi-minifi-cpp]
lordgamez opened a new pull request, #1712: URL: https://github.com/apache/nifi-minifi-cpp/pull/1712 https://issues.apache.org/jira/browse/MINIFICPP-2276 - Thank you for submitting a contribution to Apache NiFi - MiNiFi C++. In order to streamline the review of the contribution we ask you to ensure the following steps have been taken: ### For all changes: - [ ] Is there a JIRA ticket associated with this PR? Is it referenced in the commit message? - [ ] Does your PR title start with MINIFICPP- where is the JIRA number you are trying to resolve? Pay particular attention to the hyphen "-" character. - [ ] Has your PR been rebased against the latest commit within the target branch (typically main)? - [ ] Is your initial contribution a single, squashed commit? ### For code changes: - [ ] If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under [ASF 2.0](http://www.apache.org/legal/resolved.html#category-a)? - [ ] If applicable, have you updated the LICENSE file? - [ ] If applicable, have you updated the NOTICE file? ### For documentation related changes: - [ ] Have you ensured that format looks appropriate for the output in which it is rendered? ### Note: Please ensure that once the PR is submitted, you check GitHub Actions CI results for build issues and submit an update to your PR as soon as possible. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org