Re: [PR] MINIFICPP-2276 Support FlowFileTransform NiFi Python processors [nifi-minifi-cpp]

2024-02-29 Thread via GitHub


fgerlits closed pull request #1712: MINIFICPP-2276 Support FlowFileTransform 
NiFi Python processors
URL: https://github.com/apache/nifi-minifi-cpp/pull/1712


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] MINIFICPP-2276 Support FlowFileTransform NiFi Python processors [nifi-minifi-cpp]

2024-02-29 Thread via GitHub


lordgamez commented on code in PR #1712:
URL: https://github.com/apache/nifi-minifi-cpp/pull/1712#discussion_r1507614295


##
extensions/python/ExecutePythonProcessor.cpp:
##
@@ -44,8 +45,7 @@ void ExecutePythonProcessor::initialize() {
 
   try {
 loadScript();
-  } catch(const std::runtime_error&) {
-logger_->log_warn("Could not load python script while initializing. In 
case of non-native python processor this is normal and will be done in the 
schedule phase.");
+  } catch(const std::runtime_error& err) {

Review Comment:
   Updated in 6d382e80a43d2c4de48091dff2e91e7b951edfb1



##
extensions/python/PythonScriptEngine.cpp:
##
@@ -180,4 +198,66 @@ void PythonScriptEngine::evaluateModuleImports() {
   }
 }
 
+void PythonScriptEngine::initializeProcessorObject(const std::string& 
python_class_name) {
+  GlobalInterpreterLock gil;
+  if (auto python_class = bindings_[python_class_name]) {
+auto num_args = [&]() -> size_t {
+  auto class_init = 
OwnedObject(PyObject_GetAttrString(python_class->get(), "__init__"));
+  if (!class_init.get()) {
+return 0;
+  }
+
+  auto inspect_module = OwnedObject(PyImport_ImportModule("inspect"));
+  if (!inspect_module.get()) {
+return 0;
+  }
+
+  auto inspect_args = 
OwnedObject(PyObject_CallMethod(inspect_module.get(), "getfullargspec", "O", 
class_init.get()));
+  if (!inspect_args.get()) {
+return 0;
+  }
+
+  auto arg_list = OwnedObject(PyObject_GetAttrString(inspect_args.get(), 
"args"));
+  if (!arg_list.get()) {
+return 0;
+  }
+
+  return PyList_Size(arg_list.get());
+}();
+
+if (num_args > 1) {
+  auto kwargs = OwnedDict::create();
+  auto value = OwnedObject(Py_None);
+  kwargs.put("jvm", value);
+  auto args = OwnedObject(PyTuple_New(0));
+  processor_instance_ = OwnedObject(PyObject_Call(python_class->get(), 
args.get(), kwargs.get()));
+} else {
+  processor_instance_ = 
OwnedObject(PyObject_CallObject(python_class->get(), nullptr));
+}
+
+if (processor_instance_.get() == nullptr) {
+  throw PythonScriptException(PyException().what());
+}
+
+auto result = PyObject_SetAttrString(processor_instance_.get(), "logger", 
bindings_[std::string("log")]->get());

Review Comment:
   Good point about the interface, it should definitely be changed. Currently 
the issue is due to this ambiguity build error:

   ```
   
/home/ggyimesi/projects/nifi-minifi-cpp-fork/extensions/python/PythonScriptEngine.cpp:242:88:
 error: ambiguous overload for ‘operator[]’ (operand types are 
‘org::apache::nifi::minifi::extensions::python::OwnedDict’ {aka 
‘org::apache::nifi::minifi::extensions::python::Dict’}
 and ‘const char [4]’)
 242 | auto result = PyObject_SetAttrString(processor_instance_.get(), 
"logger", bindings_["log"]->get());
 |  
  ^
   
/home/ggyimesi/projects/nifi-minifi-cpp-fork/extensions/python/PythonScriptEngine.cpp:242:88:
 note: candidate: ‘operator[](long int, const char*)’ (built-in)
   In file included from 
/home/ggyimesi/projects/nifi-minifi-cpp-fork/extensions/python/PythonBindings.h:24,
from 
/home/ggyimesi/projects/nifi-minifi-cpp-fork/extensions/python/PythonScriptEngine.h:20,
from 
/home/ggyimesi/projects/nifi-minifi-cpp-fork/extensions/python/PythonScriptEngine.cpp:21:
   
/home/ggyimesi/projects/nifi-minifi-cpp-fork/extensions/python/types/Types.h:290:36:
 note: candidate: 
‘std::optional
 > 
org::apache::nifi::minifi::extensions::python::Dict::operator[](std::string_view)
 [with org::apache::nifi::minifi::extensions::python::ReferenceType 
reference_type = 
org::apache::nifi::minifi::extensions::python::ReferenceType::OWNED; 
std::string_view = std::basic_string_view]’
 290 |   std::optional operator[](std::string_view key) {
 |^~~~
   
/home/ggyimesi/projects/nifi-minifi-cpp-fork/extensions/python/PythonScriptEngine.cpp:246:88:
 error: ambiguous overload for ‘operator[]’ (operand types are 
‘org::apache::nifi::minifi::extensions::python::OwnedDict’ {aka 
‘org::apache::nifi::minifi::extensions::python::Dict’}
 and ‘const char [12]’)
   ```
   
   It doesn't need to be `std::string` but it still needs to explicitly define 
the parameter type due to this. I changed it to std::string_view which is the 
expected type currently in 6d382e80a43d2c4de48091dff2e91e7b951edfb1



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] MINIFICPP-2276 Support FlowFileTransform NiFi Python processors [nifi-minifi-cpp]

2024-02-29 Thread via GitHub


szaszm commented on code in PR #1712:
URL: https://github.com/apache/nifi-minifi-cpp/pull/1712#discussion_r1505967478


##
extensions/python/ExecutePythonProcessor.cpp:
##
@@ -44,8 +45,7 @@ void ExecutePythonProcessor::initialize() {
 
   try {
 loadScript();
-  } catch(const std::runtime_error&) {
-logger_->log_warn("Could not load python script while initializing. In 
case of non-native python processor this is normal and will be done in the 
schedule phase.");
+  } catch(const std::runtime_error& err) {

Review Comment:
   I'd keep this unnamed as long as it's not referenced.
   ```suggestion
 } catch(const std::runtime_error&) {
   ```



##
extensions/python/PythonScriptEngine.cpp:
##
@@ -180,4 +198,66 @@ void PythonScriptEngine::evaluateModuleImports() {
   }
 }
 
+void PythonScriptEngine::initializeProcessorObject(const std::string& 
python_class_name) {
+  GlobalInterpreterLock gil;
+  if (auto python_class = bindings_[python_class_name]) {
+auto num_args = [&]() -> size_t {
+  auto class_init = 
OwnedObject(PyObject_GetAttrString(python_class->get(), "__init__"));
+  if (!class_init.get()) {
+return 0;
+  }
+
+  auto inspect_module = OwnedObject(PyImport_ImportModule("inspect"));
+  if (!inspect_module.get()) {
+return 0;
+  }
+
+  auto inspect_args = 
OwnedObject(PyObject_CallMethod(inspect_module.get(), "getfullargspec", "O", 
class_init.get()));
+  if (!inspect_args.get()) {
+return 0;
+  }
+
+  auto arg_list = OwnedObject(PyObject_GetAttrString(inspect_args.get(), 
"args"));
+  if (!arg_list.get()) {
+return 0;
+  }
+
+  return PyList_Size(arg_list.get());
+}();
+
+if (num_args > 1) {
+  auto kwargs = OwnedDict::create();
+  auto value = OwnedObject(Py_None);
+  kwargs.put("jvm", value);
+  auto args = OwnedObject(PyTuple_New(0));
+  processor_instance_ = OwnedObject(PyObject_Call(python_class->get(), 
args.get(), kwargs.get()));
+} else {
+  processor_instance_ = 
OwnedObject(PyObject_CallObject(python_class->get(), nullptr));
+}
+
+if (processor_instance_.get() == nullptr) {
+  throw PythonScriptException(PyException().what());
+}
+
+auto result = PyObject_SetAttrString(processor_instance_.get(), "logger", 
bindings_[std::string("log")]->get());

Review Comment:
   Why are the keys wrapped in `std::string`? String literals are `const char*` 
and are already null-terminated, so if that's the only reason, I think they 
should be just raw literals.
   
   The interface is confusing and dangerous, taking `std::string_view` when it 
needs a null-terminated string internally, and we should change that in my 
opinion, but not in this PR. Created Jira for this, but let me know if I'm 
missing something about this: 
[MINIFICPP-2308](https://issues.apache.org/jira/browse/MINIFICPP-2308)



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] MINIFICPP-2276 Support FlowFileTransform NiFi Python processors [nifi-minifi-cpp]

2024-02-23 Thread via GitHub


lordgamez commented on code in PR #1712:
URL: https://github.com/apache/nifi-minifi-cpp/pull/1712#discussion_r1500838899


##
extensions/python/ExecutePythonProcessor.h:
##
@@ -97,10 +99,8 @@ class ExecutePythonProcessor : public core::Processor {
 python_dynamic_ = true;
   }
 
-  void addProperty(const std::string , const std::string , 
const std::string , bool required, bool el) {
-python_properties_.emplace_back(
-
core::PropertyDefinitionBuilder<>::createProperty(name).withDefaultValue(defaultvalue).withDescription(description).isRequired(required).supportsExpressionLanguage(el).build());
-  }
+  void addProperty(const std::string , const std::string , 
const std::optional , bool required, bool el,
+  bool sensitive, const std::optional& property_type_code);
 
   const std::vector () const {
 return python_properties_;

Review Comment:
   Good point, updated in d3f47c57d21829d1715bd794f8c08a462f4fa361



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] MINIFICPP-2276 Support FlowFileTransform NiFi Python processors [nifi-minifi-cpp]

2024-02-23 Thread via GitHub


fgerlits commented on code in PR #1712:
URL: https://github.com/apache/nifi-minifi-cpp/pull/1712#discussion_r1500803117


##
extensions/python/ExecutePythonProcessor.h:
##
@@ -97,10 +99,8 @@ class ExecutePythonProcessor : public core::Processor {
 python_dynamic_ = true;
   }
 
-  void addProperty(const std::string , const std::string , 
const std::string , bool required, bool el) {
-python_properties_.emplace_back(
-
core::PropertyDefinitionBuilder<>::createProperty(name).withDefaultValue(defaultvalue).withDescription(description).isRequired(required).supportsExpressionLanguage(el).build());
-  }
+  void addProperty(const std::string , const std::string , 
const std::optional , bool required, bool el,
+  bool sensitive, const std::optional& property_type_code);
 
   const std::vector () const {
 return python_properties_;

Review Comment:
   I think we should acquire the mutex here, too, and return a copy.  The copy 
will happen anyway at the calling site in 
`PythonCreator::registerScriptDescription`; locking the mutex adds some 
overhead, but it is necessary.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] MINIFICPP-2276 Support FlowFileTransform NiFi Python processors [nifi-minifi-cpp]

2024-02-23 Thread via GitHub


lordgamez commented on code in PR #1712:
URL: https://github.com/apache/nifi-minifi-cpp/pull/1712#discussion_r1500753007


##
extensions/python/ExecutePythonProcessor.cpp:
##
@@ -44,8 +45,7 @@ void ExecutePythonProcessor::initialize() {
 
   try {
 loadScript();
-  } catch(const std::runtime_error&) {
-logger_->log_warn("Could not load python script while initializing. In 
case of non-native python processor this is normal and will be done in the 
schedule phase.");

Review Comment:
   Sure, added jira ticket: https://issues.apache.org/jira/browse/MINIFICPP-2304



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] MINIFICPP-2276 Support FlowFileTransform NiFi Python processors [nifi-minifi-cpp]

2024-02-23 Thread via GitHub


fgerlits commented on code in PR #1712:
URL: https://github.com/apache/nifi-minifi-cpp/pull/1712#discussion_r1500714476


##
extensions/python/ExecutePythonProcessor.cpp:
##
@@ -44,8 +45,7 @@ void ExecutePythonProcessor::initialize() {
 
   try {
 loadScript();
-  } catch(const std::runtime_error&) {
-logger_->log_warn("Could not load python script while initializing. In 
case of non-native python processor this is normal and will be done in the 
schedule phase.");

Review Comment:
   I see, thanks! Can you create a Jira for clearing this up later?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] MINIFICPP-2276 Support FlowFileTransform NiFi Python processors [nifi-minifi-cpp]

2024-02-23 Thread via GitHub


lordgamez commented on code in PR #1712:
URL: https://github.com/apache/nifi-minifi-cpp/pull/1712#discussion_r1500709305


##
extensions/python/types/Types.h:
##
@@ -160,7 +160,18 @@ class Long : public ReferenceHolder {
   }
 
   int64_t asInt64() {
-return static_cast(PyLong_AsLongLong(this->ref_.get()));
+auto long_value = PyLong_AsLongLong(this->ref_.get());
+if (long_value == -1 && PyErr_Occurred()) {
+  throw PyException();
+}
+return static_cast(long_value);

Review Comment:
   Updated in 6593d9b63514dd2a3b2060b747680f37691abecd



##
libminifi/src/core/ConfigurableComponent.cpp:
##
@@ -36,19 +36,28 @@ ConfigurableComponent::ConfigurableComponent()
 
 ConfigurableComponent::~ConfigurableComponent() = default;
 
+Property* ConfigurableComponent::findProperty(const std::string& name) const {

Review Comment:
   Updated in 6593d9b63514dd2a3b2060b747680f37691abecd



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] MINIFICPP-2276 Support FlowFileTransform NiFi Python processors [nifi-minifi-cpp]

2024-02-23 Thread via GitHub


lordgamez commented on code in PR #1712:
URL: https://github.com/apache/nifi-minifi-cpp/pull/1712#discussion_r1500708936


##
extensions/python/ExecutePythonProcessor.cpp:
##
@@ -146,11 +151,38 @@ std::unique_ptr 
ExecutePythonProcessor::createScriptEngine()
   auto engine = std::make_unique();
 
   python_logger_ = 
core::logging::LoggerFactory::getAliasedLogger(getName());
-  engine->initialize(Success, Failure, python_logger_);
+  engine->initialize(Success, Failure, Original, python_logger_);
 
   return engine;
 }
 
+core::Property* ExecutePythonProcessor::findProperty(const std::string& name) 
const {
+  if (auto prop_ptr = core::ConfigurableComponent::findProperty(name)) {
+return prop_ptr;
+  }
+
+  auto it = ranges::find_if(python_properties_, [](const auto& item){
+return item.getName() == name;
+  });
+  if (it != python_properties_.end()) {
+return const_cast(&*it);
+  }
+
+  return nullptr;
+}
+
+std::map ExecutePythonProcessor::getProperties() 
const {
+  auto result = ConfigurableComponent::getProperties();
+
+  std::lock_guard lock(configuration_mutex_);

Review Comment:
   You are right, it should be a separate mutex for the `python_properties_`, 
it didn't really make sense this way. Updated in 
6593d9b63514dd2a3b2060b747680f37691abecd



##
extensions/python/PYTHON.md:
##
@@ -106,20 +127,39 @@ class VaderSentiment(object):
 To enable python Processor capabilities, the following options need to be 
provided in minifi.properties. The directory specified
 can contain processors. Note that the processor name will be the reference in 
your flow. Directories are treated like package names.
 Therefore if the nifi.python.processor.dir is /tmp/ and you have a 
subdirectory named packagedir with the file name file.py, it will
-produce a processor with the name 
org.apache.nifi.minifi.processors.packagedir.file. Note that each subdirectory 
will append a package 
-to the reference class name. 
+produce a processor with the name 
org.apache.nifi.minifi.processors.packagedir.file. Note that each subdirectory 
will append a package
+to the reference class name.
 
 in minifi.properties
#directory where processors exist
nifi.python.processor.dir=
-   
-   
+
+
 ## Processors
 The python directory (extensions/pythonprocessors) contains implementations 
that will be available for flows if the required dependencies
 exist.
-   
-## Sentiment Analysis
+
+### Sentiment Analysis
 
 The SentimentAnalysis processor will perform a Vader Sentiment Analysis. This 
requires that you install nltk and VaderSentiment
pip install nltk
pip install VaderSentiment
+
+## Using NiFi Python Processors
+
+MiNiFi C++ supports the use of NiFi Python processors, that are inherited from 
the FlowFileTransform base class. To use these processors, you must copy the 
Python processor module under the nifi_python_processors directory located 
under the python directory (by default it can be found at 
${minifi_root}/minifi-python/nifi_python_processors). To see how to write NiFi 
Python processors, please refer to the Python Developer Guide under the [Apache 
NiFi documentation](https://nifi.apache.org/documentation/v2/).
+
+In the flow configuration these Python processors can be referenced by their 
fully qualified class name, which looks like this: 
org.apache.nifi.minifi.processors.nifi_python_processors...
 For example, the fully qualified class name of the PromptChatGPT processor 
implemented in the file nifi_python_processors/PromptChatGPT.py is 
org.apache.nifi.minifi.processors.nifi_python_processors.PromptChatGPT. If a 
processor is copied under a subdirectory, because it is part of a python 
submodule, the submodule name will be appended to the fully qualified class 
name. For example, if the QueryPinecone processor is implemented in the 
QueryPinecone.py file that is copied to 
nifi_python_processors/vectorstores/QueryPinecone.py, the fully qualified class 
name will be 
org.apache.nifi.minifi.processors.nifi_python_processors.vectorstores.QueryPinecone
 in the configuration file.
+
+**NOTE:** The name of the NiFi Python processor file should match the class 
name in the file, otherwise the processor will not be found.
+
+Due to some differences between the NiFi and MiNiFi C++ processors and 
implementation, there are some limitations using the NiFi Python processors:
+- Record based processors are not yet supported in MiNiFi C++, so the NiFi 
Python processors inherited from RecordTransform are not supported.
+- Virtualenv support is not yet available in MiNiFi C++, so all required 
packaged must be installed on the system.

Review Comment:
   Updated in 6593d9b63514dd2a3b2060b747680f37691abecd



##
extensions/python/PYTHON.md:
##
@@ -106,20 +127,39 @@ class VaderSentiment(object):
 To enable python Processor capabilities, the following options need to be 
provided in minifi.properties. The 

Re: [PR] MINIFICPP-2276 Support FlowFileTransform NiFi Python processors [nifi-minifi-cpp]

2024-02-23 Thread via GitHub


lordgamez commented on code in PR #1712:
URL: https://github.com/apache/nifi-minifi-cpp/pull/1712#discussion_r1500664052


##
extensions/python/types/PyProcessContext.cpp:
##
@@ -65,12 +66,31 @@ PyObject* PyProcessContext::getProperty(PyProcessContext* 
self, PyObject* args)
 return nullptr;
   }
 
-  const char* property;
-  if (!PyArg_ParseTuple(args, "s", )) {
+  const char* property_name = nullptr;
+  PyObject* script_flow_file = nullptr;
+  if (!PyArg_ParseTuple(args, "s|O", _name, _flow_file)) {
 throw PyException();
   }
+
   std::string value;
-  context->getProperty(property, value);
+  if (!script_flow_file) {
+if (!context->getProperty(property_name, value)) {
+  Py_RETURN_NONE;
+}
+  } else {
+auto py_flow = reinterpret_cast(script_flow_file);
+const auto flow_file = py_flow->script_flow_file_.lock();
+if (!flow_file) {
+  PyErr_SetString(PyExc_AttributeError, "tried reading FlowFile outside 
'on_trigger'");
+  return nullptr;

Review Comment:
   Py_RETURN_NONE returns a Python object with the Python `None` value which 
can be a valid return value in some cases, while the returning `nullptr` 
indicates an error, and python throws an exception if nullptr is returned from 
the C API with the error set in the `PyErr_SetString`.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] MINIFICPP-2276 Support FlowFileTransform NiFi Python processors [nifi-minifi-cpp]

2024-02-23 Thread via GitHub


lordgamez commented on code in PR #1712:
URL: https://github.com/apache/nifi-minifi-cpp/pull/1712#discussion_r1500637729


##
extensions/python/PythonScriptEngine.cpp:
##
@@ -180,4 +198,66 @@ void PythonScriptEngine::evaluateModuleImports() {
   }
 }
 
+void PythonScriptEngine::initializeProcessorObject(const std::string& 
python_class_name) {
+  GlobalInterpreterLock gil;
+  if (auto python_class = bindings_[python_class_name]) {
+auto num_args = [&]() -> size_t {
+  auto class_init = 
OwnedObject(PyObject_GetAttrString(python_class->get(), "__init__"));
+  if (!class_init.get()) {
+return 0;
+  }
+
+  auto inspect_module = OwnedObject(PyImport_ImportModule("inspect"));
+  if (!inspect_module.get()) {
+return 0;
+  }
+
+  auto inspect_args = 
OwnedObject(PyObject_CallMethod(inspect_module.get(), "getfullargspec", "O", 
class_init.get()));
+  if (!inspect_args.get()) {
+return 0;
+  }
+
+  auto arg_list = OwnedObject(PyObject_GetAttrString(inspect_args.get(), 
"args"));
+  if (!arg_list.get()) {
+return 0;
+  }
+
+  return PyList_Size(arg_list.get());
+}();
+
+if (num_args > 1) {
+  auto kwargs = OwnedDict::create();
+  auto value = OwnedObject(Py_None);
+  kwargs.put("jvm", value);
+  auto args = OwnedObject(PyTuple_New(0));

Review Comment:
   In NiFi there are scenarios when the `jvm` object is passed to the 
constructor, either as an arg or as a named kwarg and we only expect this one 
optional argument. We only want to pass this as part of the kwargs argument of 
the constructor call of the Python processor object, but before that we need to 
pass the positional args as well. As we do not want to pass any positional 
args, the python C API expects us to pass a zero length tuple which is a 
`PyTuple_New(0)` object to the `PyObject_Call` function in this case.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] MINIFICPP-2276 Support FlowFileTransform NiFi Python processors [nifi-minifi-cpp]

2024-02-23 Thread via GitHub


lordgamez commented on code in PR #1712:
URL: https://github.com/apache/nifi-minifi-cpp/pull/1712#discussion_r1500602870


##
extensions/python/ExecutePythonProcessor.cpp:
##
@@ -44,8 +45,7 @@ void ExecutePythonProcessor::initialize() {
 
   try {
 loadScript();
-  } catch(const std::runtime_error&) {
-logger_->log_warn("Could not load python script while initializing. In 
case of non-native python processor this is normal and will be done in the 
schedule phase.");

Review Comment:
   Unfortunately the python processors are instantiated twice. We have to 
instantiate the processors defined in the flow, but also we instantiate all 
python processors when building the manifest to get the description and the 
processor properties. In the latter scenario the initialize is called twice, 
once before setting the script file in the `PythonObjectFactory.h` (to 
initialize the supported properties), and in that case the loadScript() will 
always fail, because the ScriptFile property value is not set yet. As this will 
always happen the log is useless here. This double initialization is a bit 
wonky and should be worked out sometime in the future.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] MINIFICPP-2276 Support FlowFileTransform NiFi Python processors [nifi-minifi-cpp]

2024-02-22 Thread via GitHub


fgerlits commented on code in PR #1712:
URL: https://github.com/apache/nifi-minifi-cpp/pull/1712#discussion_r1497594578


##
extensions/python/PYTHON.md:
##
@@ -106,20 +127,39 @@ class VaderSentiment(object):
 To enable python Processor capabilities, the following options need to be 
provided in minifi.properties. The directory specified
 can contain processors. Note that the processor name will be the reference in 
your flow. Directories are treated like package names.
 Therefore if the nifi.python.processor.dir is /tmp/ and you have a 
subdirectory named packagedir with the file name file.py, it will
-produce a processor with the name 
org.apache.nifi.minifi.processors.packagedir.file. Note that each subdirectory 
will append a package 
-to the reference class name. 
+produce a processor with the name 
org.apache.nifi.minifi.processors.packagedir.file. Note that each subdirectory 
will append a package
+to the reference class name.
 
 in minifi.properties
#directory where processors exist
nifi.python.processor.dir=
-   
-   
+
+
 ## Processors
 The python directory (extensions/pythonprocessors) contains implementations 
that will be available for flows if the required dependencies
 exist.
-   
-## Sentiment Analysis
+
+### Sentiment Analysis
 
 The SentimentAnalysis processor will perform a Vader Sentiment Analysis. This 
requires that you install nltk and VaderSentiment
pip install nltk
pip install VaderSentiment
+
+## Using NiFi Python Processors
+
+MiNiFi C++ supports the use of NiFi Python processors, that are inherited from 
the FlowFileTransform base class. To use these processors, you must copy the 
Python processor module under the nifi_python_processors directory located 
under the python directory (by default it can be found at 
${minifi_root}/minifi-python/nifi_python_processors). To see how to write NiFi 
Python processors, please refer to the Python Developer Guide under the [Apache 
NiFi documentation](https://nifi.apache.org/documentation/v2/).
+
+In the flow configuration these Python processors can be referenced by their 
fully qualified class name, which looks like this: 
org.apache.nifi.minifi.processors.nifi_python_processors...
 For example, the fully qualified class name of the PromptChatGPT processor 
implemented in the file nifi_python_processors/PromptChatGPT.py is 
org.apache.nifi.minifi.processors.nifi_python_processors.PromptChatGPT. If a 
processor is copied under a subdirectory, because it is part of a python 
submodule, the submodule name will be appended to the fully qualified class 
name. For example, if the QueryPinecone processor is implemented in the 
QueryPinecone.py file that is copied to 
nifi_python_processors/vectorstores/QueryPinecone.py, the fully qualified class 
name will be 
org.apache.nifi.minifi.processors.nifi_python_processors.vectorstores.QueryPinecone
 in the configuration file.
+
+**NOTE:** The name of the NiFi Python processor file should match the class 
name in the file, otherwise the processor will not be found.
+
+Due to some differences between the NiFi and MiNiFi C++ processors and 
implementation, there are some limitations using the NiFi Python processors:
+- Record based processors are not yet supported in MiNiFi C++, so the NiFi 
Python processors inherited from RecordTransform are not supported.
+- Virtualenv support is not yet available in MiNiFi C++, so all required 
packaged must be installed on the system.

Review Comment:
   ```suggestion
   - Virtualenv support is not yet available in MiNiFi C++, so all required 
packages must be installed on the system.
   ```



##
extensions/python/ExecutePythonProcessor.cpp:
##
@@ -44,8 +45,7 @@ void ExecutePythonProcessor::initialize() {
 
   try {
 loadScript();
-  } catch(const std::runtime_error&) {
-logger_->log_warn("Could not load python script while initializing. In 
case of non-native python processor this is normal and will be done in the 
schedule phase.");

Review Comment:
   Is this log not useful at all?  Maybe we could change it to a lower level 
(info or debug) instead of removing it.



##
extensions/python/types/Types.h:
##
@@ -160,7 +160,18 @@ class Long : public ReferenceHolder {
   }
 
   int64_t asInt64() {
-return static_cast(PyLong_AsLongLong(this->ref_.get()));
+auto long_value = PyLong_AsLongLong(this->ref_.get());
+if (long_value == -1 && PyErr_Occurred()) {
+  throw PyException();
+}
+return static_cast(long_value);

Review Comment:
   I may be overly paranoid, but I'd prefer `gsl::narrow` instead of 
`static_cast`.



##
extensions/python/PYTHON.md:
##
@@ -106,20 +127,39 @@ class VaderSentiment(object):
 To enable python Processor capabilities, the following options need to be 
provided in minifi.properties. The directory specified
 can contain processors. Note that the processor name will be the reference in 

Re: [PR] MINIFICPP-2276 Support FlowFileTransform NiFi Python processors [nifi-minifi-cpp]

2024-02-12 Thread via GitHub


szaszm commented on code in PR #1712:
URL: https://github.com/apache/nifi-minifi-cpp/pull/1712#discussion_r1486436629


##
extensions/python/types/PyProcessContext.cpp:
##
@@ -65,12 +66,31 @@ PyObject* PyProcessContext::getProperty(PyProcessContext* 
self, PyObject* args)
 return nullptr;
   }
 
-  const char* property;
-  if (!PyArg_ParseTuple(args, "s", )) {
+  const char* property_name = nullptr;
+  PyObject* script_flow_file = nullptr;
+  if (!PyArg_ParseTuple(args, "s|O", _name, _flow_file)) {
 throw PyException();
   }
+
   std::string value;
-  context->getProperty(property, value);
+  if (!script_flow_file) {
+if (!context->getProperty(property_name, value)) {
+  Py_RETURN_NONE;
+}
+  } else {
+auto py_flow = reinterpret_cast(script_flow_file);
+const auto flow_file = py_flow->script_flow_file_.lock();
+if (!flow_file) {
+  PyErr_SetString(PyExc_AttributeError, "tried reading FlowFile outside 
'on_trigger'");
+  return nullptr;
+}

Review Comment:
   No, I was only thinking about raw non-owning pointers instead of weak_ptr, 
but it looks like that's not feasible for now.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] MINIFICPP-2276 Support FlowFileTransform NiFi Python processors [nifi-minifi-cpp]

2024-02-08 Thread via GitHub


lordgamez commented on code in PR #1712:
URL: https://github.com/apache/nifi-minifi-cpp/pull/1712#discussion_r1483114670


##
extensions/python/types/PyProcessContext.cpp:
##
@@ -65,12 +66,31 @@ PyObject* PyProcessContext::getProperty(PyProcessContext* 
self, PyObject* args)
 return nullptr;
   }
 
-  const char* property;
-  if (!PyArg_ParseTuple(args, "s", )) {
+  const char* property_name = nullptr;
+  PyObject* script_flow_file = nullptr;
+  if (!PyArg_ParseTuple(args, "s|O", _name, _flow_file)) {
 throw PyException();
   }
+
   std::string value;
-  context->getProperty(property, value);
+  if (!script_flow_file) {
+if (!context->getProperty(property_name, value)) {
+  Py_RETURN_NONE;
+}
+  } else {
+auto py_flow = reinterpret_cast(script_flow_file);
+const auto flow_file = py_flow->script_flow_file_.lock();
+if (!flow_file) {
+  PyErr_SetString(PyExc_AttributeError, "tried reading FlowFile outside 
'on_trigger'");
+  return nullptr;
+}

Review Comment:
   Currently the held type of is `std::weak_ptr` in both 
`PyScriptFlowFile` and `PyScriptFlowFile`, where this is the way the held 
objects are handled in both files. We can refactor this in the future, but for 
the time being I think we should stick with using the same handling here as 
well, or is there anything else you meant by leaving the door open?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] MINIFICPP-2276 Support FlowFileTransform NiFi Python processors [nifi-minifi-cpp]

2024-02-08 Thread via GitHub


lordgamez commented on code in PR #1712:
URL: https://github.com/apache/nifi-minifi-cpp/pull/1712#discussion_r1483114670


##
extensions/python/types/PyProcessContext.cpp:
##
@@ -65,12 +66,31 @@ PyObject* PyProcessContext::getProperty(PyProcessContext* 
self, PyObject* args)
 return nullptr;
   }
 
-  const char* property;
-  if (!PyArg_ParseTuple(args, "s", )) {
+  const char* property_name = nullptr;
+  PyObject* script_flow_file = nullptr;
+  if (!PyArg_ParseTuple(args, "s|O", _name, _flow_file)) {
 throw PyException();
   }
+
   std::string value;
-  context->getProperty(property, value);
+  if (!script_flow_file) {
+if (!context->getProperty(property_name, value)) {
+  Py_RETURN_NONE;
+}
+  } else {
+auto py_flow = reinterpret_cast(script_flow_file);
+const auto flow_file = py_flow->script_flow_file_.lock();
+if (!flow_file) {
+  PyErr_SetString(PyExc_AttributeError, "tried reading FlowFile outside 
'on_trigger'");
+  return nullptr;
+}

Review Comment:
   Currently the held type of is `std::weak_ptr` in both 
`PyScriptFlowFile` and `PyScriptFlowFile`, where this is the way the held 
objects are handled in both files. We can refactor this in the future, but for 
the time being I think we should stick with using the same handling here as 
well.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] MINIFICPP-2276 Support FlowFileTransform NiFi Python processors [nifi-minifi-cpp]

2024-02-08 Thread via GitHub


lordgamez commented on code in PR #1712:
URL: https://github.com/apache/nifi-minifi-cpp/pull/1712#discussion_r1483106461


##
docker/test/integration/resources/minifi-c2-server/config-ssl.yml:
##
@@ -0,0 +1,43 @@
+MiNiFi Config Version: 3
+Flow Controller:
+  name: MiNiFi Flow
+Processors:
+- name: Get files from /tmp/input
+  id: 2f2a3b47-f5ba-49f6-82b5-bc1c86b96e27
+  class: org.apache.nifi.minifi.processors.GetFile
+  scheduling strategy: TIMER_DRIVEN
+  scheduling period: 1000 ms
+  Properties:
+Input Directory: /tmp/input
+- name: Put files to /tmp/output
+  id: e143601d-de4f-44ba-a6ec-d1f97d77ec94
+  class: org.apache.nifi.minifi.processors.PutFile
+  scheduling strategy: EVENT_DRIVEN
+  auto-terminated relationships list:
+  - failure
+  - success
+  Properties:
+Conflict Resolution Strategy: fail
+Create Missing Directories: 'true'
+Directory: /tmp/output
+Connections:
+- name: GetFile/success/PutFile
+  id: 098a56ba-f4bf-4323-a3f3-6f8a5e3586bf
+  source id: 2f2a3b47-f5ba-49f6-82b5-bc1c86b96e27
+  source relationship names:
+  - success
+  destination id: e143601d-de4f-44ba-a6ec-d1f97d77ec94
+Controller Services:
+  - name: SSLContextService
+id: 2438e3c8-015a-1000-79ca-83af40ec1994
+class: SSLContextService

Review Comment:
   This is only used for C2, and this is only the config that is sent from the 
C2 server after the update. I actually just reverted the json files to the 
previously used yamls files for these tests and didn't check them. There are 
separate tests for using the controller service and using the ssl properties in 
the minifi.properties file. But we can actually remove this file, as we only 
check if the flow update succeeds and we don't care about the C2 communication 
afterwards so we can use the same config without the SSL context. Updated in 
3d207dca7c2cdf1f86f4b6308a8c02addb2a852d



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] MINIFICPP-2276 Support FlowFileTransform NiFi Python processors [nifi-minifi-cpp]

2024-02-08 Thread via GitHub


szaszm commented on code in PR #1712:
URL: https://github.com/apache/nifi-minifi-cpp/pull/1712#discussion_r1479706920


##
docker/test/integration/resources/minifi-c2-server/config-ssl.yml:
##
@@ -0,0 +1,43 @@
+MiNiFi Config Version: 3
+Flow Controller:
+  name: MiNiFi Flow
+Processors:
+- name: Get files from /tmp/input
+  id: 2f2a3b47-f5ba-49f6-82b5-bc1c86b96e27
+  class: org.apache.nifi.minifi.processors.GetFile
+  scheduling strategy: TIMER_DRIVEN
+  scheduling period: 1000 ms
+  Properties:
+Input Directory: /tmp/input
+- name: Put files to /tmp/output
+  id: e143601d-de4f-44ba-a6ec-d1f97d77ec94
+  class: org.apache.nifi.minifi.processors.PutFile
+  scheduling strategy: EVENT_DRIVEN
+  auto-terminated relationships list:
+  - failure
+  - success
+  Properties:
+Conflict Resolution Strategy: fail
+Create Missing Directories: 'true'
+Directory: /tmp/output
+Connections:
+- name: GetFile/success/PutFile
+  id: 098a56ba-f4bf-4323-a3f3-6f8a5e3586bf
+  source id: 2f2a3b47-f5ba-49f6-82b5-bc1c86b96e27
+  source relationship names:
+  - success
+  destination id: e143601d-de4f-44ba-a6ec-d1f97d77ec94
+Controller Services:
+  - name: SSLContextService
+id: 2438e3c8-015a-1000-79ca-83af40ec1994
+class: SSLContextService

Review Comment:
   Is this used for C2? Because the flow doesn't use it. If it's C2, I think it 
would be nicer to include a reference to C2 in its name, or configure SSL in 
the config properties in minifi.properties.



##
extensions/python/types/PyProcessContext.cpp:
##
@@ -65,12 +66,31 @@ PyObject* PyProcessContext::getProperty(PyProcessContext* 
self, PyObject* args)
 return nullptr;
   }
 
-  const char* property;
-  if (!PyArg_ParseTuple(args, "s", )) {
+  const char* property_name = nullptr;
+  PyObject* script_flow_file = nullptr;
+  if (!PyArg_ParseTuple(args, "s|O", _name, _flow_file)) {
 throw PyException();
   }
+
   std::string value;
-  context->getProperty(property, value);
+  if (!script_flow_file) {
+if (!context->getProperty(property_name, value)) {
+  Py_RETURN_NONE;
+}
+  } else {
+auto py_flow = reinterpret_cast(script_flow_file);
+const auto flow_file = py_flow->script_flow_file_.lock();
+if (!flow_file) {
+  PyErr_SetString(PyExc_AttributeError, "tried reading FlowFile outside 
'on_trigger'");
+  return nullptr;
+}

Review Comment:
   We should leave the door open for a future refactoring of flow file 
ownership, where onTrigger calls have non-owning references to flow files, 
without shared_ptr usage.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] MINIFICPP-2276 Support FlowFileTransform NiFi Python processors [nifi-minifi-cpp]

2024-02-08 Thread via GitHub


lordgamez commented on code in PR #1712:
URL: https://github.com/apache/nifi-minifi-cpp/pull/1712#discussion_r1482851080


##
extensions/python/PYTHON.md:
##
@@ -106,20 +127,39 @@ class VaderSentiment(object):
 To enable python Processor capabilities, the following options need to be 
provided in minifi.properties. The directory specified
 can contain processors. Note that the processor name will be the reference in 
your flow. Directories are treated like package names.
 Therefore if the nifi.python.processor.dir is /tmp/ and you have a 
subdirectory named packagedir with the file name file.py, it will
-produce a processor with the name 
org.apache.nifi.minifi.processors.packagedir.file. Note that each subdirectory 
will append a package 
-to the reference class name. 
+produce a processor with the name 
org.apache.nifi.minifi.processors.packagedir.file. Note that each subdirectory 
will append a package
+to the reference class name.
 
 in minifi.properties
#directory where processors exist
nifi.python.processor.dir=
-   
-   
+
+
 ## Processors
 The python directory (extensions/pythonprocessors) contains implementations 
that will be available for flows if the required dependencies
 exist.
-   
-## Sentiment Analysis
+
+### Sentiment Analysis
 
 The SentimentAnalysis processor will perform a Vader Sentiment Analysis. This 
requires that you install nltk and VaderSentiment
pip install nltk
pip install VaderSentiment
+
+## Using NiFi Python Processors
+
+MiNiFi C++ supports the use of NiFi Python processors, that are inherited from 
the FlowFileTransform base class. To use these processors, you must copy the 
Python processor module under the nifi_python_processors directory located 
under the python directory (by default it can be found at 
${minifi_root}/minifi-python/nifi_python_processors). To see how to to write 
NiFi Python processors, please refer to the Python Developer Guide under the 
[Apache NiFi documentation](https://nifi.apache.org/documentation/v2/).

Review Comment:
   Fixed in fc069e05f898ca1b70414d26b325b3e6af1456c0



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] MINIFICPP-2276 Support FlowFileTransform NiFi Python processors [nifi-minifi-cpp]

2024-02-07 Thread via GitHub


adamdebreceni commented on code in PR #1712:
URL: https://github.com/apache/nifi-minifi-cpp/pull/1712#discussion_r1481497293


##
extensions/python/PYTHON.md:
##
@@ -106,20 +127,39 @@ class VaderSentiment(object):
 To enable python Processor capabilities, the following options need to be 
provided in minifi.properties. The directory specified
 can contain processors. Note that the processor name will be the reference in 
your flow. Directories are treated like package names.
 Therefore if the nifi.python.processor.dir is /tmp/ and you have a 
subdirectory named packagedir with the file name file.py, it will
-produce a processor with the name 
org.apache.nifi.minifi.processors.packagedir.file. Note that each subdirectory 
will append a package 
-to the reference class name. 
+produce a processor with the name 
org.apache.nifi.minifi.processors.packagedir.file. Note that each subdirectory 
will append a package
+to the reference class name.
 
 in minifi.properties
#directory where processors exist
nifi.python.processor.dir=
-   
-   
+
+
 ## Processors
 The python directory (extensions/pythonprocessors) contains implementations 
that will be available for flows if the required dependencies
 exist.
-   
-## Sentiment Analysis
+
+### Sentiment Analysis
 
 The SentimentAnalysis processor will perform a Vader Sentiment Analysis. This 
requires that you install nltk and VaderSentiment
pip install nltk
pip install VaderSentiment
+
+## Using NiFi Python Processors
+
+MiNiFi C++ supports the use of NiFi Python processors, that are inherited from 
the FlowFileTransform base class. To use these processors, you must copy the 
Python processor module under the nifi_python_processors directory located 
under the python directory (by default it can be found at 
${minifi_root}/minifi-python/nifi_python_processors). To see how to to write 
NiFi Python processors, please refer to the Python Developer Guide under the 
[Apache NiFi documentation](https://nifi.apache.org/documentation/v2/).

Review Comment:
   "how to to write NiFi"



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] MINIFICPP-2276 Support FlowFileTransform NiFi Python processors [nifi-minifi-cpp]

2024-01-26 Thread via GitHub


lordgamez commented on code in PR #1712:
URL: https://github.com/apache/nifi-minifi-cpp/pull/1712#discussion_r1467516253


##
extensions/python/ExecutePythonProcessor.h:
##
@@ -118,7 +126,30 @@ class ExecutePythonProcessor : public core::Processor {
 return description_;
   }
 
+  void setPythonClassName(const std::string& python_class_name) {
+python_class_name_ = python_class_name;
+  }
+
+  void setPythonPaths(const std::vector& python_paths) {
+python_paths_ = python_paths;
+  }
+
+ protected:
+  core::Property* findProperty(const std::string& name) const override;
+
  private:
+  enum class PropertyTypeCode : int64_t {
+INTEGER_TYPE = 0,
+LONG_TYPE = 1,
+BOOLEAN_TYPE = 2,
+DATA_SIZE_TYPE = 3,
+TIME_PERIOD_TYPE = 4,
+NON_BLANK_TYPE = 5,
+PORT_TYPE = 6
+  };

Review Comment:
   Updated in ee23065b91cd8ba88fb7db2dfeaa5c1cb31dae55



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] MINIFICPP-2276 Support FlowFileTransform NiFi Python processors [nifi-minifi-cpp]

2024-01-24 Thread via GitHub


lordgamez commented on code in PR #1712:
URL: https://github.com/apache/nifi-minifi-cpp/pull/1712#discussion_r1465177492


##
extensions/python/ExecutePythonProcessor.h:
##
@@ -118,7 +126,30 @@ class ExecutePythonProcessor : public core::Processor {
 return description_;
   }
 
+  void setPythonClassName(const std::string& python_class_name) {
+python_class_name_ = python_class_name;
+  }
+
+  void setPythonPaths(const std::vector& python_paths) {
+python_paths_ = python_paths;
+  }
+
+ protected:
+  core::Property* findProperty(const std::string& name) const override;
+
  private:
+  enum class PropertyTypeCode : int64_t {
+INTEGER_TYPE = 0,
+LONG_TYPE = 1,
+BOOLEAN_TYPE = 2,
+DATA_SIZE_TYPE = 3,
+TIME_PERIOD_TYPE = 4,
+NON_BLANK_TYPE = 5,
+PORT_TYPE = 6
+  };

Review Comment:
   This is actually a reference to our `StandardPropertyTypes` found in MiNiFi 
defined in the `libminifi/include/core/PropertyType.h` header, but we can 
remove the suffix and I suppose it's better to have it as part of the 
`core::PropertyType`



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] MINIFICPP-2276 Support FlowFileTransform NiFi Python processors [nifi-minifi-cpp]

2024-01-24 Thread via GitHub


lordgamez commented on code in PR #1712:
URL: https://github.com/apache/nifi-minifi-cpp/pull/1712#discussion_r1465177492


##
extensions/python/ExecutePythonProcessor.h:
##
@@ -118,7 +126,30 @@ class ExecutePythonProcessor : public core::Processor {
 return description_;
   }
 
+  void setPythonClassName(const std::string& python_class_name) {
+python_class_name_ = python_class_name;
+  }
+
+  void setPythonPaths(const std::vector& python_paths) {
+python_paths_ = python_paths;
+  }
+
+ protected:
+  core::Property* findProperty(const std::string& name) const override;
+
  private:
+  enum class PropertyTypeCode : int64_t {
+INTEGER_TYPE = 0,
+LONG_TYPE = 1,
+BOOLEAN_TYPE = 2,
+DATA_SIZE_TYPE = 3,
+TIME_PERIOD_TYPE = 4,
+NON_BLANK_TYPE = 5,
+PORT_TYPE = 6
+  };

Review Comment:
   This is actually a reference to our `StandardPropertyTypes` found in MiNiFi 
defined in the `libminifi/include/core/PropertyType.h`, but we can remove the 
suffix and I suppose it's better to have it as part of the `core::PropertyType`



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] MINIFICPP-2276 Support FlowFileTransform NiFi Python processors [nifi-minifi-cpp]

2024-01-24 Thread via GitHub


szaszm commented on code in PR #1712:
URL: https://github.com/apache/nifi-minifi-cpp/pull/1712#discussion_r1463510235


##
extensions/python/ExecutePythonProcessor.h:
##
@@ -118,7 +126,30 @@ class ExecutePythonProcessor : public core::Processor {
 return description_;
   }
 
+  void setPythonClassName(const std::string& python_class_name) {
+python_class_name_ = python_class_name;
+  }
+
+  void setPythonPaths(const std::vector& python_paths) {
+python_paths_ = python_paths;
+  }
+
+ protected:
+  core::Property* findProperty(const std::string& name) const override;
+
  private:
+  enum class PropertyTypeCode : int64_t {
+INTEGER_TYPE = 0,
+LONG_TYPE = 1,
+BOOLEAN_TYPE = 2,
+DATA_SIZE_TYPE = 3,
+TIME_PERIOD_TYPE = 4,
+NON_BLANK_TYPE = 5,
+PORT_TYPE = 6
+  };

Review Comment:
   Is the naming directly coming from NiFi? If not, I'd drop the `_TYPE` suffix 
of the alternatives. I also think we should consider moving this type and the 
translate function near `core::PropertyType`, since it's related, general, and 
not dependent on anything.
   A reference to NiFi code or docs where these codes are defined would also be 
nice, as a code comment.



##
libminifi/include/core/ConfigurableComponent.h:
##
@@ -212,6 +212,8 @@ class ConfigurableComponent {
   // Dynamic properties
   std::map dynamic_properties_;
 
+  virtual Property* findProperty(const std::string& name) const;
+

Review Comment:
   I was under the impression that `getProperty` was already `virtual`, but 
apparently that's not the case, so this new customization point makes sense 
knowing this. Thanks!



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] MINIFICPP-2276 Support FlowFileTransform NiFi Python processors [nifi-minifi-cpp]

2024-01-23 Thread via GitHub


lordgamez commented on code in PR #1712:
URL: https://github.com/apache/nifi-minifi-cpp/pull/1712#discussion_r1463029693


##
docker/test/integration/features/environment.py:
##
@@ -47,6 +47,13 @@ def before_scenario(context, scenario):
 
 logging.info("Integration test setup at 
{time:%H:%M:%S.%f}".format(time=datetime.datetime.now()))
 context.test = MiNiFi_integration_test(context=context, 
feature_id=context.feature_id)
+
+if "USE_NIFI_PYTHON_PROCESSORS" in scenario.effective_tags:
+if not context.image_store.is_conda_available_in_minifi_image() and 
context.image_store.get_minifi_image_python_version() < (3, 8, 1):
+scenario.skip("NiFi Python processor tests use langchain library 
therefore Python 3.8.1 or later is needed in the MiNiFi container.")

Review Comment:
   Updated in e8757f9c370b2ec92866a3731af1e6d9ae0edb74



##
libminifi/include/core/ProcessContext.h:
##
@@ -135,6 +135,10 @@ class ProcessContext : public 
controller::ControllerServiceLookup, public core::
 return getProperty(property.name, value);
   }
 
+  virtual bool getProperty(bool /*supports_expression_language*/, 
std::string_view property_name, std::string& value, const 
std::shared_ptr& /*flow_file*/) {
+return getProperty(property_name, value);
+  }

Review Comment:
   Updated in e8757f9c370b2ec92866a3731af1e6d9ae0edb74



##
extensions/python/PythonProcessor.h:
##
@@ -37,7 +38,7 @@ class PythonProcessor {
 
   void setDescription(const std::string& desc);
 
-  void addProperty(const std::string& name, const std::string& description, 
const std::string& defaultvalue, bool required, bool el);
+  void addProperty(const std::string& name, const std::string& description, 
const std::optional& defaultvalue, bool required, bool el, const 
std::optional& validator_value);

Review Comment:
   Updated in e8757f9c370b2ec92866a3731af1e6d9ae0edb74



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] MINIFICPP-2276 Support FlowFileTransform NiFi Python processors [nifi-minifi-cpp]

2024-01-19 Thread via GitHub


lordgamez commented on code in PR #1712:
URL: https://github.com/apache/nifi-minifi-cpp/pull/1712#discussion_r1459202019


##
libminifi/include/core/ProcessContext.h:
##
@@ -135,6 +135,10 @@ class ProcessContext : public 
controller::ControllerServiceLookup, public core::
 return getProperty(property.name, value);
   }
 
+  virtual bool getProperty(bool /*supports_expression_language*/, 
std::string_view property_name, std::string& value, const 
std::shared_ptr& /*flow_file*/) {
+return getProperty(property_name, value);
+  }

Review Comment:
   We need an overload  with the `flow_file` parameter that only needs the 
`property_name` as that is the only one available and this version was already 
available as a private function. It is probably better to create a new 
`Property` constructor overload with these parameters, wrap these and pass it 
to the already available `getProperty` overload.



##
libminifi/include/core/ConfigurableComponent.h:
##
@@ -212,6 +212,8 @@ class ConfigurableComponent {
   // Dynamic properties
   std::map dynamic_properties_;
 
+  virtual Property* findProperty(const std::string& name) const;
+

Review Comment:
   In the python processors' properties are stored separately in the 
`python_properties_` vector. Because of this we need to override how to search 
for a property in getProperty and setProperty functions in the 
`ExecutePythonProcessor` class. Overriding this `findProperty` provides this, 
all other `getProperty` methods are either template functions or have an output 
reference parameter, which we cannot use for this.



##
extensions/python/tests/PythonManifestTests.cpp:
##
@@ -101,7 +101,7 @@ TEST_CASE("Python processor's description is part of the 
manifest") {
 REQUIRE(getNode(MyPyProc->children, "type").value == 
"org.apache.nifi.minifi.processors.MyPyProc");
 
 auto& rels = getNode(MyPyProc->children, 
"supportedRelationships").children;
-REQUIRE(rels.size() == 2);
+REQUIRE(rels.size() == 3);

Review Comment:
   I added the "original" default relationship to the python processors as the 
NiFi python processors have it.



##
extensions/python/PythonProcessor.h:
##
@@ -37,7 +38,7 @@ class PythonProcessor {
 
   void setDescription(const std::string& desc);
 
-  void addProperty(const std::string& name, const std::string& description, 
const std::string& defaultvalue, bool required, bool el);
+  void addProperty(const std::string& name, const std::string& description, 
const std::optional& defaultvalue, bool required, bool el, const 
std::optional& validator_value);

Review Comment:
   I added a mapping of the NiFi validators to corresponding MiNiFi Property 
types and represented them with an int value, with the options listed in the 
PropertyTypeCode enum. On the python side it's called a validator, but maybe it 
would be better to change it property_type_code here and there as well.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] MINIFICPP-2276 Support FlowFileTransform NiFi Python processors [nifi-minifi-cpp]

2024-01-19 Thread via GitHub


szaszm commented on code in PR #1712:
URL: https://github.com/apache/nifi-minifi-cpp/pull/1712#discussion_r1459074581


##
libminifi/include/core/ProcessContext.h:
##
@@ -135,6 +135,10 @@ class ProcessContext : public 
controller::ControllerServiceLookup, public core::
 return getProperty(property.name, value);
   }
 
+  virtual bool getProperty(bool /*supports_expression_language*/, 
std::string_view property_name, std::string& value, const 
std::shared_ptr& /*flow_file*/) {
+return getProperty(property_name, value);
+  }

Review Comment:
   Why do we need a new virtual overload? I'd prefer a single virtual function 
as an customization point, and possibly other non-static overloads to handle 
different use cases, to keep things simple. 



##
docker/test/integration/features/environment.py:
##
@@ -47,6 +47,13 @@ def before_scenario(context, scenario):
 
 logging.info("Integration test setup at 
{time:%H:%M:%S.%f}".format(time=datetime.datetime.now()))
 context.test = MiNiFi_integration_test(context=context, 
feature_id=context.feature_id)
+
+if "USE_NIFI_PYTHON_PROCESSORS" in scenario.effective_tags:
+if not context.image_store.is_conda_available_in_minifi_image() and 
context.image_store.get_minifi_image_python_version() < (3, 8, 1):
+scenario.skip("NiFi Python processor tests use langchain library 
therefore Python 3.8.1 or later is needed in the MiNiFi container.")

Review Comment:
   I'd rephrase this.
   ```suggestion
   scenario.skip("NiFi Python processor tests use langchain library 
which requires Python 3.8.1 or later.")
   ```



##
libminifi/include/core/ConfigurableComponent.h:
##
@@ -212,6 +212,8 @@ class ConfigurableComponent {
   // Dynamic properties
   std::map dynamic_properties_;
 
+  virtual Property* findProperty(const std::string& name) const;
+

Review Comment:
   Doesn't getProperty already handle this use case?



##
extensions/python/PythonProcessor.h:
##
@@ -37,7 +38,7 @@ class PythonProcessor {
 
   void setDescription(const std::string& desc);
 
-  void addProperty(const std::string& name, const std::string& description, 
const std::string& defaultvalue, bool required, bool el);
+  void addProperty(const std::string& name, const std::string& description, 
const std::optional& defaultvalue, bool required, bool el, const 
std::optional& validator_value);

Review Comment:
   What's a validator value?



##
extensions/python/tests/PythonManifestTests.cpp:
##
@@ -101,7 +101,7 @@ TEST_CASE("Python processor's description is part of the 
manifest") {
 REQUIRE(getNode(MyPyProc->children, "type").value == 
"org.apache.nifi.minifi.processors.MyPyProc");
 
 auto& rels = getNode(MyPyProc->children, 
"supportedRelationships").children;
-REQUIRE(rels.size() == 2);
+REQUIRE(rels.size() == 3);

Review Comment:
   What changed here?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] MINIFICPP-2276 Support FlowFileTransform NiFi Python processors [nifi-minifi-cpp]

2024-01-19 Thread via GitHub


lordgamez commented on PR #1712:
URL: https://github.com/apache/nifi-minifi-cpp/pull/1712#issuecomment-1900083750

   I think we should wait for the upcoming NiFi 2.0-M2 release with this PR as 
that release comes with some changes to the python processors: 
https://github.com/apache/nifi/pull/8253
   Although we do not have support for use cases in the MiNiFi C++ manifest, we 
should adjust the NiFi python processor support to not break with these added 
elements.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[PR] MINIFICPP-2276 Support FlowFileTransform NiFi Python processors [nifi-minifi-cpp]

2024-01-08 Thread via GitHub


lordgamez opened a new pull request, #1712:
URL: https://github.com/apache/nifi-minifi-cpp/pull/1712

   https://issues.apache.org/jira/browse/MINIFICPP-2276
   
   -
   Thank you for submitting a contribution to Apache NiFi - MiNiFi C++.
   
   In order to streamline the review of the contribution we ask you
   to ensure the following steps have been taken:
   
   ### For all changes:
   - [ ] Is there a JIRA ticket associated with this PR? Is it referenced
in the commit message?
   
   - [ ] Does your PR title start with MINIFICPP- where  is the JIRA 
number you are trying to resolve? Pay particular attention to the hyphen "-" 
character.
   
   - [ ] Has your PR been rebased against the latest commit within the target 
branch (typically main)?
   
   - [ ] Is your initial contribution a single, squashed commit?
   
   ### For code changes:
   - [ ] If adding new dependencies to the code, are these dependencies 
licensed in a way that is compatible for inclusion under [ASF 
2.0](http://www.apache.org/legal/resolved.html#category-a)?
   - [ ] If applicable, have you updated the LICENSE file?
   - [ ] If applicable, have you updated the NOTICE file?
   
   ### For documentation related changes:
   - [ ] Have you ensured that format looks appropriate for the output in which 
it is rendered?
   
   ### Note:
   Please ensure that once the PR is submitted, you check GitHub Actions CI 
results for build issues and submit an update to your PR as soon as possible.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@nifi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org