MiNiFi - C++ Devs,
I am currently working on MINIFICPP-49, the expression language feature. While
expression compilation and evaluation are fairly self-contained, at the very
least the API to access expression evaluation will touch core components.
Here is how NiFi is currently exposing expression evaluation to processors:
...
try {
  // read the url property from the context
  final String urlstr = trimToEmpty(
      context.getProperty(PROP_URL)
          .evaluateAttributeExpressions(requestFlowFile)
          .getValue());
  final URL url = new URL(urlstr);
...
While we have the opportunity now to improve this, we have a couple of design
constraints: the expression code comes from properties, and dynamic evaluation
of it requires a flow file as input.
Because expressions are defined as processor properties, it is natural to
expose expression evaluation via the ProcessContext API. The current minifi-cpp
API to get static properties is as follows:
bool getProperty(const std::string &name, std::string &value) {
  return processor_node_->getProperty(name, value);
}
If we do not wish to introduce a Property type with its own
evaluateAttributeExpressions method, we could simply introduce another
ProcessContext method for evaluating dynamic properties:
bool evaluateProperty(const std::string &name,
                      const core::FlowFile &flow_file,
                      std::string &value) {
  ...
}
The implementation of this would compile the expression (the raw value as
returned by getProperty(...)) if it has not yet been compiled, then evaluate
the compiled expression against the provided FlowFile. The end result is an API
similar to, but simpler than, the NiFi interface. The alternative is to provide
the expression primitives to processors and allow them to manage
compilation/evaluation on their own. This would increase complexity across all
processors that support expression properties, which will likely be most of
them.
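To make the compile-on-first-use behaviour concrete, here is a rough,
self-contained sketch. It is not the MINIFICPP-49 implementation: FlowFile,
Expression and compile() below are hypothetical stand-ins, and the toy compiler
only understands literal text and ${attribute} references. The point is only
the shape of evaluateProperty(): look up the raw value, compile it once, cache
the result, then evaluate it against the flow file.

```cpp
#include <cstddef>
#include <functional>
#include <map>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for core::FlowFile: just a bag of attributes.
struct FlowFile {
  std::map<std::string, std::string> attributes;
};

// Hypothetical compiled form: a callable evaluated against a flow file.
using Expression = std::function<std::string(const FlowFile &)>;

// Toy compiler (an assumption for illustration): splits the source into
// literal text and ${attribute} references once, up front.
Expression compile(const std::string &source) {
  struct Segment {
    bool is_attribute;
    std::string text;
  };
  std::vector<Segment> segments;
  std::size_t pos = 0;
  while (pos < source.size()) {
    const std::size_t open = source.find("${", pos);
    const std::size_t close =
        open == std::string::npos ? open : source.find('}', open + 2);
    if (close == std::string::npos) {
      // No complete ${...} left: the remainder is literal text.
      segments.push_back(Segment{false, source.substr(pos)});
      break;
    }
    if (open > pos) {
      segments.push_back(Segment{false, source.substr(pos, open - pos)});
    }
    segments.push_back(Segment{true, source.substr(open + 2, close - open - 2)});
    pos = close + 1;
  }
  return [segments](const FlowFile &flow_file) {
    std::string out;
    for (const auto &segment : segments) {
      if (!segment.is_attribute) {
        out += segment.text;
        continue;
      }
      // Missing attributes evaluate to the empty string in this sketch.
      const auto it = flow_file.attributes.find(segment.text);
      if (it != flow_file.attributes.end()) out += it->second;
    }
    return out;
  };
}

// Simplified ProcessContext holding properties directly (the real one
// delegates to processor_node_).
class ProcessContext {
 public:
  void setProperty(const std::string &name, const std::string &value) {
    properties_[name] = value;
    compiled_.erase(name);  // drop any stale compiled expression
  }

  bool getProperty(const std::string &name, std::string &value) const {
    const auto it = properties_.find(name);
    if (it == properties_.end()) return false;
    value = it->second;
    return true;
  }

  bool evaluateProperty(const std::string &name, const FlowFile &flow_file,
                        std::string &value) {
    auto cached = compiled_.find(name);
    if (cached == compiled_.end()) {
      std::string raw;
      if (!getProperty(name, raw)) return false;
      // Compile the raw property value on first use and cache it.
      cached = compiled_.emplace(name, compile(raw)).first;
    }
    value = cached->second(flow_file);
    return true;
  }

 private:
  std::unordered_map<std::string, std::string> properties_;
  std::unordered_map<std::string, Expression> compiled_;
};
```

With this shape, a processor only ever calls evaluateProperty(name, flow_file,
value) and never sees the compiled expression. The cache here is not
thread-safe; a real implementation would need to decide how to handle
concurrent onTrigger calls.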
The next important question which impacts core minifi is whether or not
expression language should be an extension. Whether or not it is an extension,
some kind of standard interface to expressions will need to be made available
to all processors. Here are the pros/cons of putting it in an extension, as far
as I can tell:
Pros:
- Reduces the compiled size of minifi somewhat when the feature is disabled
(the lexer/parser is currently ~4300 lines of C++ with no additional library or
runtime dependencies)
- Allows for alternate expression language implementations in the future
Cons:
- Additional complexity from needing to add Expression primitives, a standard
Expression compiler API, dynamic object loading, and an empty (NoOp)
implementation for when the extension is not included
- Additional vtable lookups on an operation which will be invoked very
frequently (every lookup of an expression-enabled property, for every flow file)
- Makes it harder for gcc/clang/etc. to inline/optimize expression language
functions
- Core processors (e.g. GetFile/PutFile, where expression language will almost
certainly be desired for file paths and other properties) will depend on an
optional extension
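For reference, here is roughly what the extension route would require core to
carry. This is only a sketch under my own assumptions: the names
CompiledExpression, ExpressionCompiler, NoOpCompiler and selectCompiler are
hypothetical, FlowFile is again a stand-in for core::FlowFile, and dynamic
object loading is left out entirely. It shows the abstract compiler API, the
NoOp fallback that returns the raw property value unchanged, and the virtual
calls that every evaluation would go through.

```cpp
#include <map>
#include <memory>
#include <string>

// Hypothetical stand-in for core::FlowFile.
struct FlowFile {
  std::map<std::string, std::string> attributes;
};

// Hypothetical expression primitive exposed by core.
class CompiledExpression {
 public:
  virtual ~CompiledExpression() = default;
  // Virtual call on the hot path: once per property per flow file.
  virtual std::string evaluate(const FlowFile &flow_file) const = 0;
};

// Hypothetical standard compiler API that an extension would implement.
class ExpressionCompiler {
 public:
  virtual ~ExpressionCompiler() = default;
  virtual std::unique_ptr<CompiledExpression> compile(
      const std::string &source) const = 0;
};

// NoOp fallback: "evaluates" to the raw property text, ignoring the flow file.
class NoOpExpression : public CompiledExpression {
 public:
  explicit NoOpExpression(std::string raw) : raw_(std::move(raw)) {}
  std::string evaluate(const FlowFile &) const override { return raw_; }

 private:
  std::string raw_;
};

class NoOpCompiler : public ExpressionCompiler {
 public:
  std::unique_ptr<CompiledExpression> compile(
      const std::string &source) const override {
    return std::unique_ptr<CompiledExpression>(new NoOpExpression(source));
  }
};

// Returns the compiler provided by a loaded extension, or the NoOp compiler
// when no extension is present (loaded == nullptr).
std::shared_ptr<const ExpressionCompiler> selectCompiler(
    std::shared_ptr<const ExpressionCompiler> loaded) {
  if (loaded) return loaded;
  return std::make_shared<NoOpCompiler>();
}
```

Note that with the NoOp compiler a path such as /tmp/${filename} comes back
verbatim, which is the situation core processors like PutFile would be in
whenever the extension is absent.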
I would like to hear feedback from the dev community on these two important
topics (the interface to the expression language and whether or not the
implementation should be an extension) before writing the code that touches
core components. The API question is ultimately more important because it
affects all current and future processor authors. The decision of whether it is
an extension or not is easier to reverse later.
Regards,
Andy I.C.