MiNiFi - C++ Devs,

I am currently working on MINIFICPP-49, the expression language feature. While 
expression compilation and evaluation are fairly self-contained, at the very 
least the API to access expression evaluation will touch core components.

Here is how NiFi is currently exposing expression evaluation to processors:

    ...
    try {
        // read the url property from the context
        final String urlstr = trimToEmpty(
            context.getProperty(PROP_URL).evaluateAttributeExpressions(requestFlowFile).getValue());
        final URL url = new URL(urlstr);
    ...

While we have the opportunity now to improve this, we have a couple of design 
constraints: the expression code comes from properties, and dynamic evaluation 
of it requires a flow file as input.

Because expressions are defined as processor properties, it is natural to 
expose expression evaluation via the ProcessContext API. The current minifi-cpp 
API to get static properties is as follows:

    bool getProperty(const std::string &name, std::string &value) {
      return processor_node_->getProperty(name, value);
    }

If we do not wish to introduce a Property type with its own 
evaluateAttributeExpressions method, we could simply introduce another 
ProcessContext method for evaluating dynamic properties:

    bool evaluateProperty(const std::string &name,
                          const core::FlowFile &flow_file,
                          std::string &value) {
      ...
    }

The implementation of this would compile the expression (the raw value as 
returned by getProperty(...)) if it has not yet been compiled, then evaluate 
the compiled expression against the provided FlowFile. The end result is an API 
similar to, but simpler than, the NiFi interface. The alternative is to provide 
the expression primitives to processors and allow them to manage 
compilation/evaluation on their own. This would increase complexity across all 
processors which support expression properties, which will likely be most 
processors.
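
A minimal sketch of the compile-on-first-use approach described above. 
Everything here is an assumption for illustration: FlowFile is a stand-in for 
core::FlowFile, Expression and compile() are hypothetical, the toy compiler 
only understands a bare "${attribute}" reference, and the cache is not 
thread-safe. It is not the MINIFICPP-49 implementation.

```cpp
#include <functional>
#include <map>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for core::FlowFile: just a bag of attributes.
struct FlowFile {
  std::map<std::string, std::string> attributes;
};

// Hypothetical compiled expression: a callable evaluated against a flow file.
using Expression = std::function<std::string(const FlowFile &)>;

// Toy compiler (assumption): "${name}" becomes an attribute lookup,
// anything else is treated as a literal value.
Expression compile(const std::string &raw) {
  if (raw.size() > 3 && raw.compare(0, 2, "${") == 0 && raw.back() == '}') {
    const std::string attr = raw.substr(2, raw.size() - 3);
    return [attr](const FlowFile &ff) -> std::string {
      auto it = ff.attributes.find(attr);
      return it == ff.attributes.end() ? std::string() : it->second;
    };
  }
  return [raw](const FlowFile &) -> std::string { return raw; };
}

// Simplified ProcessContext holding properties directly (the real one
// delegates to processor_node_).
class ProcessContext {
 public:
  void setProperty(const std::string &name, const std::string &value) {
    properties_[name] = value;
    compiled_.erase(name);  // drop any stale compiled expression
  }

  bool getProperty(const std::string &name, std::string &value) const {
    auto it = properties_.find(name);
    if (it == properties_.end()) return false;
    value = it->second;
    return true;
  }

  // Compiles the raw property value on first use, caches the result,
  // then evaluates it against the given flow file.
  bool evaluateProperty(const std::string &name, const FlowFile &flow_file,
                        std::string &value) {
    auto cached = compiled_.find(name);
    if (cached == compiled_.end()) {
      std::string raw;
      if (!getProperty(name, raw)) return false;
      cached = compiled_.emplace(name, compile(raw)).first;
    }
    value = cached->second(flow_file);
    return true;
  }

 private:
  std::unordered_map<std::string, std::string> properties_;
  std::unordered_map<std::string, Expression> compiled_;
};
```

With this shape, a processor calls evaluateProperty(...) exactly as it would 
call getProperty(...), passing the flow file as one extra argument, and never 
handles a compiled expression itself.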

The next important question which impacts core minifi is whether or not 
expression language should be an extension. Whether or not it is an extension, 
some kind of standard interface to expressions will need to be made available 
to all processors. Here are the pros/cons of putting it in an extension, as far 
as I can tell:

Pros:

- Reduces the compiled size of minifi somewhat when the feature is disabled 
(the lexer/parser is currently ~4300 lines of C++ with no additional library 
or runtime dependencies)
- Allow for alternate expression language implementations in the future

Cons:

- Additional complexity by needing to add Expression primitives, a standard 
Expression compiler API, dynamic object loading, and an empty (NoOp) 
implementation if the extension is not included
- Additional vtable lookups on an operation which will be invoked very 
frequently (every lookup of an expression-enabled property, for every flow 
file processed)
- Makes it harder for gcc/clang/etc. to inline/optimize expression language 
functions
- Core processors (e.g. GetFile/PutFile, where expression language will almost 
certainly be desired for file paths and other properties) will depend on an 
optional extension
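
To make the first two cons concrete, here is a rough sketch of what the 
extension route might require in core. All names (Expression, 
ExpressionCompiler, the NoOp classes, makeCompiler) are hypothetical, FlowFile 
is a stand-in for core::FlowFile, and dynamic object loading is left out 
entirely. The NoOp fallback simply returns the raw property text unevaluated.

```cpp
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for core::FlowFile.
struct FlowFile {
  std::map<std::string, std::string> attributes;
};

// Expression primitive that core would have to define. Each evaluate()
// call goes through a vtable lookup.
class Expression {
 public:
  virtual ~Expression() = default;
  virtual std::string evaluate(const FlowFile &flow_file) const = 0;
};

// Standard compiler API an extension would implement.
class ExpressionCompiler {
 public:
  virtual ~ExpressionCompiler() = default;
  virtual std::unique_ptr<Expression> compile(const std::string &raw) const = 0;
};

// Fallback used when no expression language extension is present:
// the "expression" evaluates to its own raw text.
class NoOpExpression : public Expression {
 public:
  explicit NoOpExpression(std::string raw) : raw_(std::move(raw)) {}
  std::string evaluate(const FlowFile &) const override { return raw_; }

 private:
  std::string raw_;
};

class NoOpExpressionCompiler : public ExpressionCompiler {
 public:
  std::unique_ptr<Expression> compile(const std::string &raw) const override {
    return std::unique_ptr<Expression>(new NoOpExpression(raw));
  }
};

// Returns the extension's compiler if one was loaded, else the NoOp one.
std::unique_ptr<ExpressionCompiler> makeCompiler(
    std::unique_ptr<ExpressionCompiler> from_extension) {
  if (from_extension) return from_extension;
  return std::unique_ptr<ExpressionCompiler>(new NoOpExpressionCompiler());
}
```

Under this sketch, a processor such as PutFile would silently receive the 
literal text "${filename}" when the extension is absent, which illustrates 
the last con as well.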

I would like to hear feedback from the dev community on these two important 
topics (the interface to the expression language and whether or not the 
implementation should be an extension) before writing the code that touches 
core components. The API question is ultimately more important because it 
affects all current and future processor authors. The decision of whether it 
is an extension or not is easier to reverse later.

Regards,

Andy I.C.
