FrederikP opened a new pull request #3518: NIFI-6322: Introduced 
EvaluationContext to store state while making evaluator tree reusable
URL: https://github.com/apache/nifi/pull/3518
 
 
   #### Description of PR
   
   This is a follow-up PR to #3500. It enables true reuse of the evaluator 
tree once it has been created for a prepared query in NiFi's Expression 
Language. This, at least in our case, saves a lot of CPU time.
   
   This PR also includes many tests for functions with stateful evaluators. 
These were not covered before.
   
   I closed #3500 because the approach I took there was not thread safe. 
Instead of cleaning up the state in the evaluators, I have now introduced a 
context that is passed through the tree for each evaluation, which removes 
the state from the evaluators themselves. All evaluators that need state 
(mostly for performance reasons) can store it in the context. A new context 
is created for each evaluation. That should also lower the garbage collection 
impact, because we only throw away the state that needs to be thrown away, 
not the whole evaluator tree again and again.
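
To illustrate the idea, here is a minimal sketch of keeping per-evaluation state in a context instead of in the evaluator. It is not the code from this PR: `SketchContext`, `SketchEvaluator`, `MultiAttributeEvaluator` and `EvaluationContextSketch` are hypothetical names made up for this example, and the real `EvaluationContext` may look quite different.

```java
import java.util.HashMap;
import java.util.IdentityHashMap;
import java.util.Map;

// Hypothetical per-evaluation state holder (assumption, not NiFi's actual class).
final class SketchContext {
    // State is keyed by evaluator instance, so each node in the tree gets its own slot.
    private final Map<Object, Object> state = new IdentityHashMap<>();
    private final Map<String, String> attributes;

    SketchContext(Map<String, String> attributes) {
        this.attributes = attributes;
    }

    String attribute(String name) {
        return attributes.get(name);
    }

    @SuppressWarnings("unchecked")
    <T> T get(Object owner) {
        return (T) state.get(owner);
    }

    void put(Object owner, Object value) {
        state.put(owner, value);
    }
}

// Hypothetical evaluator interface: all mutable state lives in the context.
interface SketchEvaluator<T> {
    T evaluate(SketchContext context);
}

// A "stateful" evaluator that yields one attribute value per call.
// Its cursor is stored in the context, so the instance itself stays immutable.
final class MultiAttributeEvaluator implements SketchEvaluator<String> {
    private final String[] names;

    MultiAttributeEvaluator(String... names) {
        this.names = names.clone();
    }

    boolean hasMore(SketchContext context) {
        return cursor(context) < names.length;
    }

    @Override
    public String evaluate(SketchContext context) {
        int index = cursor(context);
        context.put(this, index + 1); // advance the cursor for this evaluation only
        return context.attribute(names[index]);
    }

    private int cursor(SketchContext context) {
        Integer value = context.get(this);
        return value == null ? 0 : value;
    }
}

final class EvaluationContextSketch {
    // Reuses the same evaluator instance; only the small context is created per call.
    static String evaluateAll(MultiAttributeEvaluator evaluator, Map<String, String> attributes) {
        SketchContext context = new SketchContext(attributes);
        StringBuilder result = new StringBuilder();
        while (evaluator.hasMore(context)) {
            result.append(evaluator.evaluate(context));
        }
        return result.toString();
    }
}
```

Because the evaluator holds no mutable fields in this sketch, the same instance can be evaluated repeatedly (or from several threads, each with its own context) without any reset step between evaluations.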
   
   Another related pull request is #3277, but it takes a different approach 
and only helps if no stateful evaluators are used in an expression. It also 
doesn't cover `and` + `or` even though they have state. Even if those were 
excluded from the optimization, I don't think it's the best approach, because 
many expressions (at least in our production scenario) use functions that 
have stateful evaluators.
   
   Some profiling tests we did show a performance improvement when compared 
with the current master (60b5c13ce95fd4d4a5edf0f08b81af19e71b67ee).
   
   This is the test code:
   
   ```
   @Test
   public void testPerformance() {
       final Map<String, String> attributes = new HashMap<String, String>() {{
           put("hello", "Hello");
           put("boat", "World!");
       }};
       final StandardPreparedQuery prepared = (StandardPreparedQuery) Query.prepare(
               "${allAttributes('hello', 'boat'):isEmpty():not():and(${hello:contains('o')})}");
       for (int i = 0; i < 1000000; i++) {
           assertEquals("true", prepared.evaluateExpressions(attributes, null));
       }
   }
   ```
   
   CPU Time for the evaluation loop (instrumented code):
   current master - 97.48s
   this PR - 23.78s
   
   -> roughly 75% less CPU time (about a 4x speedup)
   
   ### For all changes:
   - [x] Is there a JIRA ticket associated with this PR? Is it referenced 
        in the commit message?
   
   - [x] Does your PR title start with **NIFI-XXXX** where XXXX is the JIRA 
number you are trying to resolve? Pay particular attention to the hyphen "-" 
character.
   
   - [x] Has your PR been rebased against the latest commit within the target 
branch (typically `master`)?
   
   - [x] Is your initial contribution a single, squashed commit? _Additional 
commits in response to PR reviewer feedback should be made on this branch and 
pushed to allow change tracking. Do not `squash` or use `--force` when pushing 
to allow for clean monitoring of changes._
   
   ### For code changes:
   - [x] Have you ensured that the full suite of tests is executed via `mvn 
-Pcontrib-check clean install` at the root `nifi` folder?
   - [x] Have you written or updated unit tests to verify your changes?
   
   ### Note:
   Please ensure that once the PR is submitted, you check travis-ci for build 
issues and submit an update to your PR as soon as possible.
   

