[jira] [Updated] (NIFI-6322) Evaluator Objects are rebuilt on every call even when a CompiledExpression is used

Mark Payne (JIRA) Thu, 13 Jun 2019 08:41:41 -0700


     [ 
https://issues.apache.org/jira/browse/NIFI-6322?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Mark Payne updated NIFI-6322:
-----------------------------
    Fix Version/s: 1.10.0
           Status: Patch Available  (was: Open)

> Evaluator Objects are rebuilt on every call even when a CompiledExpression is 
> used
> ----------------------------------------------------------------------------------
>
>                 Key: NIFI-6322
>                 URL: https://issues.apache.org/jira/browse/NIFI-6322
>             Project: Apache NiFi
>          Issue Type: Bug
>          Components: Core Framework
>    Affects Versions: 1.9.2
>            Reporter: Frederik Petersen
>            Priority: Major
>              Labels: expression-language, performance
>             Fix For: 1.10.0
>
>         Attachments: Selection_094.png, image.png
>
>          Time Spent: 6h 10m
>  Remaining Estimate: 0h
>
> Hi,
> While doing some CPU sampling in our production environment, we encountered 
> some strange results. It seems that, during the evaluation of NiFi 
> expressions, the modification of a _HashSet_ is the most expensive operation 
> in the process.
> !Selection_094.png!
> This seems unrealistic considering all the other processing involved in 
> evaluating NiFi expressions. 
>  After reviewing the code and doing some profiling, it looks like this 
> _HashSet_ modification is performed far more often than required: it is 
> done on every evaluation.
> !image.png!
>  This profiling output was produced with the following unit test:
> {code:java}
> @Test
> public void testSimple() {
>     final TestRunner runner = TestRunners.newTestRunner(new RouteOnAttribute());
>     runner.setProperty(RouteOnAttribute.ROUTE_STRATEGY, RouteOnAttribute.ROUTE_ANY_MATCHES.getValue());
>     runner.setProperty("filter", "${literal('b'):equals(${a})}");
>     // Enqueue 500 empty FlowFiles, each carrying the attribute a=b.
>     for (int i = 0; i < 500; i++) {
>         runner.enqueue(new byte[0], new HashMap<String, String>() {{
>             put("a", "b");
>         }});
>     }
>     runner.run(500);
> }{code}
> The key question is: Why are the _Evaluator_ objects (and everything 
> related to them) built twice:
>  - Once in _ExpressionCompiler.compile()_
>  - Once again in _CompiledExpression.evaluate()_
> In other words: Every call to _CompiledExpression.evaluate()_ leads to a new 
> _ExpressionCompiler_ being created and expensive calls being made. Why not 
> just reuse _Evaluator_ objects created beforehand that are stored in the 
> _CompiledExpression_?
> Is there a specific design decision behind that? It looks like there is room 
> for performance improvement, especially for heavily used processors.
> On our live system, where we perform expensive tasks such as language 
> detection and mail parsing, expression language evaluation nevertheless 
> ends up consuming the most CPU time.
> Thank you very much for looking into this.
>  
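
The reuse suggested in the description can be illustrated with a minimal, self-contained sketch. It does not use NiFi's real classes: `Evaluator`, `Expression`, `RebuildingExpression`, `CachingExpression`, and `buildEvaluator` are hypothetical stand-ins, and the sketch assumes the evaluator holds no per-call state, so sharing it across evaluations is safe. It only counts how often the (assumed expensive) build step runs under each approach.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

public class EvaluatorReuseSketch {

    /** Hypothetical stand-in for an evaluator tree: maps attributes to a result. */
    interface Evaluator extends Function<Map<String, String>, String> { }

    /** Hypothetical stand-in for a compiled expression. */
    interface Expression {
        String evaluate(Map<String, String> attributes);
    }

    /** Counts how often the build step runs. */
    static final AtomicInteger BUILD_COUNT = new AtomicInteger();

    /** Stand-in for the expensive step of building the evaluator objects. */
    static Evaluator buildEvaluator(String attributeName, String expected) {
        BUILD_COUNT.incrementAndGet();
        return attrs -> String.valueOf(expected.equals(attrs.get(attributeName)));
    }

    /** Builds a fresh evaluator on every call, as the report describes. */
    static final class RebuildingExpression implements Expression {
        private final String attributeName;
        private final String expected;

        RebuildingExpression(String attributeName, String expected) {
            this.attributeName = attributeName;
            this.expected = expected;
        }

        @Override
        public String evaluate(Map<String, String> attributes) {
            return buildEvaluator(attributeName, expected).apply(attributes);
        }
    }

    /** Builds the evaluator once and reuses it for every evaluation. */
    static final class CachingExpression implements Expression {
        private final Evaluator evaluator;

        CachingExpression(String attributeName, String expected) {
            this.evaluator = buildEvaluator(attributeName, expected);
        }

        @Override
        public String evaluate(Map<String, String> attributes) {
            return evaluator.apply(attributes);
        }
    }

    /** Runs the given number of evaluations and returns how many builds occurred. */
    public static int countBuilds(boolean reuse, int evaluations) {
        BUILD_COUNT.set(0);
        Map<String, String> attributes = new HashMap<>();
        attributes.put("a", "b");
        Expression expression = reuse
                ? new CachingExpression("a", "b")
                : new RebuildingExpression("a", "b");
        for (int i = 0; i < evaluations; i++) {
            expression.evaluate(attributes);
        }
        return BUILD_COUNT.get();
    }

    public static void main(String[] args) {
        System.out.println("rebuild on every call: " + countBuilds(false, 500) + " builds");
        System.out.println("build once and reuse:  " + countBuilds(true, 500) + " builds");
    }
}
```

With 500 evaluations, mirroring the unit test in the description, the rebuilding variant performs 500 builds and the caching variant performs one. Whether the real evaluators can be shared this way depends on whether they keep state between calls, which this sketch does not model.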



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
