jihoonson opened a new pull request #9487: Convert array_contains() and array_overlaps() into native filters if possible
URL: https://github.com/apache/druid/pull/9487

### Description

The native filter is way faster than the expression filter. Here is an `ExpressionFilterBenchmark` result.

```
Benchmark                                   (rowsPerSegment)  Mode  Cnt    Score   Error  Units
ExpressionFilterBenchmark.expressionFilter           1000000  avgt   30  255.291 ± 1.034  ms/op
ExpressionFilterBenchmark.nativeFilter               1000000  avgt   30    1.868 ± 0.005  ms/op
```

This PR adds an optimization that transforms `array_contains()` and `array_overlaps()` into native filters if possible. For now, the optimization is applied only when their parameters are a simple extraction and a literal. This is because the facility that traverses an `Expr` tree and converts it to a tree of native filters is missing. I think we could add an optimization layer on the native query, or between the SQL layer and the native query layer, but that is not in the scope of this PR.

I also documented the behavior of the `IN` filter on multi-valued dimensions.

<hr>

This PR has:
- [x] been self-reviewed.
- [x] added documentation for new or modified features or behaviors.
- [x] added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
- [x] added unit tests or modified existing tests to cover new code paths.

<hr>

##### Key changed/added classes in this PR
 * `ArrayOverlapOperatorConversion`
 * `ArrayContainsOperatorConversion`
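
For readers unfamiliar with the idea, here is a minimal, language-neutral sketch in Python of the kind of rewrite described above. It is not Druid's Java code: every class and function name (`Column`, `Literal`, `Call`, `Selector`, `In`, `And`, `to_native_filter`, `matches`) is hypothetical. It also assumes that "contains" maps to an AND of per-value equality filters and "overlaps" maps to a single IN filter, with both relying on a multi-valued row matching when any of its values matches. Anything other than a plain column plus a non-empty literal returns `None`, meaning "keep using the expression filter".

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple, Union


# Hypothetical expression-side operands.
@dataclass(frozen=True)
class Column:
    name: str


@dataclass(frozen=True)
class Literal:
    values: Tuple[str, ...]


@dataclass(frozen=True)
class Call:
    """Any operand that is not a plain column or literal."""
    function: str


# Hypothetical native filters.
@dataclass(frozen=True)
class Selector:
    dimension: str
    value: str


@dataclass(frozen=True)
class In:
    dimension: str
    values: frozenset


@dataclass(frozen=True)
class And:
    fields: tuple


Filter = Union[Selector, In, And]


def to_native_filter(function: str, left, right) -> Optional[Filter]:
    """Return a native filter, or None to fall back to the expression filter."""
    simple = isinstance(left, Column) and isinstance(right, Literal)
    if not simple or not right.values:
        return None
    if function == "array_contains":
        # Assumption: the row must hold every literal value.
        selectors = tuple(Selector(left.name, v) for v in right.values)
        return selectors[0] if len(selectors) == 1 else And(selectors)
    if function == "array_overlaps":
        # Assumption: the row must hold at least one literal value.
        return In(left.name, frozenset(right.values))
    return None


def matches(f: Filter, row: Dict[str, List[str]]) -> bool:
    """Evaluate a filter against a row of multi-valued dimensions."""
    if isinstance(f, Selector):
        return f.value in row.get(f.dimension, [])
    if isinstance(f, In):
        return any(v in f.values for v in row.get(f.dimension, []))
    if isinstance(f, And):
        return all(matches(child, row) for child in f.fields)
    raise TypeError(f"unsupported filter: {f!r}")
```

In this sketch, `to_native_filter("array_contains", Column("tags"), Literal(("a", "b")))` yields an `And` of two `Selector` filters, while passing a `Call` as the first operand yields `None`, mirroring the "simple extraction and a literal" restriction in the description.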
