kgyrtkirk opened a new pull request, #15552:
URL: https://github.com/apache/druid/pull/15552
I was looking into a query that performed rather poorly because its
`case_searched` expression touched more than one column (when only one column
is involved, a cache-based evaluator is used).
While doing that, I noticed a few simple things that could help a bit:
* use a static `TRUE`/`FALSE` instead of creating a new object every time
* create the `ExprEval` early for `ConstantExpr`s (except the one for
`BigInteger`, which seems to have an odd contract)
* return early from type autodetection
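
A minimal sketch of the first two ideas, to make the allocation savings concrete. The class names `EvalResult` and `ConstantExpression` are hypothetical stand-ins for illustration, not the actual `ExprEval` / `ConstantExpr` code touched by this PR; encoding booleans as `1L` / `0L` is likewise an assumption of the sketch.

```java
// Illustrative only: EvalResult and ConstantExpression are hypothetical
// stand-ins, not Druid's actual ExprEval / ConstantExpr classes.
class ConstantEvalSketch {

    /** Immutable evaluation result, so instances can be shared safely. */
    static final class EvalResult {
        // One shared instance per boolean value instead of a new object per row.
        // Encoding booleans as 1L / 0L is an assumption of this sketch.
        static final EvalResult TRUE = new EvalResult(1L);
        static final EvalResult FALSE = new EvalResult(0L);

        private final Object value;

        EvalResult(Object value) {
            this.value = value;
        }

        static EvalResult ofBoolean(boolean b) {
            return b ? TRUE : FALSE;
        }

        Object value() {
            return value;
        }
    }

    /** Constant expression that builds its result once, at construction time. */
    static final class ConstantExpression {
        private final EvalResult cached;

        ConstantExpression(Object literal) {
            this.cached = new EvalResult(literal);
        }

        EvalResult eval() {
            // Every call returns the same object: no per-row allocation.
            return cached;
        }
    }

    public static void main(String[] args) {
        ConstantExpression eleven = new ConstantExpression("11");
        System.out.println(eleven.eval() == eleven.eval());
        System.out.println(EvalResult.ofBoolean(3 > 2) == EvalResult.TRUE);
    }
}
```

Both lines print `true`: repeated evaluation hands back the same instance, which is the property that keeps garbage down when an expression is evaluated once per row.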
These changes mostly reduce the amount of garbage the query creates during
`case_searched` evaluation. `ExpressionSelectorBenchmark` shows improvements of
about `15%`, but my manual trials on the taxi dataset with 60M rows showed
larger gains, probably because these changes mainly reduce GC pressure.
<details>
<summary>q1.3</summary>
```sql
SELECT
case when
LOOKUP("dropoff", 'look') = 'x' or
"max_temperature" = '125711' or min_temperature = '11' or min_temperature
= '112' or min_temperature = '1212' or min_temperature = '32' then
min_temperature end,
sum("total_amount"),
count(1)
FROM "trips_xaa"
group by 1
```
Execution time improved from `6.69s` to `4.08s`.
</details>
<details>
<summary>ExpressionSelectorBenchmark results</summary>
```diff
 Benchmark                                                  (rowsPerSegment)  Mode  Cnt    Score    Error  Units
-ExpressionSelectorBenchmark.arithmeticOnLong                        1000000  avgt   10    8.371 ±  0.100  ms/op
+ExpressionSelectorBenchmark.arithmeticOnLong                        1000000  avgt   10    8.254 ±  0.041  ms/op
-ExpressionSelectorBenchmark.caseSearched1                           1000000  avgt   10    9.612 ±  0.581  ms/op
+ExpressionSelectorBenchmark.caseSearched1                           1000000  avgt   10    9.197 ±  0.033  ms/op
-ExpressionSelectorBenchmark.caseSearched2                           1000000  avgt   10  275.416 ± 15.150  ms/op
+ExpressionSelectorBenchmark.caseSearched2                           1000000  avgt   10  233.649 ±  4.713  ms/op
-ExpressionSelectorBenchmark.caseSearchedWithLookup                  1000000  avgt   10  125.643 ±  6.343  ms/op
+ExpressionSelectorBenchmark.caseSearchedWithLookup                  1000000  avgt   10  104.813 ±  1.242  ms/op
-ExpressionSelectorBenchmark.caseSearchedWithLookup2                 1000000  avgt   10  132.319 ± 10.260  ms/op
+ExpressionSelectorBenchmark.caseSearchedWithLookup2                 1000000  avgt   10  116.626 ±  2.644  ms/op
-ExpressionSelectorBenchmark.stringConcatAndCompareOnLong            1000000  avgt   10    8.355 ±  0.078  ms/op
+ExpressionSelectorBenchmark.stringConcatAndCompareOnLong            1000000  avgt   10    8.429 ±  0.119  ms/op
-ExpressionSelectorBenchmark.strlenUsingExpressionAsLong             1000000  avgt   10    9.236 ±  0.205  ms/op
+ExpressionSelectorBenchmark.strlenUsingExpressionAsLong             1000000  avgt   10    9.110 ±  0.084  ms/op
-ExpressionSelectorBenchmark.strlenUsingExpressionAsString           1000000  avgt   10    6.457 ±  0.129  ms/op
+ExpressionSelectorBenchmark.strlenUsingExpressionAsString           1000000  avgt   10    6.389 ±  0.077  ms/op
-ExpressionSelectorBenchmark.strlenUsingExtractionFn                 1000000  avgt   10    3.499 ±  0.126  ms/op
+ExpressionSelectorBenchmark.strlenUsingExtractionFn                 1000000  avgt   10    3.466 ±  0.041  ms/op
-ExpressionSelectorBenchmark.timeFloorUsingCursor                    1000000  avgt   10    8.149 ±  0.035  ms/op
+ExpressionSelectorBenchmark.timeFloorUsingCursor                    1000000  avgt   10    8.117 ±  0.052  ms/op
-ExpressionSelectorBenchmark.timeFloorUsingExpression                1000000  avgt   10    9.051 ±  0.106  ms/op
+ExpressionSelectorBenchmark.timeFloorUsingExpression                1000000  avgt   10    8.928 ±  0.058  ms/op
-ExpressionSelectorBenchmark.timeFloorUsingExtractionFn              1000000  avgt   10    7.765 ±  0.167  ms/op
+ExpressionSelectorBenchmark.timeFloorUsingExtractionFn              1000000  avgt   10    7.713 ±  0.125  ms/op
-ExpressionSelectorBenchmark.timeFormatUsingExpression               1000000  avgt   10   10.423 ±  0.158  ms/op
+ExpressionSelectorBenchmark.timeFormatUsingExpression               1000000  avgt   10    9.779 ±  0.167  ms/op
-ExpressionSelectorBenchmark.timeFormatUsingExtractionFn             1000000  avgt   10    7.752 ±  0.134  ms/op
+ExpressionSelectorBenchmark.timeFormatUsingExtractionFn             1000000  avgt   10    7.773 ±  0.103  ms/op
```
</details>
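
For reference, the relative gains can be derived from the raw numbers above. This is a small sketch; the `BenchmarkDelta` class and its method names are made up for illustration and are not part of the PR. With the scores listed, it works out to roughly 15.2% for `caseSearched2`, 16.6% for `caseSearchedWithLookup`, 11.9% for `caseSearchedWithLookup2`, and 39.0% for the manual `q1.3` run.

```java
import java.util.Locale;

// Hypothetical helper for illustration; not part of the PR.
class BenchmarkDelta {

    /** Relative reduction in percent; positive means the new time is lower. */
    static double reductionPercent(double before, double after) {
        return (before - after) / before * 100.0;
    }

    static void report(String name, double before, double after) {
        // Locale.ROOT keeps the decimal separator a dot on every machine.
        System.out.println(String.format(
            Locale.ROOT, "%-26s %.1f%%", name, reductionPercent(before, after)));
    }

    public static void main(String[] args) {
        // Average times in ms/op, copied from the benchmark listing.
        report("caseSearched2", 275.416, 233.649);
        report("caseSearchedWithLookup", 125.643, 104.813);
        report("caseSearchedWithLookup2", 132.319, 116.626);
        // Wall-clock seconds for the manual q1.3 run.
        report("q1.3 (manual)", 6.69, 4.08);
    }
}
```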
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]