gianm opened a new pull request, #16338: URL: https://github.com/apache/druid/pull/16338
This patch adds a way for columns to provide GroupByVectorColumnSelectors, which control how the groupBy engine operates on them. This mechanism is used by ExpressionVirtualColumn to provide an ExpressionDeferredGroupByVectorColumnSelector that uses the inputs of an expression as the grouping key. The actual expression evaluation is deferred until the grouped ResultRow is created.

A new context parameter "deferExpressionDimensions" allows users to control when this deferred selector is used. The default is "fixedWidthNonNumeric", which is a change from the prior behavior. Users can get the prior behavior by setting this to "singleString".

Benchmarks of a few selected queries from `SqlExpressionBenchmark`:

```
Benchmark                        (deferExpressionDimensions)  (query)  (rowsPerSegment)  (schema)  (vectorize)  Mode  Cnt     Score     Error  Units
SqlExpressionBenchmark.querySql                 singleString       22           5000000      auto        force  avgt    5   260.078 ±  14.858  ms/op
SqlExpressionBenchmark.querySql                   fixedWidth       22           5000000      auto        force  avgt    5  1970.522 ±  58.400  ms/op
SqlExpressionBenchmark.querySql         fixedWidthNonNumeric       22           5000000      auto        force  avgt    5   263.535 ±   5.549  ms/op
SqlExpressionBenchmark.querySql                       always       22           5000000      auto        force  avgt    5  2021.229 ± 125.010  ms/op
SqlExpressionBenchmark.querySql                 singleString       24           5000000      auto        force  avgt    5   624.300 ±  36.616  ms/op
SqlExpressionBenchmark.querySql                   fixedWidth       24           5000000      auto        force  avgt    5   889.836 ±  31.123  ms/op
SqlExpressionBenchmark.querySql         fixedWidthNonNumeric       24           5000000      auto        force  avgt    5   646.920 ±  24.566  ms/op
SqlExpressionBenchmark.querySql                       always       24           5000000      auto        force  avgt    5   890.384 ±  53.748  ms/op
SqlExpressionBenchmark.querySql                 singleString       26           5000000      auto        force  avgt    5   824.417 ±  21.941  ms/op
SqlExpressionBenchmark.querySql                   fixedWidth       26           5000000      auto        force  avgt    5   244.232 ±  15.514  ms/op
SqlExpressionBenchmark.querySql         fixedWidthNonNumeric       26           5000000      auto        force  avgt    5   244.598 ±  14.268  ms/op
SqlExpressionBenchmark.querySql                       always       26           5000000      auto        force  avgt    5   248.505 ±   8.004  ms/op
SqlExpressionBenchmark.querySql                 singleString       30           5000000      auto        force  avgt    5   223.687 ±   9.362  ms/op
SqlExpressionBenchmark.querySql                   fixedWidth       30           5000000      auto        force  avgt    5   562.844 ±  42.288  ms/op
SqlExpressionBenchmark.querySql         fixedWidthNonNumeric       30           5000000      auto        force  avgt    5   227.850 ±   3.374  ms/op
SqlExpressionBenchmark.querySql                       always       30           5000000      auto        force  avgt    5   562.631 ±  69.408  ms/op
SqlExpressionBenchmark.querySql                 singleString       31           5000000      auto        force  avgt    5   324.208 ±   9.420  ms/op
SqlExpressionBenchmark.querySql                   fixedWidth       31           5000000      auto        force  avgt    5  1271.630 ±  87.264  ms/op
SqlExpressionBenchmark.querySql         fixedWidthNonNumeric       31           5000000      auto        force  avgt    5   323.169 ±   6.383  ms/op
SqlExpressionBenchmark.querySql                       always       31           5000000      auto        force  avgt    5  1185.118 ±  34.146  ms/op
```
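
For readers unfamiliar with the idea, the difference between grouping on an expression's output and grouping on its inputs can be sketched in a few lines of plain Java. This is an illustration only, not the code in this patch: `DeferredGroupingSketch`, `groupEager`, and `groupDeferred` are hypothetical names, rows are modeled as `long[]` arrays, and the final merge of groups whose inputs evaluate to the same output is an assumption made here so that both strategies return identical results.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical illustration; none of these names are Druid classes.
final class DeferredGroupingSketch {

    // Eager strategy: evaluate the expression for every row and group by its output.
    static Map<String, Long> groupEager(List<long[]> rows, Function<long[], String> expr) {
        Map<String, Long> counts = new LinkedHashMap<>();
        for (long[] row : rows) {
            counts.merge(expr.apply(row), 1L, Long::sum);
        }
        return counts;
    }

    // Deferred strategy: group by the raw (fixed-width) inputs, then evaluate
    // the expression once per group when the result rows are built.
    static Map<String, Long> groupDeferred(List<long[]> rows, Function<long[], String> expr) {
        Map<List<Long>, Long> byInputs = new LinkedHashMap<>();
        for (long[] row : rows) {
            List<Long> key = new ArrayList<>(row.length);
            for (long v : row) {
                key.add(v);
            }
            byInputs.merge(key, 1L, Long::sum);
        }

        Map<String, Long> results = new LinkedHashMap<>();
        for (Map.Entry<List<Long>, Long> e : byInputs.entrySet()) {
            long[] inputs = e.getKey().stream().mapToLong(Long::longValue).toArray();
            // Assumption of this sketch: distinct input groups that evaluate to
            // the same output are merged here.
            results.merge(expr.apply(inputs), e.getValue(), Long::sum);
        }
        return results;
    }

    public static void main(String[] args) {
        List<long[]> rows = List.of(
            new long[]{1, 7}, new long[]{3, 7}, new long[]{12, 7},
            new long[]{1, 7}, new long[]{15, 8});
        // A made-up expression over two long inputs that produces a string.
        Function<long[], String> expr = r -> "bucket-" + (r[0] / 10) + "-" + r[1];

        System.out.println("eager:    " + groupEager(rows, expr));
        System.out.println("deferred: " + groupDeferred(rows, expr));
    }
}
```

In this sketch the deferred path evaluates the expression once per distinct input combination rather than once per row, which only pays off when the inputs are cheap to group on and there are fewer input groups than rows. The benchmark numbers in this PR show the trade-off going both ways, which is presumably why the behavior is configurable.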
