clintropolis commented on PR #12277:
URL: https://github.com/apache/druid/pull/12277#issuecomment-1262944343
I ran a few of the `SqlNestedDataBenchmark` queries that work with string
columns, just to spot check, and things look good there too:
```
SELECT string1, SUM(long1) FROM foo GROUP BY 1 ORDER BY 2
SELECT JSON_VALUE(nested, '$.nesteder.string1'), SUM(JSON_VALUE(nested, '$.long1' RETURNING BIGINT)) FROM foo GROUP BY 1 ORDER BY 2

Benchmark                        (query)  (rowsPerSegment)  (stringEncoding)  (vectorize)  Mode  Cnt    Score   Error  Units
SqlNestedDataBenchmark.querySql        6           5000000              none        false  avgt    5  229.141 ± 4.357  ms/op
SqlNestedDataBenchmark.querySql        6           5000000              none        force  avgt    5  158.286 ± 1.982  ms/op
SqlNestedDataBenchmark.querySql        6           5000000     front-coded-4        false  avgt    5  226.019 ± 2.990  ms/op
SqlNestedDataBenchmark.querySql        6           5000000     front-coded-4        force  avgt    5  154.666 ± 0.682  ms/op
SqlNestedDataBenchmark.querySql        6           5000000    front-coded-16        false  avgt    5  218.805 ± 3.507  ms/op
SqlNestedDataBenchmark.querySql        6           5000000    front-coded-16        force  avgt    5  159.220 ± 9.396  ms/op
SqlNestedDataBenchmark.querySql        7           5000000              none        false  avgt    5  379.591 ± 6.253  ms/op
SqlNestedDataBenchmark.querySql        7           5000000              none        force  avgt    5  196.781 ± 3.562  ms/op
SqlNestedDataBenchmark.querySql        7           5000000     front-coded-4        false  avgt    5  369.041 ± 4.383  ms/op
SqlNestedDataBenchmark.querySql        7           5000000     front-coded-4        force  avgt    5  197.589 ± 3.049  ms/op
SqlNestedDataBenchmark.querySql        7           5000000    front-coded-16        false  avgt    5  379.980 ± 2.840  ms/op
SqlNestedDataBenchmark.querySql        7           5000000    front-coded-16        force  avgt    5  198.248 ± 4.503  ms/op

SELECT SUM(long1) FROM foo WHERE string1 = '10000' OR string1 = '1000'
SELECT SUM(JSON_VALUE(nested, '$.long1' RETURNING BIGINT)) FROM foo WHERE JSON_VALUE(nested, '$.nesteder.string1') = '10000' OR JSON_VALUE(nested, '$.nesteder.string1') = '1000'

Benchmark                        (query)  (rowsPerSegment)  (stringEncoding)  (vectorize)  Mode  Cnt    Score   Error  Units
SqlNestedDataBenchmark.querySql       10           5000000              none        false  avgt    5   11.487 ± 0.236  ms/op
SqlNestedDataBenchmark.querySql       10           5000000              none        force  avgt    5   11.472 ± 0.201  ms/op
SqlNestedDataBenchmark.querySql       10           5000000     front-coded-4        false  avgt    5   11.509 ± 0.198  ms/op
SqlNestedDataBenchmark.querySql       10           5000000     front-coded-4        force  avgt    5   11.510 ± 0.297  ms/op
SqlNestedDataBenchmark.querySql       10           5000000    front-coded-16        false  avgt    5   11.480 ± 0.288  ms/op
SqlNestedDataBenchmark.querySql       10           5000000    front-coded-16        force  avgt    5   11.458 ± 0.270  ms/op
SqlNestedDataBenchmark.querySql       11           5000000              none        false  avgt    5   11.650 ± 0.274  ms/op
SqlNestedDataBenchmark.querySql       11           5000000              none        force  avgt    5   11.674 ± 0.254  ms/op
SqlNestedDataBenchmark.querySql       11           5000000     front-coded-4        false  avgt    5   11.681 ± 0.312  ms/op
SqlNestedDataBenchmark.querySql       11           5000000     front-coded-4        force  avgt    5   11.672 ± 0.340  ms/op
SqlNestedDataBenchmark.querySql       11           5000000    front-coded-16        false  avgt    5   11.792 ± 0.383  ms/op
SqlNestedDataBenchmark.querySql       11           5000000    front-coded-16        force  avgt    5   11.809 ± 0.422  ms/op

SELECT long1, SUM(double3) FROM foo WHERE string1 = '10000' OR string1 = '1000' GROUP BY 1 ORDER BY 2
SELECT JSON_VALUE(nested, '$.long1' RETURNING BIGINT), SUM(JSON_VALUE(nested, '$.nesteder.double3' RETURNING DOUBLE)) FROM foo WHERE JSON_VALUE(nested, '$.nesteder.string1') = '10000' OR JSON_VALUE(nested, '$.nesteder.string1') = '1000' GROUP BY 1 ORDER BY 2

Benchmark                        (query)  (rowsPerSegment)  (stringEncoding)  (vectorize)  Mode  Cnt    Score   Error  Units
SqlNestedDataBenchmark.querySql       16           5000000              none        false  avgt    5  126.009 ± 1.829  ms/op
SqlNestedDataBenchmark.querySql       16           5000000              none        force  avgt    5  125.930 ± 2.802  ms/op
SqlNestedDataBenchmark.querySql       16           5000000     front-coded-4        false  avgt    5  125.991 ± 1.981  ms/op
SqlNestedDataBenchmark.querySql       16           5000000     front-coded-4        force  avgt    5  126.098 ± 4.202  ms/op
SqlNestedDataBenchmark.querySql       16           5000000    front-coded-16        false  avgt    5  125.795 ± 6.560  ms/op
SqlNestedDataBenchmark.querySql       16           5000000    front-coded-16        force  avgt    5  126.172 ± 3.807  ms/op
SqlNestedDataBenchmark.querySql       17           5000000              none        false  avgt    5  126.375 ± 2.382  ms/op
SqlNestedDataBenchmark.querySql       17           5000000              none        force  avgt    5  125.585 ± 0.396  ms/op
SqlNestedDataBenchmark.querySql       17           5000000     front-coded-4        false  avgt    5  125.678 ± 2.668  ms/op
SqlNestedDataBenchmark.querySql       17           5000000     front-coded-4        force  avgt    5  125.355 ± 2.104  ms/op
SqlNestedDataBenchmark.querySql       17           5000000    front-coded-16        false  avgt    5  127.011 ± 5.057  ms/op
SqlNestedDataBenchmark.querySql       17           5000000    front-coded-16        force  avgt    5  126.835 ± 3.172  ms/op
```
Segment sizes for this benchmark are ~3.4G instead of 3.6G, but it has a
similar issue to the `SqlBenchmark`: the data generator produces only
stringified numbers, so the data is limited in its ability to benefit from
this encoding.
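
To illustrate why short numeric strings leave little to save, here is a minimal sketch of bucketed front coding. It assumes a simplified layout in which each value in a bucket stores the length of the prefix it shares with the bucket's first value, plus the remaining suffix. The function names and the layout are illustrative assumptions, not the on-disk format from this PR, and reading the `4` and `16` in the `stringEncoding` parameter as bucket sizes is also an assumption.

```python
def common_prefix_len(a: bytes, b: bytes) -> int:
    """Number of leading bytes that a and b have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def front_code(sorted_values, bucket_size):
    """Encode sorted byte strings as buckets of (prefix_len, suffix) pairs.

    Assumed layout (hypothetical): the first value of each bucket is stored
    whole with prefix_len 0; every other value stores the length of the
    prefix shared with the bucket's first value and the bytes that follow.
    """
    buckets = []
    for start in range(0, len(sorted_values), bucket_size):
        chunk = sorted_values[start:start + bucket_size]
        first = chunk[0]
        bucket = [(0, first)]
        for value in chunk[1:]:
            p = common_prefix_len(first, value)
            bucket.append((p, value[p:]))
        buckets.append(bucket)
    return buckets


def decode(buckets):
    """Rebuild the original values from the encoded buckets."""
    out = []
    for bucket in buckets:
        first = bucket[0][1]
        for prefix_len, suffix in bucket:
            out.append(first[:prefix_len] + suffix)
    return out


def suffix_bytes(buckets):
    """Total bytes of stored suffixes, ignoring the prefix-length integers."""
    return sum(len(suffix) for bucket in buckets for _, suffix in bucket)


values = [b"1000", b"1001", b"1002", b"1003"]
encoded = front_code(values, 4)
# 16 raw bytes become 7 suffix bytes, but each of the 4 entries also
# carries a prefix length, so the net saving on such short values is small.
print(suffix_bytes(encoded), sum(len(v) for v in values))
```

With values this short, the per-value prefix length offsets much of what the shared prefix saves, which is one plausible reading of the modest ~3.4G vs 3.6G difference; longer strings with long common prefixes would have more to gain.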