allenma commented on PR #5939:
URL:
https://github.com/apache/arrow-datafusion/pull/5939#issuecomment-1502685853
@Dandandan @ozankabak, I ran the benchmark with the new implementation, and
there is actually little performance regression:
```
Benchmarking aggregate_query_no_group_by_count_distinct_wide: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 61.0s, or reduce sample count to 10.
aggregate_query_no_group_by_count_distinct_wide
                        time:   [587.01 ms 598.43 ms 611.68 ms]
                        change: [-6.8593% -3.2992% +0.1757%] (p = 0.08 > 0.05)
                        No change in performance detected.
Found 7 outliers among 100 measurements (7.00%)
  4 (4.00%) high mild
  3 (3.00%) high severe
Benchmarking aggregate_query_no_group_by_count_distinct_narrow: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 40.1s, or reduce sample count to 10.
aggregate_query_no_group_by_count_distinct_narrow
                        time:   [399.48 ms 415.63 ms 438.63 ms]
                        change: [-1.1234% +3.7277% +10.592%] (p = 0.20 >  0.05)
                        No change in performance detected.
Found 5 outliers among 100 measurements (5.00%)
  2 (2.00%) high mild
  3 (3.00%) high severe
```
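For readers unfamiliar with the report format, here is a rough sketch of how the "No change in performance detected" verdict can be read from the `change` line. It is a simplified illustration, not Criterion's actual code: the `Verdict` enum and `classify` function are hypothetical names, the 0.05 significance level is taken from the output above, and the 0.01 noise threshold is an assumption.

```rust
/// Hypothetical verdict type mirroring the wording in the report above.
#[derive(Debug, PartialEq)]
enum Verdict {
    NoChange,
    WithinNoise,
    Improved,
    Regressed,
}

/// Classify a relative change in run time (illustrative only).
///
/// `lower` and `upper` are the confidence-interval bounds of the change,
/// expressed as fractions (for example -0.068593 for -6.8593%).
/// `significance` is the p-value cutoff (0.05 in the output above).
/// `noise` is an assumed threshold below which changes are ignored.
fn classify(lower: f64, upper: f64, p_value: f64, significance: f64, noise: f64) -> Verdict {
    // A p-value above the cutoff means the difference is not
    // statistically significant, whatever the interval looks like.
    if p_value > significance {
        return Verdict::NoChange;
    }
    if lower > noise {
        // The whole interval lies above the noise band: run time went up.
        Verdict::Regressed
    } else if upper < -noise {
        // The whole interval lies below the noise band: run time went down.
        Verdict::Improved
    } else {
        Verdict::WithinNoise
    }
}
```

Under these assumptions, `classify(-0.068593, 0.001757, 0.08, 0.05, 0.01)` (the wide case) and `classify(-0.011234, 0.10592, 0.20, 0.05, 0.01)` (the narrow case) both yield `Verdict::NoChange`, because each p-value exceeds 0.05.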
I increased the test array size from 65536 to 134_217_728 to reduce
environmental noise. The benchmark command is:
cargo bench --bench aggregate_query_sql
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.