cyb70289 commented on pull request #8091:
URL: https://github.com/apache/arrow/pull/8091#issuecomment-685348952
Latest benchmark result after re-implementation.
```
# Tested on skylake (knight landing)
$ archery benchmark diff --suite-filter="arrow-compute-aggregate-benchmark" \
    --benchmark-filter="^Mode" --cc=clang-9 --cxx=clang++-9
benchmark                        baseline         contender        change %
// ignore these 100% null cases: the huge boost comes from a simple trick and is not very useful
ModeKernelBoolean/1048576/1      123.216 MiB/sec  847.125 GiB/sec  703915.356  'null_percent': 100.0
ModeKernelInt8/1048576/1         896.330 MiB/sec  617.997 GiB/sec  70502.192   'null_percent': 100.0
ModeKernelInt16/1048576/1        2.886 GiB/sec    965.237 GiB/sec  33340.541   'null_percent': 100.0
ModeKernelInt32/1048576/1        5.732 GiB/sec    960.476 GiB/sec  16657.027   'null_percent': 100.0
ModeKernelInt64/1048576/1        7.925 GiB/sec    974.705 GiB/sec  12198.487   'null_percent': 100.0
// big improvement for int16/32/64 with limited value range
ModeKernelInt16/1048576/0        128.522 MiB/sec  495.771 MiB/sec  285.749     'null_percent': 0.0
ModeKernelInt32/1048576/0        257.694 MiB/sec  953.232 MiB/sec  269.909     'null_percent': 0.0
ModeKernelInt64/1048576/0        516.624 MiB/sec  1.715 GiB/sec    240.027     'null_percent': 0.0
ModeKernelInt32/1048576/10000    227.404 MiB/sec  690.032 MiB/sec  203.439     'null_percent': 0.01
ModeKernelInt16/1048576/10000    115.419 MiB/sec  349.055 MiB/sec  202.425     'null_percent': 0.01
ModeKernelInt32/1048576/100      229.661 MiB/sec  684.149 MiB/sec  197.895     'null_percent': 1.0
ModeKernelInt16/1048576/100      116.084 MiB/sec  342.620 MiB/sec  195.148     'null_percent': 1.0
ModeKernelInt64/1048576/10000    481.409 MiB/sec  1.302 GiB/sec    176.913     'null_percent': 0.01
ModeKernelInt64/1048576/100      486.266 MiB/sec  1.297 GiB/sec    173.114     'null_percent': 1.0
ModeKernelInt16/1048576/10       121.865 MiB/sec  315.932 MiB/sec  159.247     'null_percent': 10.0
ModeKernelInt32/1048576/10       242.074 MiB/sec  625.162 MiB/sec  158.252     'null_percent': 10.0
ModeKernelInt64/1048576/10       527.976 MiB/sec  1.199 GiB/sec    132.580     'null_percent': 10.0
ModeKernelInt32/1048576/2        320.156 MiB/sec  429.196 MiB/sec  34.058      'null_percent': 50.0
ModeKernelInt16/1048576/2        162.121 MiB/sec  196.310 MiB/sec  21.089      'null_percent': 50.0
// no obvious difference for bool/int8
ModeKernelInt8/1048576/100       234.422 MiB/sec  251.464 MiB/sec  7.270       'null_percent': 1.0
ModeKernelInt8/1048576/10        246.324 MiB/sec  258.110 MiB/sec  4.785       'null_percent': 10.0
ModeKernelInt8/1048576/10000     239.496 MiB/sec  250.469 MiB/sec  4.582       'null_percent': 0.01
ModeKernelInt64/1048576/2        812.020 MiB/sec  832.610 MiB/sec  2.536       'null_percent': 50.0
ModeKernelBoolean/1048576/10000  26.318 MiB/sec   26.509 MiB/sec   0.728       'null_percent': 0.01
ModeKernelBoolean/1048576/100    26.510 MiB/sec   26.597 MiB/sec   0.327       'null_percent': 1.0
ModeKernelBoolean/1048576/0      28.271 MiB/sec   28.274 MiB/sec   0.009       'null_percent': 0.0
ModeKernelInt8/1048576/0         270.401 MiB/sec  269.025 MiB/sec  -0.509      'null_percent': 0.0
ModeKernelInt8/1048576/2         190.410 MiB/sec  187.876 MiB/sec  -1.331      'null_percent': 50.0
ModeKernelBoolean/1048576/10     28.007 MiB/sec   27.599 MiB/sec   -1.455      'null_percent': 10.0
ModeKernelBoolean/1048576/2      27.157 MiB/sec   24.209 MiB/sec   -10.857     'null_percent': 50.0
```
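For anyone reading the table, the `change %` column appears to be the relative throughput difference, `(contender - baseline) / baseline * 100`. The sketch below is only an illustration of that arithmetic, not archery's own code: the helper names `to_bytes_per_sec` and `change_percent` are made up here, and binary units (1 GiB = 1024 MiB) are assumed.

```python
# Illustrative only: these helpers are not part of archery.
# Assumes binary units, i.e. 1 GiB/sec == 1024 MiB/sec.
UNITS = {"B/sec": 1, "KiB/sec": 1024, "MiB/sec": 1024**2, "GiB/sec": 1024**3}


def to_bytes_per_sec(text):
    """Parse a throughput string such as '128.522 MiB/sec' into bytes/sec."""
    value, unit = text.split()
    return float(value) * UNITS[unit]


def change_percent(baseline, contender):
    """Relative change of contender over baseline, in percent."""
    base = to_bytes_per_sec(baseline)
    return (to_bytes_per_sec(contender) - base) / base * 100.0


# ModeKernelInt16/1048576/0 and ModeKernelInt64/1048576/0 from the table.
print(change_percent("128.522 MiB/sec", "495.771 MiB/sec"))
print(change_percent("516.624 MiB/sec", "1.715 GiB/sec"))
```

Recomputed values match the table only approximately, because the printed throughputs are rounded to three decimals (more noticeably when one side is shown in GiB/sec).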