jhorstmann commented on PR #7824:
URL: https://github.com/apache/arrow-rs/pull/7824#issuecomment-3028829209
Benchmark results from a local run with `-Ctarget-cpu=native`:
```
write_batch primitive/4096 values primitive
time: [476.56 µs 477.29 µs 478.26 µs]
thrpt: [367.86 MiB/s 368.61 MiB/s 369.18 MiB/s]
change:
time: [-28.207% -27.412% -26.910%] (p = 0.00 < 0.05)
thrpt: [+36.817% +37.763% +39.289%]
Performance has improved.
write_batch primitive/4096 values primitive with bloom filter
time: [4.6939 ms 4.7261 ms 4.7606 ms]
thrpt: [36.956 MiB/s 37.226 MiB/s 37.482 MiB/s]
change:
time: [-2.5791% -1.8121% -1.0755%] (p = 0.00 < 0.05)
thrpt: [+1.0872% +1.8456% +2.6474%]
Performance has improved.
write_batch primitive/4096 values primitive non-null
time: [387.15 µs 389.86 µs 395.46 µs]
thrpt: [436.24 MiB/s 442.50 MiB/s 445.61 MiB/s]
change:
time: [-13.168% -12.894% -12.420%] (p = 0.00 < 0.05)
thrpt: [+14.182% +14.802% +15.165%]
Performance has improved.
write_batch primitive/4096 values primitive non-null with bloom filter
time: [4.6790 ms 4.6942 ms 4.7115 ms]
thrpt: [36.616 MiB/s 36.751 MiB/s 36.870 MiB/s]
change:
time: [-2.2927% -1.8226% -1.3822%] (p = 0.00 < 0.05)
thrpt: [+1.4015% +1.8564% +2.3465%]
Performance has improved.
write_batch primitive/4096 values bool
time: [34.476 µs 34.581 µs 34.717 µs]
thrpt: [30.547 MiB/s 30.667 MiB/s 30.761 MiB/s]
change:
time: [-41.518% -40.820% -40.108%] (p = 0.00 < 0.05)
thrpt: [+66.966% +68.977% +70.994%]
Performance has improved.
write_batch primitive/4096 values bool non-null
time: [20.286 µs 20.328 µs 20.380 µs]
thrpt: [28.077 MiB/s 28.148 MiB/s 28.207 MiB/s]
change:
time: [-28.555% -28.463% -28.362%] (p = 0.00 < 0.05)
thrpt: [+39.591% +39.788% +39.967%]
Performance has improved.
write_batch primitive/4096 values string
time: [512.18 µs 512.45 µs 512.75 µs]
thrpt: [3.9009 GiB/s 3.9032 GiB/s 3.9053 GiB/s]
change:
time: [-5.8849% -5.7654% -5.6374%] (p = 0.00 < 0.05)
thrpt: [+5.9742% +6.1181% +6.2528%]
Performance has improved.
write_batch primitive/4096 values string with bloom filter
time: [990.06 µs 991.34 µs 992.86 µs]
thrpt: [2.0146 GiB/s 2.0177 GiB/s 2.0203 GiB/s]
change:
time: [-4.0814% -3.2002% -2.4531%] (p = 0.00 < 0.05)
thrpt: [+2.5148% +3.3060% +4.2551%]
Performance has improved.
write_batch primitive/4096 values string #2
time: [225.11 µs 225.57 µs 226.14 µs]
thrpt: [558.08 MiB/s 559.49 MiB/s 560.65 MiB/s]
change:
time: [-13.134% -12.724% -12.312%] (p = 0.00 < 0.05)
thrpt: [+14.041% +14.580% +15.119%]
Performance has improved.
write_batch primitive/4096 values string with bloom filter #2
time: [542.18 µs 542.61 µs 543.16 µs]
thrpt: [232.35 MiB/s 232.59 MiB/s 232.77 MiB/s]
change:
time: [-4.4430% -4.1789% -3.9231%] (p = 0.00 < 0.05)
thrpt: [+4.0833% +4.3612% +4.6496%]
Performance has improved.
write_batch primitive/4096 values string dictionary
time: [263.02 µs 263.21 µs 263.44 µs]
thrpt: [3.8264 GiB/s 3.8298 GiB/s 3.8325 GiB/s]
change:
time: [-12.989% -11.939% -11.008%] (p = 0.00 < 0.05)
thrpt: [+12.370% +13.558% +14.928%]
Performance has improved.
write_batch primitive/4096 values string dictionary with bloom filter
time: [502.98 µs 503.29 µs 503.64 µs]
thrpt: [2.0015 GiB/s 2.0029 GiB/s 2.0041 GiB/s]
change:
time: [-5.9241% -5.7646% -5.6083%] (p = 0.00 < 0.05)
thrpt: [+5.9415% +6.1172% +6.2971%]
Performance has improved.
write_batch primitive/4096 values string non-null
time: [689.09 µs 690.48 µs 692.29 µs]
thrpt: [2.8879 GiB/s 2.8954 GiB/s 2.9013 GiB/s]
change:
time: [-1.6423% -1.1567% -0.5306%] (p = 0.00 < 0.05)
thrpt: [+0.5335% +1.1702% +1.6698%]
Change within noise threshold.
write_batch primitive/4096 values string non-null with bloom filter
time: [1.2297 ms 1.2313 ms 1.2334 ms]
thrpt: [1.6209 GiB/s 1.6236 GiB/s 1.6258 GiB/s]
change:
time: [-1.8332% -1.5425% -1.2486%] (p = 0.00 < 0.05)
thrpt: [+1.2644% +1.5667% +1.8674%]
Performance has improved.
write_batch primitive/4096 values float with NaNs
time: [316.17 µs 317.04 µs 318.12 µs]
thrpt: [172.77 MiB/s 173.36 MiB/s 173.84 MiB/s]
change:
time: [-1.5436% -0.5349% +0.5982%] (p = 0.34 > 0.05)
thrpt: [-0.5946% +0.5377% +1.5678%]
No change in performance detected.
write_batch nested/4096 values primitive list
time: [1.1048 ms 1.1141 ms 1.1251 ms]
thrpt: [1.8504 GiB/s 1.8687 GiB/s 1.8843 GiB/s]
change:
time: [-12.662% -12.001% -11.382%] (p = 0.00 < 0.05)
thrpt: [+12.844% +13.638% +14.497%]
Performance has improved.
write_batch nested/4096 values primitive list non-null
time: [1.1968 ms 1.1984 ms 1.2002 ms]
thrpt: [1.7310 GiB/s 1.7336 GiB/s 1.7358 GiB/s]
change:
time: [-2.9200% -2.6182% -2.3050%] (p = 0.00 < 0.05)
thrpt: [+2.3594% +2.6886% +3.0078%]
Performance has improved.
```
Very nice improvements for primitives, and minor improvements once bloom
filters or string types are involved. I think the last two benchmarks are named
incorrectly: they actually write 3 columns of types int32, bool and utf8.
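
For anyone comparing the `time` and `thrpt` change lines above: the numbers look consistent with throughput being the inverse of time, so a relative time change `t` corresponds to a throughput change of `1 / (1 + t) - 1`. Below is a minimal sketch that checks a few of the median values; the helper name `throughput_change` is made up for illustration and is not part of Criterion or arrow-rs.

```rust
/// Convert a relative change in time (e.g. -0.27412 for -27.412%) into the
/// corresponding relative change in throughput, assuming throughput is
/// proportional to 1 / time. Hypothetical helper, for illustration only.
fn throughput_change(time_change: f64) -> f64 {
    1.0 / (1.0 + time_change) - 1.0
}

fn main() {
    // Median time changes copied from the benchmark output above.
    let cases = [
        ("primitive", -0.27412),
        ("bool", -0.40820),
        ("string", -0.057654),
    ];
    for (name, dt) in cases {
        println!(
            "{name}: time {:+.3}% -> thrpt {:+.3}%",
            dt * 100.0,
            throughput_change(dt) * 100.0
        );
    }
}
```

This is also why a 41% reduction in time for the bool case shows up as a roughly 69% increase in throughput.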