jhorstmann commented on issue #1400: URL: https://github.com/apache/arrow-rs/issues/1400#issuecomment-1059949067
I can reproduce both results, and it seems AMD CPUs are better at handling one version of the code and Intel CPUs are better at the other. For the non-null benchmarks I get:

Intel 10510U
- copy + reduce: ~950ns
- reduce references + copy: ~450ns

AMD 3700U (timings fluctuate a bit more on this laptop)
- copy + reduce: ~745ns
- reduce references + copy: ~970ns

For `min nulls` I don't really see significant differences. Code alignment might create some small differences in performance on some CPUs in such microbenchmarks.

For even better performance, especially for nullable arrays, you should look into enabling the `simd` feature.

I'm now actually a bit worried about the correctness of the nullable version: I don't see `has_value` being used apart from the assignment. If that flag is false, the result should be `None`.
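
To illustrate the `has_value` concern, here is a minimal sketch of the behavior I would expect. This is not the actual arrow-rs kernel: the function name `min_nullable` and the plain `&[bool]` validity mask are assumptions made for illustration, standing in for the real array and null bitmap types.

```rust
/// Hypothetical helper, not the arrow-rs implementation: computes the
/// minimum of `values`, skipping slots whose entry in `valid` is false.
fn min_nullable(values: &[i32], valid: &[bool]) -> Option<i32> {
    assert_eq!(values.len(), valid.len());

    // Tracks whether at least one non-null slot was seen.
    let mut has_value = false;
    let mut min = i32::MAX;

    for (&v, &is_valid) in values.iter().zip(valid) {
        if is_valid {
            has_value = true;
            if v < min {
                min = v;
            }
        }
    }

    // The flag has to be consulted here: without this check an all-null
    // input would return the initial sentinel instead of `None`.
    if has_value {
        Some(min)
    } else {
        None
    }
}
```

With this shape, an array where every slot is null yields `None`, while a single valid slot is enough to yield `Some(..)`, even if its value happens to equal the sentinel.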
