jhorstmann commented on issue #1400:
URL: https://github.com/apache/arrow-rs/issues/1400#issuecomment-1059949067


   I can reproduce both results. It seems AMD CPUs handle one version of the 
code better and Intel CPUs handle the other better. For the non-null 
benchmarks I get:
   
   Intel 10510U
   copy + reduce: ~950ns
   reduce references + copy: ~450ns
   
   AMD 3700U (timings fluctuate a bit more on this laptop)
   copy + reduce: ~745ns
   reduce references + copy: ~970ns
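   
   To make the two labels concrete, here is a minimal sketch of what the two variants could look like on a plain slice. It assumes "copy + reduce" means copying each element before folding and "reduce references + copy" means folding over references and copying only the result; the function names are hypothetical and this is not the actual kernel code.

```rust
/// Variant A ("copy + reduce"): copy each element out of the slice first,
/// then fold over the owned values. Hypothetical name, for illustration only.
fn min_copy_then_reduce(values: &[i64]) -> Option<i64> {
    values
        .iter()
        .copied()
        .reduce(|a, b| if b < a { b } else { a })
}

/// Variant B ("reduce references + copy"): fold over `&i64` references and
/// copy only the winning element at the very end. Hypothetical name.
fn min_reduce_refs_then_copy(values: &[i64]) -> Option<i64> {
    values
        .iter()
        .reduce(|a, b| if b < a { b } else { a })
        .copied()
}
```

   Both functions return the same result for any input; only the order of the copy and the reduction differs, and that ordering is the kind of detail that can compile to different machine code on different targets.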
   
   For `min nulls` I don't see significant differences. Code alignment 
can cause small performance differences on some CPUs in microbenchmarks 
like these. For even better performance, especially for nullable arrays, 
you should look into enabling the `simd` feature.
   
   I'm now actually a bit worried about the correctness of the nullable 
version: I don't see `has_value` being used apart from the assignment. If that 
flag is false, the result should be `None`.
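   
   A minimal sketch of the behaviour I would expect, assuming a plain values slice and a parallel `bool` validity mask as stand-ins for the real array and null bitmap (the function name and both parameters are hypothetical, not arrow-rs types):

```rust
/// Minimum over `values`, considering only positions where `validity` is true.
/// `values` and `validity` are hypothetical stand-ins for an array's data
/// buffer and null bitmap.
fn min_nullable(values: &[i64], validity: &[bool]) -> Option<i64> {
    let mut has_value = false;
    let mut min = i64::MAX;

    for (&v, &valid) in values.iter().zip(validity.iter()) {
        if valid {
            has_value = true;
            if v < min {
                min = v;
            }
        }
    }

    // The flag has to be checked here: without it, an all-null input would
    // report the initial `i64::MAX` as if it were a real minimum.
    if has_value {
        Some(min)
    } else {
        None
    }
}
```

   With this shape, an input where every slot is null (or an empty input) yields `None` rather than the initial accumulator value.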
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.


