jhorstmann commented on issue #1400:
URL: https://github.com/apache/arrow-rs/issues/1400#issuecomment-1059743980


   Interesting, I'm actually seeing the opposite effect, with `fold` being 
faster. This is running on an AMD 3700U laptop. The generated code for `fold` 
and `reduce` seems to differ in that `reduce` contains several branches while 
`fold` looks relatively branchless. Different CPUs might handle one or the 
other better, although I would expect the branchless version to win in the 
benchmark since the input consists of random numbers. I don't have a good 
explanation for why this affects the null handling loop.
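
For readers following along, here is a minimal sketch of the two iterator styles being compared. It assumes a minimum-style aggregation over a plain `i32` slice, which this comment does not spell out; `min_fold` and `min_reduce` are hypothetical names and not the actual arrow-rs kernel code.

```rust
/// Hypothetical helper: minimum via `fold`, seeded with an explicit
/// starting accumulator so every element takes the same code path.
fn min_fold(values: &[i32]) -> Option<i32> {
    if values.is_empty() {
        return None;
    }
    Some(
        values
            .iter()
            .copied()
            .fold(i32::MAX, |acc, v| if v < acc { v } else { acc }),
    )
}

/// Hypothetical helper: minimum via `reduce`, which takes the first
/// element as the initial accumulator and returns `None` when empty.
fn min_reduce(values: &[i32]) -> Option<i32> {
    values
        .iter()
        .copied()
        .reduce(|acc, v| if v < acc { v } else { acc })
}
```

Both helpers return the same result for any input, so a difference in benchmark numbers would come from how the compiler lowers each form rather than from the logic itself. Inspecting the assembly for the real kernel is the only way to confirm which one ends up branchless on a given target.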
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

