HaoYang670 commented on issue #1400:
URL: https://github.com/apache/arrow-rs/issues/1400#issuecomment-1059753488


   > Interesting, I'm actually seeing the opposite effect, with `fold` being 
faster. This is running on an AMD 3700U laptop. The generated code for `fold` 
and `reduce` seems to differ in that `reduce` contains several branches while 
`fold` looks relatively branchless. Different CPUs might handle one or the 
other better, although I would expect the branchless version to win in the 
benchmark since the input consists of random numbers. I don't have any good 
explanation for why this affects the null handling loop.
   
   Even more interesting! My desktop has an Intel i7-10700K processor. 
   Also, I am curious why the compiler generates different code for 
`fold` and `reduce`, since `reduce` just uses `fold` in its implementation:
   ```rust
       #[inline]
       #[stable(feature = "iterator_fold_self", since = "1.51.0")]
       fn reduce<F>(mut self, f: F) -> Option<Self::Item>
       where
           Self: Sized,
           F: FnMut(Self::Item, Self::Item) -> Self::Item,
       {
           let first = self.next()?;
           Some(self.fold(first, f))
       }
   ```
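
To make the comparison concrete, here is a minimal sketch of the two styles of summation. The helper names `sum_fold` and `sum_reduce` are hypothetical and are not the arrow-rs kernels; they only illustrate the one difference visible at the source level: `reduce` has to handle the empty iterator by calling `next()` first and returning an `Option`, while `fold` starts from an explicit seed. Whether that extra check accounts for the different generated code is not established here.

```rust
/// Hypothetical helper: sum with `fold`, seeded with an explicit zero.
fn sum_fold(values: &[i64]) -> i64 {
    values.iter().copied().fold(0, |acc, v| acc + v)
}

/// Hypothetical helper: sum with `reduce`, which uses the first element
/// as the seed and therefore returns `None` for an empty input.
fn sum_reduce(values: &[i64]) -> Option<i64> {
    values.iter().copied().reduce(|acc, v| acc + v)
}

fn main() {
    let values = [1_i64, 2, 3, 4];
    println!("fold:   {}", sum_fold(&values));
    println!("reduce: {:?}", sum_reduce(&values));

    // The two only disagree in shape on empty input:
    // `fold` yields the seed, `reduce` yields `None`.
    let empty: [i64; 0] = [];
    println!("fold (empty):   {}", sum_fold(&empty));
    println!("reduce (empty): {:?}", sum_reduce(&empty));
}
```

For non-empty input both helpers compute the same value, so any performance gap would come from how the compiler lowers each form, not from the arithmetic itself.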



