HaoYang670 opened a new issue #1400:
URL: https://github.com/apache/arrow-rs/issues/1400


   **Describe your question**
   I found some interesting benchmark results while trying to speed up the function `min_max_help`:
https://github.com/apache/arrow-rs/blob/master/arrow/src/compute/kernels/aggregate.rs#L115-L130
   
   The only change I made was replacing `iter().fold()` with `iter().reduce()`:
   ```rust
       if null_count == 0 {
           // optimized path for arrays without null values
           m.iter()
               .reduce(|acc, item| if cmp(acc, item) { item } else { acc })
               .copied()
       } else {
           n = T::default_value();
           let mut has_value = false;
           for (i, item) in m.iter().enumerate() {
               if data.is_valid(i) && (!has_value || cmp(&n, item)) {
                   has_value = true;
                   n = *item
               }
           }
           Some(n)
       }
   ```
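
For comparison, here is a minimal standalone sketch of the two forms on a plain slice. It is not the arrow-rs code: `min_fold`, `min_reduce`, the `i32` element type, and the assumption that the earlier version seeded `fold` with the first element by value are all illustrative. It only shows that the two forms compute the same result while differing in what the accumulator holds (a value for `fold`, a reference for `reduce`).

```rust
/// Hypothetical fold-based variant: the accumulator is an owned value,
/// seeded with the first element (an assumption about the earlier code).
fn min_fold(m: &[i32], cmp: impl Fn(&i32, &i32) -> bool) -> Option<i32> {
    let (first, rest) = m.split_first()?;
    Some(
        rest.iter()
            .fold(*first, |acc, item| if cmp(&acc, item) { *item } else { acc }),
    )
}

/// Hypothetical reduce-based variant mirroring the snippet above:
/// the accumulator is a reference, copied out only once at the end.
fn min_reduce(m: &[i32], cmp: impl Fn(&i32, &i32) -> bool) -> Option<i32> {
    m.iter()
        .reduce(|acc, item| if cmp(acc, item) { item } else { acc })
        .copied()
}
```

With `cmp` defined as `|a, b| a > b`, both functions return the minimum of the slice and `None` for an empty slice, so any performance difference would come from code generation rather than from the result.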
   
   Then I ran
   ```
   cargo bench min
   ```
   to see whether performance had changed, and got this result:
   ```console
   min 512                 time:   [415.18 ns 415.66 ns 416.25 ns]
                           change: [-50.211% -50.109% -50.002%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 15 outliers among 100 measurements (15.00%)
     2 (2.00%) low mild
     7 (7.00%) high mild
     6 (6.00%) high severe
   
   min nulls 512           time:   [1.0308 us 1.0331 us 1.0356 us]
                           change: [+17.936% +18.114% +18.295%] (p = 0.00 < 0.05)
                           Performance has regressed.
   ```
   The results are a little unexpected, and I have two questions:
   1. Why does `min 512` improve by about 50%? I did not expect `iter().reduce()` to be faster than `iter().fold()`.
   2. Why does `min nulls 512` become slower? The `null_count > 0` code block is unchanged.
   
   Need your help!
   

