jhorstmann opened a new pull request #8685:
URL: https://github.com/apache/arrow/pull/8685


   This refactors the SIMD aggregation into a reusable trait and adds 
implementations for min and max.
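
The general shape of such a trait can be sketched as follows. This is only an illustration on stable Rust with a plain array standing in for a SIMD vector; the names `LaneAggregate`, `LANES`, and `aggregate` are hypothetical and are not taken from this PR, and null handling and NaN semantics are ignored.

```rust
/// Hypothetical trait: one aggregation described by a neutral element
/// and a combining step. Not the trait introduced by the PR.
trait LaneAggregate {
    /// Neutral starting value for every lane.
    fn identity() -> f64;
    /// Combine two values; applied lane by lane, then across lanes.
    fn combine(a: f64, b: f64) -> f64;
}

struct Sum;
struct Min;
struct Max;

impl LaneAggregate for Sum {
    fn identity() -> f64 { 0.0 }
    fn combine(a: f64, b: f64) -> f64 { a + b }
}

impl LaneAggregate for Min {
    fn identity() -> f64 { f64::INFINITY }
    fn combine(a: f64, b: f64) -> f64 { a.min(b) }
}

impl LaneAggregate for Max {
    fn identity() -> f64 { f64::NEG_INFINITY }
    fn combine(a: f64, b: f64) -> f64 { a.max(b) }
}

/// Assumed lane count; a real SIMD type would fix this per primitive type.
const LANES: usize = 8;

/// Generic driver shared by all aggregations: accumulate full chunks
/// lane by lane, reduce the lanes, then fold in the remainder.
fn aggregate<A: LaneAggregate>(values: &[f64]) -> Option<f64> {
    if values.is_empty() {
        return None;
    }
    let mut acc = [A::identity(); LANES];
    let mut chunks = values.chunks_exact(LANES);
    for chunk in &mut chunks {
        for (lane, v) in acc.iter_mut().zip(chunk) {
            *lane = A::combine(*lane, *v);
        }
    }
    // Horizontal reduction across the lanes.
    let mut result = acc.iter().copied().fold(A::identity(), A::combine);
    // Elements that did not fill a whole chunk.
    for v in chunks.remainder() {
        result = A::combine(result, *v);
    }
    Some(result)
}
```

With this shape, `aggregate::<Min>(&values)` and `aggregate::<Max>(&values)` reuse the same chunking loop as `aggregate::<Sum>(&values)`; only the identity and the combining step differ.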
   
   Some tests failed with the `simd` feature enabled because the different 
order of additions rounds to a slightly different result. I reused the 
comparison function that @vertexclique implemented in his recent PR.
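
To illustrate why the order of additions matters, here is a minimal sketch that sums the same values sequentially and with independent partial sums, then compares the results with a tolerance. The helper `approx_eq` and its relative tolerance are assumptions for this example; it is not the comparison function from the referenced PR.

```rust
/// Sequential left-to-right sum, as a scalar loop would compute it.
fn sum_sequential(values: &[f64]) -> f64 {
    values.iter().fold(0.0, |acc, v| acc + v)
}

/// Sum using 4 independent partial sums, mimicking SIMD lanes.
/// The additions happen in a different order than in `sum_sequential`.
fn sum_lanes(values: &[f64]) -> f64 {
    let mut lanes = [0.0f64; 4];
    for (i, v) in values.iter().enumerate() {
        lanes[i % 4] += v;
    }
    (lanes[0] + lanes[1]) + (lanes[2] + lanes[3])
}

/// Hypothetical comparison with a relative tolerance.
fn approx_eq(a: f64, b: f64, rel_tol: f64) -> bool {
    if a == b {
        return true;
    }
    let scale = a.abs().max(b.abs());
    (a - b).abs() <= rel_tol * scale
}
```

A test would then assert `approx_eq(sum_sequential(&v), sum_lanes(&v), 1e-9)` instead of exact equality, since the two results may differ in the last bits.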
   
   Microbenchmarks show a 9x-12x improvement for min, while sum is essentially unchanged:
   
   ```
   $ cargo bench --features simd --bench aggregate_kernels
      Compiling arrow v3.0.0-SNAPSHOT (/home/joernhorstmann/Source/github/apache/arrow/rust/arrow)
       Finished bench [optimized] target(s) in 47.73s
        Running /home/joernhorstmann/Source/github/apache/arrow/rust/target/release/deps/aggregate_kernels-717937ec43706892
   Gnuplot not found, using plotters backend
   sum 512                 time:   [75.840 ns 75.870 ns 75.903 ns]
                           change: [+0.3060% +0.3962% +0.4914%] (p = 0.00 < 0.05)
                           Change within noise threshold.
   Found 18 outliers among 100 measurements (18.00%)
     3 (3.00%) low severe
     2 (2.00%) low mild
     7 (7.00%) high mild
     6 (6.00%) high severe
   
   min 512                 time:   [79.360 ns 79.422 ns 79.498 ns]
                           change: [-89.496% -89.483% -89.468%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 16 outliers among 100 measurements (16.00%)
     4 (4.00%) high mild
     12 (12.00%) high severe
   
   sum nulls 512           time:   [136.84 ns 136.93 ns 137.04 ns]
                           change: [+1.5542% +2.1189% +2.7055%] (p = 0.00 < 0.05)
                           Performance has regressed.
   Found 15 outliers among 100 measurements (15.00%)
     1 (1.00%) low severe
     1 (1.00%) low mild
     3 (3.00%) high mild
     10 (10.00%) high severe
   
   min nulls 512           time:   [177.30 ns 177.38 ns 177.47 ns]
                           change: [-92.219% -92.206% -92.194%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 17 outliers among 100 measurements (17.00%)
     1 (1.00%) low mild
     5 (5.00%) high mild
     11 (11.00%) high severe
   ```
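
For reference, the relative `change` values reported by the benchmark can be converted to a speedup factor as in the sketch below. The function name `speedup_from_change` is made up for this example.

```rust
/// Convert a relative change in percent (negative means faster)
/// into a speedup factor: old_time / new_time.
fn speedup_from_change(change_pct: f64) -> f64 {
    // new_time = old_time * (1 + change_pct / 100)
    1.0 / (1.0 + change_pct / 100.0)
}
```

Applied to the median changes above, `speedup_from_change(-89.483)` is roughly 9.5 for `min 512`, and `speedup_from_change(-92.206)` is roughly 12.8 for `min nulls 512`.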
   
   A DataFusion benchmark using a global aggregation `SELECT MIN(f64), MAX(f64) 
FROM t` also shows a 2x improvement when run with the `simd` feature compared to without it.
   
   ```
        Running /home/joernhorstmann/Source/github/apache/arrow/rust/target/release/deps/aggregate_query_sql-034b2ab6143485ec
   Gnuplot not found, using plotters backend
   aggregate_query_no_group_by_min_max_f64
                           time:   [373.77 us 383.51 us 395.09 us]
                           change: [-59.420% -56.845% -53.563%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 18 outliers among 100 measurements (18.00%)
     2 (2.00%) low mild
     1 (1.00%) high mild
     15 (15.00%) high severe
   ```

