vertexclique edited a comment on pull request #8664:
URL: https://github.com/apache/arrow/pull/8664#issuecomment-733276442


   Ok, so the crucial operations are improved, and I've updated the benchmarks.
Benchmarks with 512 elements fit in most caches, which makes their results
unstable. Kernels can only get better once this PR is merged and they are
rewritten with parallel iterators. Feel free to benchmark this PR.
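
To put a rough number on the cache point, here is a minimal sketch of the working set of a binary Float32 comparison. The function name, the 32 KiB L1 size, and the larger element counts are assumptions for illustration only. Validity bitmaps are ignored.

```rust
use std::mem::size_of;

/// Bytes touched by a binary comparison over `len` Float32 values:
/// two input buffers plus one packed output bitmap (1 bit per element).
/// Hypothetical helper, not part of the arrow crate.
fn working_set_bytes(len: usize) -> usize {
    let inputs = 2 * len * size_of::<f32>();
    let output_bitmap = (len + 7) / 8;
    inputs + output_bitmap
}

fn main() {
    // Assumed cache size, for illustration only; real hardware varies.
    const ASSUMED_L1_BYTES: usize = 32 * 1024;

    for len in [512usize, 64 * 1024, 1024 * 1024] {
        let bytes = working_set_bytes(len);
        let fits = bytes <= ASSUMED_L1_BYTES;
        println!("{len:>8} elements -> {bytes:>8} bytes, fits assumed L1: {fits}");
    }
}
```

Under these assumptions, 512 elements take only a few KiB, so the benchmark mostly measures cache-resident work.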
   Things to do after this PR:
   * Other kernels can be improved in separate PRs.
   * Some code can be removed, e.g. `mask_from_u64`. It is not needed, and
removing it will improve performance.
   * Most of the operations do not run on larger data, so the benchmarks are
not very reliable. This can also be changed in yet another PR.
   * We can start writing parallel code.
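
As a rough illustration of the last item, here is a sketch of an `eq` comparison that packs results into 64-bit words and splits the work across threads. It is not the code in this PR. It uses `std::thread::scope` from the standard library instead of parallel iterators, and the names `eq_chunk` and `parallel_eq` are hypothetical.

```rust
use std::thread;

/// Compare up to 64 element pairs and pack the results into one u64,
/// with bit `i` set when `left[i] == right[i]`.
fn eq_chunk(left: &[f32], right: &[f32]) -> u64 {
    left.iter()
        .zip(right)
        .enumerate()
        .fold(0u64, |mask, (i, (l, r))| mask | (((l == r) as u64) << i))
}

/// Hypothetical parallel `eq`: each worker handles a run of whole
/// 64-element words, so no two workers write to the same output word.
fn parallel_eq(left: &[f32], right: &[f32], threads: usize) -> Vec<u64> {
    assert_eq!(left.len(), right.len());
    let words = (left.len() + 63) / 64;
    let mut out = vec![0u64; words];

    // Words per worker, rounded up so every word is covered.
    let workers = threads.max(1);
    let words_per_worker = ((words + workers - 1) / workers).max(1);
    let elems_per_worker = words_per_worker * 64;

    thread::scope(|s| {
        for ((out_part, l_part), r_part) in out
            .chunks_mut(words_per_worker)
            .zip(left.chunks(elems_per_worker))
            .zip(right.chunks(elems_per_worker))
        {
            s.spawn(move || {
                for ((word, l), r) in out_part
                    .iter_mut()
                    .zip(l_part.chunks(64))
                    .zip(r_part.chunks(64))
                {
                    *word = eq_chunk(l, r);
                }
            });
        }
    });
    out
}

fn main() {
    let left: Vec<f32> = (0..130).map(|i| i as f32).collect();
    let mut right = left.clone();
    right[65] = -1.0; // make exactly one pair differ

    let mask = parallel_eq(&left, &right, 4);
    println!("{:?}", mask);
}
```

Splitting on word boundaries keeps the workers independent. For inputs as small as 512 elements, the cost of spawning threads would likely outweigh any gain.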
   
   @nevi-me @jorgecarleitao @alamb
   
   Because bit-op performance has improved, these benchmarks have improved
significantly. The rest of the improvements are in the PR description:
   ```
   
        Running /home/vertexclique/projects/arrow/rust/target/release/deps/comparison_kernels-f8939ec12975f45e
   eq Float32              time:   [36.413 us 36.429 us 36.447 us]
                           change: [-95.357% -95.353% -95.349%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 1 outliers among 100 measurements (1.00%)
     1 (1.00%) high mild

   eq scalar Float32       time:   [33.540 us 33.551 us 33.562 us]
                           change: [-94.698% -94.690% -94.684%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 1 outliers among 100 measurements (1.00%)
     1 (1.00%) high severe

   neq Float32             time:   [36.756 us 36.768 us 36.781 us]
                           change: [-94.049% -94.047% -94.045%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 6 outliers among 100 measurements (6.00%)
     4 (4.00%) high mild
     2 (2.00%) high severe

   neq scalar Float32      time:   [30.706 us 30.718 us 30.734 us]
                           change: [-95.092% -95.089% -95.086%] (p = 0.00 < 0.05)
                           Performance has improved.

   lt Float32              time:   [36.489 us 36.498 us 36.509 us]
                           change: [-94.514% -94.504% -94.495%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 4 outliers among 100 measurements (4.00%)
     3 (3.00%) high mild
     1 (1.00%) high severe

   lt scalar Float32       time:   [30.855 us 30.871 us 30.892 us]
                           change: [-94.996% -94.993% -94.990%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 1 outliers among 100 measurements (1.00%)
     1 (1.00%) high mild

   lt_eq Float32           time:   [36.478 us 36.492 us 36.508 us]
                           change: [-94.700% -94.697% -94.695%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 2 outliers among 100 measurements (2.00%)
     1 (1.00%) high mild
     1 (1.00%) high severe

   lt_eq scalar Float32    time:   [32.641 us 32.653 us 32.668 us]
                           change: [-95.305% -95.299% -95.292%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 2 outliers among 100 measurements (2.00%)
     1 (1.00%) high mild
     1 (1.00%) high severe

   gt Float32              time:   [36.647 us 36.658 us 36.672 us]
                           change: [-94.121% -94.119% -94.116%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 3 outliers among 100 measurements (3.00%)
     2 (2.00%) high mild
     1 (1.00%) high severe

   gt scalar Float32       time:   [36.541 us 36.562 us 36.583 us]
                           change: [-94.288% -94.280% -94.273%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 1 outliers among 100 measurements (1.00%)
     1 (1.00%) low mild

   gt_eq Float32           time:   [36.510 us 36.524 us 36.540 us]
                           change: [-95.402% -95.396% -95.390%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 1 outliers among 100 measurements (1.00%)
     1 (1.00%) high mild

   gt_eq scalar Float32    time:   [32.121 us 32.141 us 32.163 us]
                           change: [-94.877% -94.875% -94.873%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 13 outliers among 100 measurements (13.00%)
     12 (12.00%) low mild
     1 (1.00%) high mild
   ```
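
For readers who prefer speedup factors to percentage changes, a small sketch of the conversion follows. It assumes the reported change is the relative change in time against the previous run, i.e. `new / old - 1`. The helper name is hypothetical, and the inputs are the middle estimates from the `eq Float32` entry above.

```rust
/// Convert a relative change in time, given in percent (e.g. -95.353 for
/// "-95.353%"), into a speedup factor old_time / new_time.
/// Assumes change = new / old - 1.
fn speedup_from_change(change_percent: f64) -> f64 {
    1.0 / (1.0 + change_percent / 100.0)
}

fn main() {
    // Middle estimates copied from the "eq Float32" entry.
    let new_time_us = 36.429;
    let change_percent = -95.353;

    let speedup = speedup_from_change(change_percent);
    // Previous time implied by the two numbers above, not a measured value.
    let implied_old_time_us = new_time_us * speedup;

    println!("speedup: {speedup:.1}x, implied previous time: {implied_old_time_us:.0} us");
}
```

Under that assumption, a change of about -95% corresponds to a speedup of roughly 20x.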
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

