[GitHub] [arrow-rs] tustvold edited a comment on pull request #1225: Improve MutableArrayData Null Handling (#1224) (#1230)

GitBox Wed, 26 Jan 2022 03:23:17 -0800


tustvold edited a comment on pull request #1225:
URL: https://github.com/apache/arrow-rs/pull/1225#issuecomment-1022089589



   ```
   cargo criterion --bench filter_kernels
      Compiling arrow v8.0.0 (/home/raphael/repos/external/arrow-rs/arrow)
       Finished bench [optimized] target(s) in 18.23s
   filter u8               time:   [291.13 us 293.93 us 298.49 us]
                           change: [-40.935% -40.686% -40.239%] (p = 0.00 < 0.05)
                           Performance has improved.

   filter u8 high selectivity
                           time:   [5.8296 us 5.8316 us 5.8336 us]
                           change: [-54.079% -53.954% -53.829%] (p = 0.00 < 0.05)
                           Performance has improved.

   filter u8 low selectivity
                           time:   [3.7740 us 3.7783 us 3.7829 us]
                           change: [-12.217% -11.997% -11.788%] (p = 0.00 < 0.05)
                           Performance has improved.

   filter context u8       time:   [105.74 us 105.76 us 105.80 us]
                           change: [-63.643% -63.614% -63.586%] (p = 0.00 < 0.05)
                           Performance has improved.

   filter context u8 high selectivity
                           time:   [1.3801 us 1.3816 us 1.3829 us]
                           change: [-82.396% -82.359% -82.319%] (p = 0.00 < 0.05)
                           Performance has improved.

   filter context u8 low selectivity
                           time:   [401.67 ns 401.79 ns 401.92 ns]
                           change: [-58.196% -58.112% -58.047%] (p = 0.00 < 0.05)
                           Performance has improved.

   filter context u8 w NULLs
                           time:   [427.53 us 427.66 us 427.80 us]
                           change: [+13.449% +13.527% +13.598%] (p = 0.00 < 0.05)
                           Performance has regressed.

   filter context u8 w NULLs high selectivity
                           time:   [6.8897 us 6.8919 us 6.8946 us]
                           change: [+0.2869% +0.3711% +0.4592%] (p = 0.00 < 0.05)
                           Change within noise threshold.

   filter context u8 w NULLs low selectivity
                           time:   [1.0082 us 1.0085 us 1.0088 us]
                           change: [+6.1612% +6.4041% +6.5859%] (p = 0.00 < 0.05)
                           Performance has regressed.

   filter f32              time:   [606.18 us 607.55 us 608.93 us]
                           change: [+6.1214% +6.3825% +6.6391%] (p = 0.00 < 0.05)
                           Performance has regressed.

   filter context f32      time:   [427.36 us 428.01 us 429.08 us]
                           change: [+12.435% +12.609% +12.799%] (p = 0.00 < 0.05)
                           Performance has regressed.

   filter context f32 high selectivity
                           time:   [12.375 us 12.907 us 13.357 us]
                           change: [+1.5047% +4.5855% +7.1816%] (p = 0.00 < 0.05)
                           Performance has regressed.

   filter context f32 low selectivity
                           time:   [1.0550 us 1.0552 us 1.0554 us]
                           change: [+8.6226% +9.4838% +10.093%] (p = 0.00 < 0.05)
                           Performance has regressed.

   filter context string   time:   [534.98 us 535.16 us 535.32 us]
                           change: [+9.7285% +9.8604% +10.001%] (p = 0.00 < 0.05)
                           Performance has regressed.

   filter context string high selectivity
                           time:   [402.80 us 402.92 us 403.03 us]
                           change: [-2.6457% -2.5796% -2.5140%] (p = 0.00 < 0.05)
                           Performance has improved.

   filter context string low selectivity
                           time:   [1.3243 us 1.3246 us 1.3249 us]
                           change: [+3.4003% +3.7378% +4.0158%] (p = 0.00 < 0.05)
                           Performance has regressed.

   filter single record batch
                           time:   [286.47 us 286.77 us 287.09 us]
                           change: [-41.765% -41.668% -41.572%] (p = 0.00 < 0.05)
                           Performance has improved.
   ```
   
   So it makes filtering arrays without nulls take ~50% less time; however, it 
does seem to make filtering arrays with nulls take about 10% longer. This is 
likely down to the issue in #1229: the extend_bits function is ludicrously 
"hot" for these benchmarks, where the runs are typically 1 or 2 elements 
long.
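
To illustrate why short runs make a per-run bit copy so hot, here is a minimal sketch. It is not the arrow-rs implementation: `selected_runs`, `copy_bits` and `filter_validity` are hypothetical names, a `Vec<bool>` stands in for a packed validity bitmap, and `copy_bits` only plays the role that extend_bits plays in the real code. It assumes the validity mask is at least as long as the filter.

```rust
/// Hypothetical helper (not an arrow-rs API): collapse a boolean filter into
/// half-open `(start, end)` runs of consecutive selected indices.
fn selected_runs(filter: &[bool]) -> Vec<(usize, usize)> {
    let mut runs = Vec::new();
    let mut start = None;
    for (i, &keep) in filter.iter().enumerate() {
        match (keep, start) {
            // A run begins at the first selected element.
            (true, None) => start = Some(i),
            // A run ends at the first unselected element after it.
            (false, Some(s)) => {
                runs.push((s, i));
                start = None;
            }
            _ => {}
        }
    }
    // Close a run that reaches the end of the filter.
    if let Some(s) = start {
        runs.push((s, filter.len()));
    }
    runs
}

/// Hypothetical stand-in for a bit-copy routine: appends the validity
/// values in `[start, end)` of `src` to `dst`.
fn copy_bits(dst: &mut Vec<bool>, src: &[bool], start: usize, end: usize) {
    dst.extend_from_slice(&src[start..end]);
}

/// Filters a validity mask run by run. Returns the filtered mask and the
/// number of `copy_bits` calls, i.e. one call per run.
fn filter_validity(validity: &[bool], filter: &[bool]) -> (Vec<bool>, usize) {
    let mut out = Vec::new();
    let mut calls = 0;
    for (start, end) in selected_runs(filter) {
        copy_bits(&mut out, validity, start, end);
        calls += 1;
    }
    (out, calls)
}
```

In this sketch the copy routine is called once per run, so a filter whose runs are only 1 or 2 elements long makes roughly one call for every one or two output values, and any fixed per-call cost dominates the total.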
   
   I'd personally prefer to merge this as is and keep pushing forward, but I 
can also hold off on this until I've fixed #1229.


