crepererum commented on PR #7463:
URL: https://github.com/apache/arrow-rs/pull/7463#issuecomment-2872919830

   I ran 0a01037047dfa4a9425debbcad70e5fd69e24ab1 (= main) against 
2f2905eafec8e924efd846d0340b93cee8de532c (= branch/PR) in the order main, 
branch, main, branch:
   
   <details>
   
   ```text
   group                                                                        branch                                 branch2                                main                                   main2
   -----                                                                        ------                                 -------                                ----                                   -----
   filter context decimal128 (kept 1/2)                                         1.00     13.5±0.09µs        ? ?/sec    1.03     14.0±0.28µs        ? ?/sec    1.01     13.7±0.16µs        ? ?/sec    1.05     14.2±0.32µs        ? ?/sec
   filter context decimal128 high selectivity (kept 1023/1024)                  1.05     16.0±0.33µs        ? ?/sec    1.01     15.5±0.34µs        ? ?/sec    1.00     15.3±0.34µs        ? ?/sec    1.07     16.3±0.09µs        ? ?/sec
   filter context decimal128 low selectivity (kept 1/1024)                      1.04    127.3±1.45ns        ? ?/sec    1.10    135.8±0.38ns        ? ?/sec    1.00    122.9±1.17ns        ? ?/sec    1.06    130.5±1.16ns        ? ?/sec
   filter context f32 (kept 1/2)                                                1.00     27.9±0.38µs        ? ?/sec    1.02     28.3±0.69µs        ? ?/sec    1.03     28.8±0.78µs        ? ?/sec    1.05     29.2±0.73µs        ? ?/sec
   filter context f32 high selectivity (kept 1023/1024)                         1.01      6.5±0.05µs        ? ?/sec    1.04      6.7±0.07µs        ? ?/sec    1.00      6.5±0.03µs        ? ?/sec    1.06      6.9±0.14µs        ? ?/sec
   filter context f32 low selectivity (kept 1/1024)                             1.14    248.5±3.95ns        ? ?/sec    1.02    222.1±3.89ns        ? ?/sec    1.12    243.1±0.73ns        ? ?/sec    1.00    218.0±4.26ns        ? ?/sec
   filter context fsb with value length 20 (kept 1/2)                           1.00     19.6±0.16µs        ? ?/sec    1.01     19.8±0.50µs        ? ?/sec    1.00     19.5±0.09µs        ? ?/sec    1.01     19.7±0.28µs        ? ?/sec
   filter context fsb with value length 20 high selectivity (kept 1023/1024)    1.00     19.5±0.06µs        ? ?/sec    1.02     19.8±0.47µs        ? ?/sec    1.00     19.6±0.19µs        ? ?/sec    1.00     19.6±0.17µs        ? ?/sec
   filter context fsb with value length 20 low selectivity (kept 1/1024)        1.00     19.5±0.10µs        ? ?/sec    1.01     19.7±0.19µs        ? ?/sec    1.00     19.6±0.19µs        ? ?/sec    1.01     19.7±0.33µs        ? ?/sec
   filter context fsb with value length 5 (kept 1/2)                            1.00     19.5±0.06µs        ? ?/sec    1.02     19.8±0.51µs        ? ?/sec    1.00     19.6±0.18µs        ? ?/sec    1.01     19.7±0.31µs        ? ?/sec
   filter context fsb with value length 5 high selectivity (kept 1023/1024)     1.00     19.6±0.30µs        ? ?/sec    1.01     19.8±0.48µs        ? ?/sec    1.00     19.6±0.15µs        ? ?/sec    1.02     20.0±0.39µs        ? ?/sec
   filter context fsb with value length 5 low selectivity (kept 1/1024)         1.00     19.5±0.14µs        ? ?/sec    1.01     19.8±0.45µs        ? ?/sec    1.00     19.6±0.17µs        ? ?/sec    1.01     19.8±0.33µs        ? ?/sec
   filter context fsb with value length 50 (kept 1/2)                           1.00     19.5±0.08µs        ? ?/sec    1.02     19.9±0.47µs        ? ?/sec    1.00     19.5±0.08µs        ? ?/sec    1.00     19.6±0.24µs        ? ?/sec
   filter context fsb with value length 50 high selectivity (kept 1023/1024)    1.00     19.5±0.02µs        ? ?/sec    1.00     19.6±0.18µs        ? ?/sec    1.00     19.6±0.17µs        ? ?/sec    1.02     19.9±0.61µs        ? ?/sec
   filter context fsb with value length 50 low selectivity (kept 1/1024)        1.00     19.5±0.14µs        ? ?/sec    1.03     20.1±0.67µs        ? ?/sec    1.00     19.5±0.13µs        ? ?/sec    1.01     19.6±0.23µs        ? ?/sec
   filter context i32 (kept 1/2)                                                1.00      9.4±0.03µs        ? ?/sec    1.01      9.5±0.19µs        ? ?/sec    1.01      9.5±0.03µs        ? ?/sec    1.02      9.6±0.40µs        ? ?/sec
   filter context i32 high selectivity (kept 1023/1024)                         1.02      3.3±0.01µs        ? ?/sec    1.00      3.2±0.04µs        ? ?/sec    1.05      3.4±0.09µs        ? ?/sec    1.03      3.3±0.05µs        ? ?/sec
   filter context i32 low selectivity (kept 1/1024)                             1.00    122.5±1.49ns        ? ?/sec    1.08    132.2±1.98ns        ? ?/sec    1.02    124.5±0.86ns        ? ?/sec    1.06    129.6±2.23ns        ? ?/sec
   filter context i32 w NULLs (kept 1/2)                                        1.00     29.3±0.35µs        ? ?/sec    1.02     29.8±0.52µs        ? ?/sec    1.01     29.4±0.43µs        ? ?/sec    1.02     30.0±0.95µs        ? ?/sec
   filter context i32 w NULLs high selectivity (kept 1023/1024)                 1.00      6.5±0.09µs        ? ?/sec    1.04      6.8±0.10µs        ? ?/sec    1.01      6.6±0.10µs        ? ?/sec    1.01      6.5±0.12µs        ? ?/sec
   filter context i32 w NULLs low selectivity (kept 1/1024)                     1.03    219.1±1.29ns        ? ?/sec    1.06    224.5±3.06ns        ? ?/sec    1.00    212.1±0.82ns        ? ?/sec    1.22    259.1±3.80ns        ? ?/sec
   filter context mixed string view (kept 1/2)                                  1.00    286.3±3.77µs        ? ?/sec    1.02   291.3±14.86µs        ? ?/sec    1.01    288.3±3.31µs        ? ?/sec    1.03    296.1±3.97µs        ? ?/sec
   filter context mixed string view high selectivity (kept 1023/1024)           1.01   520.8±23.02µs        ? ?/sec    1.01    521.6±9.20µs        ? ?/sec    1.00    517.3±4.90µs        ? ?/sec    1.03    531.9±9.74µs        ? ?/sec
   filter context mixed string view low selectivity (kept 1/1024)               1.05    485.8±3.65ns        ? ?/sec    1.04    481.4±7.06ns        ? ?/sec    1.00    461.4±4.36ns        ? ?/sec    1.04    478.6±6.81ns        ? ?/sec
   filter context short string view (kept 1/2)                                  1.00    167.8±5.31µs        ? ?/sec    1.08    181.8±3.28µs        ? ?/sec    1.10    184.0±2.27µs        ? ?/sec    1.11    187.0±3.75µs        ? ?/sec
   filter context short string view high selectivity (kept 1023/1024)           1.00    308.0±2.22µs        ? ?/sec    1.08    331.3±7.06µs        ? ?/sec    1.06    326.3±3.51µs        ? ?/sec    1.09    335.4±5.51µs        ? ?/sec
   filter context short string view low selectivity (kept 1/1024)               1.01    397.8±0.76ns        ? ?/sec    1.03    407.1±8.52ns        ? ?/sec    1.00    394.1±2.69ns        ? ?/sec    1.04    409.4±6.66ns        ? ?/sec
   filter context string (kept 1/2)                                             1.00    317.1±3.54µs        ? ?/sec    1.06    335.2±8.25µs        ? ?/sec    1.06    337.2±2.69µs        ? ?/sec    1.06    335.7±5.15µs        ? ?/sec
   filter context string dictionary (kept 1/2)                                  1.00     97.1±0.84µs        ? ?/sec    1.00     97.1±0.65µs        ? ?/sec    1.01     97.7±0.56µs        ? ?/sec    1.01     98.2±1.13µs        ? ?/sec
   filter context string dictionary high selectivity (kept 1023/1024)           1.00    102.7±0.89µs        ? ?/sec    1.00    102.8±0.99µs        ? ?/sec    1.00    102.6±0.52µs        ? ?/sec    1.00    102.7±1.89µs        ? ?/sec
   filter context string dictionary low selectivity (kept 1/1024)               1.01     75.8±0.69µs        ? ?/sec    1.01     76.1±1.11µs        ? ?/sec    1.00     75.3±0.87µs        ? ?/sec    1.02     76.8±1.52µs        ? ?/sec
   filter context string dictionary w NULLs (kept 1/2)                          1.00     84.9±0.36µs        ? ?/sec    1.00     85.3±0.79µs        ? ?/sec    1.01     86.1±0.27µs        ? ?/sec    1.02     86.7±1.43µs        ? ?/sec
   filter context string dictionary w NULLs high selectivity (kept 1023/1024)   1.00    164.8±6.88µs        ? ?/sec    1.05   173.6±10.04µs        ? ?/sec    1.02    168.7±1.23µs        ? ?/sec    1.03   169.0±10.26µs        ? ?/sec
   filter context string dictionary w NULLs low selectivity (kept 1/1024)       1.01     37.8±0.33µs        ? ?/sec    1.02     38.2±0.71µs        ? ?/sec    1.00     37.5±0.26µs        ? ?/sec    1.01     38.0±0.42µs        ? ?/sec
   filter context string high selectivity (kept 1023/1024)                      1.00    339.1±4.13µs        ? ?/sec    1.04   351.1±11.57µs        ? ?/sec    1.01    341.1±4.78µs        ? ?/sec    1.04    351.4±8.22µs        ? ?/sec
   filter context string low selectivity (kept 1/1024)                          1.00    689.9±1.35ns        ? ?/sec    1.03   708.1±14.90ns        ? ?/sec    2.20  1518.0±41.07ns        ? ?/sec    2.25  1554.6±36.54ns        ? ?/sec
   filter context u8 (kept 1/2)                                                 1.00     11.9±0.02µs        ? ?/sec    1.01     12.0±0.18µs        ? ?/sec    1.01     12.0±0.09µs        ? ?/sec    1.01     12.0±0.13µs        ? ?/sec
   filter context u8 high selectivity (kept 1023/1024)                          1.00    958.8±1.51ns        ? ?/sec    1.04  1001.9±72.04ns        ? ?/sec    1.00   963.3±16.55ns        ? ?/sec    1.01   973.0±11.91ns        ? ?/sec
   filter context u8 low selectivity (kept 1/1024)                              1.00    123.3±0.74ns        ? ?/sec    1.09    134.3±2.40ns        ? ?/sec    1.01    124.1±1.06ns        ? ?/sec    1.08    132.5±3.34ns        ? ?/sec
   filter context u8 w NULLs (kept 1/2)                                         1.00     31.7±0.12µs        ? ?/sec    1.01     32.1±0.51µs        ? ?/sec    1.00     31.8±0.29µs        ? ?/sec    1.01     32.1±0.50µs        ? ?/sec
   filter context u8 w NULLs high selectivity (kept 1023/1024)                  1.02      4.3±0.02µs        ? ?/sec    1.03      4.3±0.05µs        ? ?/sec    1.00      4.2±0.06µs        ? ?/sec    1.02      4.3±0.10µs        ? ?/sec
   filter context u8 w NULLs low selectivity (kept 1/1024)                      1.12    254.2±1.66ns        ? ?/sec    1.00    226.6±2.74ns        ? ?/sec    1.12    253.4±1.23ns        ? ?/sec    1.16    263.2±4.00ns        ? ?/sec
   filter decimal128 (kept 1/2)                                                 1.00     46.0±0.24µs        ? ?/sec    1.00     46.1±1.15µs        ? ?/sec    1.19     54.9±0.21µs        ? ?/sec    1.20     55.4±2.17µs        ? ?/sec
   filter decimal128 high selectivity (kept 1023/1024)                          1.01     16.5±0.30µs        ? ?/sec    1.03     16.8±0.55µs        ? ?/sec    1.00     16.3±0.46µs        ? ?/sec    1.04     16.9±0.57µs        ? ?/sec
   filter decimal128 low selectivity (kept 1/1024)                              1.00  1274.0±23.10ns        ? ?/sec    1.00  1274.3±19.73ns        ? ?/sec    1.02  1298.1±14.76ns        ? ?/sec    1.02   1293.8±4.79ns        ? ?/sec
   filter f32 (kept 1/2)                                                        1.00     94.6±1.05µs        ? ?/sec    1.01     95.3±1.10µs        ? ?/sec    1.07    101.3±0.87µs        ? ?/sec    1.08    101.9±1.06µs        ? ?/sec
   filter fsb with value length 20 (kept 1/2)                                   1.00    104.6±1.61µs        ? ?/sec    1.03    108.0±3.53µs        ? ?/sec    1.04    108.6±2.28µs        ? ?/sec    1.05    109.5±3.77µs        ? ?/sec
   filter fsb with value length 20 high selectivity (kept 1023/1024)            1.02     20.3±0.16µs        ? ?/sec    1.02     20.2±0.49µs        ? ?/sec    1.00     19.9±0.21µs        ? ?/sec    1.02     20.3±0.44µs        ? ?/sec
   filter fsb with value length 20 low selectivity (kept 1/1024)                1.00  1364.4±15.87ns        ? ?/sec    1.01  1371.6±28.92ns        ? ?/sec    1.01   1373.9±5.76ns        ? ?/sec    1.01  1376.6±25.39ns        ? ?/sec
   filter fsb with value length 5 (kept 1/2)                                    1.01    108.3±5.62µs        ? ?/sec    1.02    109.6±1.31µs        ? ?/sec    1.01    108.3±0.45µs        ? ?/sec    1.00    107.5±6.54µs        ? ?/sec
   filter fsb with value length 5 high selectivity (kept 1023/1024)             1.04      5.5±0.06µs        ? ?/sec    1.02      5.4±0.12µs        ? ?/sec    1.00      5.3±0.11µs        ? ?/sec    1.04      5.5±0.11µs        ? ?/sec
   filter fsb with value length 5 low selectivity (kept 1/1024)                 1.03  1394.2±27.28ns        ? ?/sec    1.02  1382.4±34.15ns        ? ?/sec    1.00   1355.2±6.93ns        ? ?/sec    1.03  1395.6±70.56ns        ? ?/sec
   filter fsb with value length 50 (kept 1/2)                                   1.00    106.2±1.18µs        ? ?/sec    1.01    107.3±2.11µs        ? ?/sec    1.01    107.4±2.67µs        ? ?/sec    1.04    110.9±5.04µs        ? ?/sec
   filter fsb with value length 50 high selectivity (kept 1023/1024)            1.01     47.9±0.75µs        ? ?/sec    1.03     48.9±1.28µs        ? ?/sec    1.00     47.4±1.03µs        ? ?/sec    1.02     48.4±1.14µs        ? ?/sec
   filter fsb with value length 50 low selectivity (kept 1/1024)                1.01  1389.1±17.89ns        ? ?/sec    1.00  1378.4±25.37ns        ? ?/sec    1.01  1390.5±30.70ns        ? ?/sec    1.00  1377.9±21.72ns        ? ?/sec
   filter i32 (kept 1/2)                                                        1.02     39.3±0.08µs        ? ?/sec    1.00     38.6±0.42µs        ? ?/sec    1.34     51.9±0.25µs        ? ?/sec    1.33     51.5±0.68µs        ? ?/sec
   filter i32 high selectivity (kept 1023/1024)                                 1.03      4.4±0.03µs        ? ?/sec    1.00      4.3±0.03µs        ? ?/sec    1.05      4.5±0.06µs        ? ?/sec    1.05      4.5±0.06µs        ? ?/sec
   filter i32 low selectivity (kept 1/1024)                                     1.00  1254.6±11.33ns        ? ?/sec    1.01   1266.0±4.92ns        ? ?/sec    1.02   1279.2±3.64ns        ? ?/sec    1.03  1290.7±24.25ns        ? ?/sec
   filter optimize (kept 1/2)                                                   1.00     44.3±0.07µs        ? ?/sec    1.03     45.7±1.05µs        ? ?/sec    1.14     50.7±0.38µs        ? ?/sec    1.14     50.6±0.83µs        ? ?/sec
   filter optimize high selectivity (kept 1023/1024)                            1.05  1296.2±41.24ns        ? ?/sec    1.00  1232.1±29.19ns        ? ?/sec    1.03   1269.0±8.19ns        ? ?/sec    1.02  1261.2±28.67ns        ? ?/sec
   filter optimize low selectivity (kept 1/1024)                                1.00   1159.6±1.91ns        ? ?/sec    1.02  1177.5±23.36ns        ? ?/sec    1.01   1170.0±2.34ns        ? ?/sec    1.03  1190.0±26.64ns        ? ?/sec
   filter run array (kept 1/2)                                                  1.00    167.5±1.25µs        ? ?/sec    1.05    175.7±2.53µs        ? ?/sec    1.07    179.2±1.52µs        ? ?/sec    1.07    179.8±1.65µs        ? ?/sec
   filter run array high selectivity (kept 1023/1024)                           1.00    161.0±2.35µs        ? ?/sec    1.00    161.1±1.93µs        ? ?/sec    1.01    161.9±6.56µs        ? ?/sec    1.02    163.6±2.40µs        ? ?/sec
   filter run array low selectivity (kept 1/1024)                               1.00    103.5±0.48µs        ? ?/sec    1.02    105.2±3.34µs        ? ?/sec    1.01    104.8±1.42µs        ? ?/sec    1.02    105.2±1.41µs        ? ?/sec
   filter single record batch                                                   1.00     44.5±0.76µs        ? ?/sec    1.01     45.0±0.72µs        ? ?/sec    1.15     51.2±0.25µs        ? ?/sec    1.20     53.3±0.90µs        ? ?/sec
   filter u8 (kept 1/2)                                                         1.00     43.4±0.22µs        ? ?/sec    1.02     44.1±0.82µs        ? ?/sec    1.19     51.6±0.60µs        ? ?/sec    1.22     53.0±0.94µs        ? ?/sec
   filter u8 high selectivity (kept 1023/1024)                                  1.01      2.0±0.01µs        ? ?/sec    1.00  1994.8±39.88ns        ? ?/sec    1.02      2.0±0.02µs        ? ?/sec    1.01      2.0±0.04µs        ? ?/sec
   filter u8 low selectivity (kept 1/1024)                                      1.01   1292.7±1.92ns        ? ?/sec    1.01   1296.4±7.67ns        ? ?/sec    1.00   1281.4±5.51ns        ? ?/sec    1.00  1279.9±18.97ns        ? ?/sec
   
   ```
   
   </details>
   
   and here w/ a 10% threshold:
   
   <details>
   
   ```text
   group                                                       branch                                 branch2                                main                                   main2
   -----                                                       ------                                 -------                                ----                                   -----
   filter context decimal128 low selectivity (kept 1/1024)     1.04    127.3±1.45ns        ? ?/sec    1.10    135.8±0.38ns        ? ?/sec    1.00    122.9±1.17ns        ? ?/sec    1.06    130.5±1.16ns        ? ?/sec
   filter context f32 low selectivity (kept 1/1024)            1.14    248.5±3.95ns        ? ?/sec    1.02    222.1±3.89ns        ? ?/sec    1.12    243.1±0.73ns        ? ?/sec    1.00    218.0±4.26ns        ? ?/sec
   filter context i32 w NULLs low selectivity (kept 1/1024)    1.03    219.1±1.29ns        ? ?/sec    1.06    224.5±3.06ns        ? ?/sec    1.00    212.1±0.82ns        ? ?/sec    1.22    259.1±3.80ns        ? ?/sec
   filter context short string view (kept 1/2)                 1.00    167.8±5.31µs        ? ?/sec    1.08    181.8±3.28µs        ? ?/sec    1.10    184.0±2.27µs        ? ?/sec    1.11    187.0±3.75µs        ? ?/sec
   filter context string low selectivity (kept 1/1024)         1.00    689.9±1.35ns        ? ?/sec    1.03   708.1±14.90ns        ? ?/sec    2.20  1518.0±41.07ns        ? ?/sec    2.25  1554.6±36.54ns        ? ?/sec
   filter context u8 w NULLs low selectivity (kept 1/1024)     1.12    254.2±1.66ns        ? ?/sec    1.00    226.6±2.74ns        ? ?/sec    1.12    253.4±1.23ns        ? ?/sec    1.16    263.2±4.00ns        ? ?/sec
   filter decimal128 (kept 1/2)                                1.00     46.0±0.24µs        ? ?/sec    1.00     46.1±1.15µs        ? ?/sec    1.19     54.9±0.21µs        ? ?/sec    1.20     55.4±2.17µs        ? ?/sec
   filter i32 (kept 1/2)                                       1.02     39.3±0.08µs        ? ?/sec    1.00     38.6±0.42µs        ? ?/sec    1.34     51.9±0.25µs        ? ?/sec    1.33     51.5±0.68µs        ? ?/sec
   filter optimize (kept 1/2)                                  1.00     44.3±0.07µs        ? ?/sec    1.03     45.7±1.05µs        ? ?/sec    1.14     50.7±0.38µs        ? ?/sec    1.14     50.6±0.83µs        ? ?/sec
   filter single record batch                                  1.00     44.5±0.76µs        ? ?/sec    1.01     45.0±0.72µs        ? ?/sec    1.15     51.2±0.25µs        ? ?/sec    1.20     53.3±0.90µs        ? ?/sec
   filter u8 (kept 1/2)                                        1.00     43.4±0.22µs        ? ?/sec    1.02     44.1±0.82µs        ? ?/sec    1.19     51.6±0.60µs        ? ?/sec    1.22     53.0±0.94µs        ? ?/sec
   ```
   
   </details>
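
The rows that survive the 10% threshold are consistent with a simple rule: keep a benchmark when its slowest run is at least 10% slower than its fastest one. Below is a minimal Python sketch of that rule, checked against three rows from the full table. The rule itself is an assumption inferred from the two tables (the exact command is not shown here), and `rows` and `exceeds_threshold` are illustrative names.

```python
# Times copied from the full table: (branch, branch2, main, main2).
# The first two rows are in nanoseconds, the third in microseconds; only the
# ratios within a row matter, so the unit does not affect the result.
rows = {
    "filter context u8 low selectivity (kept 1/1024)": (123.3, 134.3, 124.1, 132.5),
    "filter context f32 low selectivity (kept 1/1024)": (248.5, 222.1, 243.1, 218.0),
    "filter i32 (kept 1/2)": (39.3, 38.6, 51.9, 51.5),
}


def exceeds_threshold(times, threshold=0.10):
    """Assumed rule: the slowest run is at least `threshold` slower than the fastest."""
    fastest = min(times)
    return max(times) / fastest - 1.0 >= threshold


kept = [name for name, times in rows.items() if exceeds_threshold(times)]
for name in kept:
    print(name)
```

Under this assumption the `u8 low selectivity` row (about 9% spread) is dropped, while the other two rows are kept, which matches the filtered table.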
   
   It seems like these benchmarks are rather noisy, and I wonder if instead of 
wall-clock time we should measure instruction counts or something like that.
   
   That said, I don't think any of them really regress w/ this PR and a few 
have substantial improvements, so I would :shipit: 
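
One way to put a number on the noise is to compare the two runs of the same commit with each other: any gap between `main` and `main2` (or between `branch` and `branch2`) cannot come from the code change. The following is a rough Python sketch of that check, using three rows copied from the tables. The helper names and the rule "a change counts only if it exceeds the repeat-run spread" are illustrative assumptions, not an established methodology.

```python
# (branch, branch2, main, main2) times copied from the tables above.
rows = {
    "filter context i32 w NULLs low selectivity (kept 1/1024)": (219.1, 224.5, 212.1, 259.1),  # ns
    "filter context string low selectivity (kept 1/1024)": (689.9, 708.1, 1518.0, 1554.6),  # ns
    "filter i32 (kept 1/2)": (39.3, 38.6, 51.9, 51.5),  # µs
}


def spread(a, b):
    """Relative difference between two runs of the same commit."""
    return abs(a - b) / min(a, b)


def classify(branch, branch2, main, main2):
    """Call a change real only if the run-to-run spread cannot explain it."""
    noise = max(spread(branch, branch2), spread(main, main2))
    branch_mean = (branch + branch2) / 2
    main_mean = (main + main2) / 2
    change = (branch_mean - main_mean) / main_mean  # negative: branch is faster
    if abs(change) <= noise:
        return "within noise"
    return "improvement" if change < 0 else "regression"


results = {name: classify(*times) for name, times in rows.items()}
for name, verdict in results.items():
    print(f"{verdict:>12}  {name}")
```

On these three rows the sketch reports the `i32 w NULLs low selectivity` case as within noise (the two `main` runs alone differ by about 22%) and the other two as improvements.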


