tyrelr commented on pull request #9304:
URL: https://github.com/apache/arrow/pull/9304#issuecomment-766177143


   The benchmarks that differ by at least 10% between the two runs are:
   ```
   critcmp master-67d0c2e3 bool-38d4e395 -t 10
   group                        bool-38d4e395                          master-67d0c2e3
   -----                        -------------                          ---------------
   add_nulls_512                1.30    343.7±0.62ns        ? B/sec    1.00    264.4±1.80ns        ? B/sec
   cast float32 to int32 512    1.13      3.2±0.01µs        ? B/sec    1.00      2.8±0.01µs        ? B/sec
   concat str nulls 1024        1.10     21.1±0.06µs        ? B/sec    1.00     19.1±0.11µs        ? B/sec
   eq Float32                   1.00     89.0±0.10µs        ? B/sec    1.42    126.0±0.28µs        ? B/sec
   eq scalar Float32            1.11     78.0±0.12µs        ? B/sec    1.00     70.2±0.11µs        ? B/sec
   gt scalar Float32            1.36     71.8±0.15µs        ? B/sec    1.00     52.7±0.09µs        ? B/sec
   gt_eq Float32                1.00     75.5±0.12µs        ? B/sec    1.65    124.6±0.22µs        ? B/sec
   lt scalar Float32            1.15     71.1±0.15µs        ? B/sec    1.00     62.0±0.51µs        ? B/sec
   lt_eq Float32                1.00     74.3±0.11µs        ? B/sec    1.66    123.5±0.28µs        ? B/sec
   lt_eq scalar Float32         1.00     61.5±0.56µs        ? B/sec    1.45     88.9±0.22µs        ? B/sec
   multiply 512                 1.00    262.0±0.38ns        ? B/sec    1.32    345.1±4.37ns        ? B/sec
   neq scalar Float32           1.20     78.6±0.81µs        ? B/sec    1.00     65.5±0.65µs        ? B/sec
   or                           1.00   1590.0±2.45ns        ? B/sec    1.11  1760.7±14.78ns        ? B/sec
   subtract 512                 1.43    369.9±5.24ns        ? B/sec    1.00    258.4±0.40ns        ? B/sec
   take i32 1024                1.10   1871.4±3.67ns        ? B/sec    1.00   1695.3±4.85ns        ? B/sec
   take i32 512                 1.00    928.5±1.79ns        ? B/sec    1.12  1043.4±13.97ns        ? B/sec
   take i32 nulls 512           1.10   1074.7±1.81ns        ? B/sec    1.00    975.5±1.60ns        ? B/sec
   ```
   So it's hard to see through the noise, but I think this looks performance neutral. (I don't think take/subtract/multiply/concat/cast/add interact with this API at all, so the differences in those benchmarks should just be noise.)
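
For anyone reading the ratio columns in the table above: each ratio appears to be that run's time divided by the faster of the two runs, so 1.00 marks the faster side. Below is a minimal sketch of that arithmetic. It reflects my reading of the output, not critcmp's actual implementation, and the function names `ratios` and `exceeds_threshold` are made up for illustration.

```rust
/// Relative time of each run versus the faster of the two.
/// The faster run gets 1.0; the slower one gets a value above 1.0.
/// (Hypothetical helper, not part of critcmp.)
fn ratios(a_ns: f64, b_ns: f64) -> (f64, f64) {
    let best = a_ns.min(b_ns);
    (a_ns / best, b_ns / best)
}

/// True when the slower run is at least `threshold_pct` percent slower
/// than the faster one. This is an assumed interpretation of `-t 10`.
fn exceeds_threshold(a_ns: f64, b_ns: f64, threshold_pct: f64) -> bool {
    let (ra, rb) = ratios(a_ns, b_ns);
    (ra.max(rb) - 1.0) * 100.0 >= threshold_pct
}
```

With the `add_nulls_512` row, `ratios(343.7, 264.4)` gives roughly `(1.30, 1.00)`, which matches the table.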
   
   The only other kernel that I think might be able to take advantage of this is 'take'. I believe it copies the 'taken' null values bit by bit, but I haven't looked at whether it would actually benefit.
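
To make the "bit by bit" pattern concrete, here is a rough sketch of gathering validity bits one at a time into a new packed bitmap. It uses only the standard library and is not the arrow crate's actual `take` code. The helper names `get_bit`, `set_bit`, and `take_validity_bitwise` are hypothetical, and the bitmap is assumed to be packed least-significant bit first, with a set bit meaning "valid".

```rust
/// Read bit `i` from a bitmap packed least-significant bit first.
fn get_bit(bits: &[u8], i: usize) -> bool {
    bits[i / 8] & (1u8 << (i % 8)) != 0
}

/// Set bit `i` in a bitmap packed least-significant bit first.
fn set_bit(bits: &mut [u8], i: usize) {
    bits[i / 8] |= 1u8 << (i % 8);
}

/// Hypothetical helper: build the output validity bitmap by looking up
/// one source bit per index. Each output bit costs one read and possibly
/// one write.
fn take_validity_bitwise(validity: &[u8], indices: &[usize]) -> Vec<u8> {
    // Start with every slot marked null (all bits cleared).
    let mut out = vec![0u8; (indices.len() + 7) / 8];
    for (dst, &src) in indices.iter().enumerate() {
        if get_bit(validity, src) {
            set_bit(&mut out, dst);
        }
    }
    out
}
```

Because the indices are arbitrary, it is not obvious that a bulk API helps here, which is why this would need measuring before drawing any conclusion.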



