Re: [PR] Preserve null dictionary values in `interleave` and `concat` kernels [arrow-rs]

via GitHub Mon, 17 Feb 2025 02:07:03 -0800


kawadakk commented on PR #7144:
URL: https://github.com/apache/arrow-rs/pull/7144#issuecomment-2662529463


   @tustvold I tried running `cargo bench --bench concatenate_kernel`, but the results fluctuate between "regressed" and "improved" due to measurement noise.
   
   ```
   concat str_dict 1024    time:   [1.6693 µs 1.6702 µs 1.6714 µs]
                           change: [-2.0002% -1.8167% -1.5443%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 6 outliers among 100 measurements (6.00%)
     1 (1.00%) high mild
     5 (5.00%) high severe
   
   concat str_dict_sparse 1024
                           time:   [4.6988 µs 4.7002 µs 4.7017 µs]
                           change: [+1.3933% +1.4438% +1.4888%] (p = 0.00 < 0.05)
                           Performance has regressed.
   Found 2 outliers among 100 measurements (2.00%)
     1 (1.00%) high mild
     1 (1.00%) high severe
   
   concat str nulls 1024   time:   [3.8924 µs 3.8939 µs 3.8956 µs]
                           change: [-0.4394% -0.3634% -0.2873%] (p = 0.00 < 0.05)
                           Change within noise threshold.
   Found 7 outliers among 100 measurements (7.00%)
     1 (1.00%) low severe
     3 (3.00%) low mild
     1 (1.00%) high mild
     2 (2.00%) high severe
   ```
   
   In theory, this change can improve performance because `merge_dictionary_values` no longer calls `DictionaryArray::logical_nulls`, which could build a new boolean array. On the other hand, `merge_dictionary_values` no longer skips over null values, so null entries now take part in the merge. I think it would be pathological for a dictionary array to contain many null values, so that extra cost should rarely matter.
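
To make the trade-off concrete, here is a minimal sketch using a toy model rather than the actual arrow-rs types. `ToyDict`, its `logical_nulls` method, and `merge_values` are hypothetical stand-ins written only for illustration; they do not reproduce the real `merge_dictionary_values` implementation. The sketch assumes that a slot is logically null when either its key is null or the dictionary value it points to is null, and it shows (a) that deriving such a mask allocates one entry per key, and (b) how skipping versus keeping null values changes the merged value set.

```rust
use std::collections::HashSet;

/// Toy stand-in for a dictionary-encoded string array (NOT the arrow-rs type).
struct ToyDict {
    /// `None` models a null key.
    keys: Vec<Option<usize>>,
    /// `None` models a null entry in the dictionary values.
    values: Vec<Option<&'static str>>,
}

impl ToyDict {
    /// Builds a fresh mask with one entry per key. A slot is treated as
    /// logically null if its key is null or the referenced value is null.
    /// This allocation is the kind of cost the comment above refers to.
    fn logical_nulls(&self) -> Vec<bool> {
        self.keys
            .iter()
            .map(|k| match k {
                None => true,
                Some(i) => self.values[*i].is_none(),
            })
            .collect()
    }
}

/// Merges the distinct values of several toy dictionaries in first-seen order.
/// With `keep_nulls = false`, null values are skipped; with `keep_nulls = true`,
/// a null value is preserved as an ordinary (deduplicated) entry.
fn merge_values(dicts: &[ToyDict], keep_nulls: bool) -> Vec<Option<&'static str>> {
    let mut seen = HashSet::new();
    let mut merged = Vec::new();
    for dict in dicts {
        for value in &dict.values {
            if value.is_none() && !keep_nulls {
                continue;
            }
            // `insert` returns true only the first time a value is seen.
            if seen.insert(*value) {
                merged.push(*value);
            }
        }
    }
    merged
}
```

In this model, keeping nulls adds at most one extra entry per distinct null to the merged values, which is why the cost only becomes noticeable when dictionaries contain many null values.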
   


