kawadakk commented on PR #7144:
URL: https://github.com/apache/arrow-rs/pull/7144#issuecomment-2662529463
@tustvold I tried running `cargo bench --bench concatenate_kernel`, but the
results fluctuate between "regressed" and "improved" due to measurement noise.
```
concat str_dict 1024    time:   [1.6693 µs 1.6702 µs 1.6714 µs]
                        change: [-2.0002% -1.8167% -1.5443%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 6 outliers among 100 measurements (6.00%)
  1 (1.00%) high mild
  5 (5.00%) high severe

concat str_dict_sparse 1024
                        time:   [4.6988 µs 4.7002 µs 4.7017 µs]
                        change: [+1.3933% +1.4438% +1.4888%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 2 outliers among 100 measurements (2.00%)
  1 (1.00%) high mild
  1 (1.00%) high severe

concat str nulls 1024   time:   [3.8924 µs 3.8939 µs 3.8956 µs]
                        change: [-0.4394% -0.3634% -0.2873%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 7 outliers among 100 measurements (7.00%)
  1 (1.00%) low severe
  3 (3.00%) low mild
  1 (1.00%) high mild
  2 (2.00%) high severe
```
In theory, this change can improve performance because
`merge_dictionary_values` no longer calls `DictionaryArray::logical_nulls`,
which could build a new boolean array. On the other hand,
`merge_dictionary_values` will no longer skip over null values, but I think it
would be pathological for a dictionary array to contain many null values.
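
To make the trade-off concrete, here is a minimal std-only sketch. It does not use the arrow-rs types or the actual `merge_dictionary_values` code: `ToyDictionary`, `used_values_skipping_nulls`, and `used_values_keys_only` are hypothetical names, and the toy `logical_nulls` only approximates the idea of combining key nulls with value nulls into a freshly built mask.

```rust
/// Hypothetical stand-in for a dictionary-encoded array (not the arrow-rs type).
struct ToyDictionary {
    /// One key per row; `None` is a null key.
    keys: Vec<Option<usize>>,
    /// Dictionary values; `None` is a null value.
    values: Vec<Option<&'static str>>,
}

impl ToyDictionary {
    /// Builds a new per-row mask: `true` if the row is logically null, i.e.
    /// the key is null or the key points at a null value.
    fn logical_nulls(&self) -> Vec<bool> {
        self.keys
            .iter()
            .map(|key| match key {
                None => true,
                Some(i) => self.values[*i].is_none(),
            })
            .collect()
    }
}

/// Marks referenced values while skipping logically null rows.
/// This needs the extra mask, which costs an allocation proportional to the row count.
fn used_values_skipping_nulls(d: &ToyDictionary) -> Vec<bool> {
    let nulls = d.logical_nulls();
    let mut used = vec![false; d.values.len()];
    for (row, key) in d.keys.iter().enumerate() {
        if !nulls[row] {
            if let Some(i) = key {
                used[*i] = true;
            }
        }
    }
    used
}

/// Marks referenced values by looking at key validity only.
/// No mask is built, but null values that some key points at are kept.
fn used_values_keys_only(d: &ToyDictionary) -> Vec<bool> {
    let mut used = vec![false; d.values.len()];
    for key in d.keys.iter().flatten() {
        used[*key] = true;
    }
    used
}

fn main() {
    let d = ToyDictionary {
        keys: vec![Some(0), None, Some(2), Some(1)],
        values: vec![Some("a"), None, Some("c")],
    };
    // Row 3 points at the null value in slot 1, so only the first variant drops it.
    println!("{:?}", used_values_skipping_nulls(&d)); // [true, false, true]
    println!("{:?}", used_values_keys_only(&d)); // [true, true, true]
}
```

In this toy model the two variants differ only when a non-null key refers to a null dictionary value, which is the case I would expect to be rare in practice.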