askoa opened a new pull request, #3705:
URL: https://github.com/apache/arrow-rs/pull/3705
# Which issue does this PR close?
Closes #3701
Part of #3520
# Rationale for this change
See issue description #3701.
# What changes are included in this PR?
Update the `take_run` function to do the following:
1. Compute the physical indices for the given logical indices.
2. Run-encode the physical indices while keeping track of the physical indices that need to be taken.
3. Take values from `run_array.values` using the physical indices from step 2.
4. Build a new run array using the run ends from step 2 and the values from step 3.
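
The four steps above can be sketched roughly as follows. This is a simplified illustration, not the code in this PR: `SimpleRunArray`, its `physical_index` helper, and the free function `take_run` below are hypothetical stand-ins built on plain `Vec`s rather than the arrow-rs `RunArray` types, and the sketch assumes every logical index is in bounds and ignores nulls.

```rust
/// Hypothetical, simplified run-end encoded array (not the arrow-rs type).
/// `run_ends[i]` is the exclusive logical end of run `i`;
/// `values[i]` is the value repeated throughout that run.
#[derive(Debug, PartialEq)]
struct SimpleRunArray {
    run_ends: Vec<usize>,
    values: Vec<i32>,
}

impl SimpleRunArray {
    /// Map a logical index to the physical index of the run that contains it,
    /// using a binary search over the run ends.
    fn physical_index(&self, logical: usize) -> usize {
        self.run_ends.partition_point(|&end| end <= logical)
    }
}

/// Sketch of the four steps, assuming all logical indices are in bounds.
fn take_run(array: &SimpleRunArray, logical_indices: &[usize]) -> SimpleRunArray {
    // Step 1: physical indices for the given logical indices.
    let physical: Vec<usize> = logical_indices
        .iter()
        .map(|&i| array.physical_index(i))
        .collect();

    // Step 2: run-encode the physical indices, remembering which
    // physical index has to be taken for each output run.
    let mut new_run_ends: Vec<usize> = Vec::new();
    let mut take_physical: Vec<usize> = Vec::new();
    for (pos, &p) in physical.iter().enumerate() {
        if take_physical.last() == Some(&p) {
            // Same physical index as the previous element: extend the current run.
            *new_run_ends.last_mut().unwrap() = pos + 1;
        } else {
            // A different physical index starts a new output run.
            new_run_ends.push(pos + 1);
            take_physical.push(p);
        }
    }

    // Step 3: take values using the physical indices collected in step 2.
    let new_values: Vec<i32> = take_physical.iter().map(|&p| array.values[p]).collect();

    // Step 4: build the new run array from the run ends and the taken values.
    SimpleRunArray {
        run_ends: new_run_ends,
        values: new_values,
    }
}
```

For example, with `run_ends = [2, 5, 6]` and `values = [10, 20, 30]`, taking logical indices `[0, 1, 1, 3, 4, 2, 5, 5, 0]` yields physical indices `[0, 0, 0, 1, 1, 1, 2, 2, 0]`, so the sketch returns `run_ends = [3, 6, 8, 9]` and `values = [10, 20, 30, 10]`. Note that in this sketch only repeated physical indices are merged; adjacent runs that happen to hold equal values are kept separate.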
Performance on the `primitive_run_take` benchmark improved by approximately 15%.
<details><summary>Benchmark result</summary>
```
primitive_run_take/(run_array_len:512, physical_array_len:64, take_len:512)
time: [14.263 µs 14.289 µs 14.314 µs]
change: [-16.324% -15.022% -13.604%] (p = 0.00 < 0.05)
Performance has improved.
Found 5 outliers among 100 measurements (5.00%)
1 (1.00%) high mild
4 (4.00%) high severe
primitive_run_take/(run_array_len:512, physical_array_len:128, take_len:512)
time: [14.299 µs 14.320 µs 14.344 µs]
change: [-17.510% -16.083% -14.660%] (p = 0.00 < 0.05)
Performance has improved.
Found 6 outliers among 100 measurements (6.00%)
2 (2.00%) high mild
4 (4.00%) high severe
primitive_run_take/(run_array_len:1024, physical_array_len:256, take_len:512)
time: [14.672 µs 14.689 µs 14.706 µs]
change: [-17.290% -15.772% -14.314%] (p = 0.00 < 0.05)
Performance has improved.
Found 8 outliers among 100 measurements (8.00%)
2 (2.00%) high mild
6 (6.00%) high severe
primitive_run_take/(run_array_len:1024, physical_array_len:256, take_len:1024)
time: [29.943 µs 29.972 µs 30.002 µs]
change: [-18.529% -17.130% -15.819%] (p = 0.00 < 0.05)
Performance has improved.
Found 8 outliers among 100 measurements (8.00%)
1 (1.00%) low mild
7 (7.00%) high severe
primitive_run_take/(run_array_len:2048, physical_array_len:512, take_len:512)
time: [15.864 µs 15.896 µs 15.928 µs]
change: [-15.176% -13.517% -11.892%] (p = 0.00 < 0.05)
Performance has improved.
Found 6 outliers among 100 measurements (6.00%)
1 (1.00%) high mild
5 (5.00%) high severe
primitive_run_take/(run_array_len:2048, physical_array_len:512, take_len:1024)
time: [34.206 µs 34.275 µs 34.347 µs]
change: [-15.092% -13.950% -12.743%] (p = 0.00 < 0.05)
Performance has improved.
Found 6 outliers among 100 measurements (6.00%)
1 (1.00%) high mild
5 (5.00%) high severe
primitive_run_take/(run_array_len:4096, physical_array_len:1024, take_len:512)
time: [18.066 µs 18.094 µs 18.121 µs]
change: [-14.902% -13.842% -12.604%] (p = 0.00 < 0.05)
Performance has improved.
Found 6 outliers among 100 measurements (6.00%)
1 (1.00%) high mild
5 (5.00%) high severe
primitive_run_take/(run_array_len:4096, physical_array_len:1024, take_len:1024)
time: [37.529 µs 37.571 µs 37.612 µs]
change: [-17.016% -15.749% -14.471%] (p = 0.00 < 0.05)
Performance has improved.
Found 7 outliers among 100 measurements (7.00%)
1 (1.00%) low mild
6 (6.00%) high severe
```
</details>
# Are there any user-facing changes?
No
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.