jhorstmann opened a new pull request #8663:
URL: https://github.com/apache/arrow/pull/8663
This refactors the calculation of null counts by using the bitchunk iterator
and `count_ones` intrinsic. The performance of array creation improves by
between 3%-5%. The biggest impact is on getting a slice of an existing array,
for a slice length of 2048 the performance in a microbenchmark doubles.
Benchmark results on a Ryzen 3700U. LLVM seems to be able to vectorize the
`count_ones` intrinsic, so performance on machines with better AVX units should
be even higher.
```
Running
/home/jhorstmann/Source/github/apache/arrow/rust/target/release/deps/array_slice-d438c5aeed9bef19
Gnuplot not found, using plotters backend
array_slice 128 time: [150.68 ns 151.74 ns 153.14 ns]
change: [-11.525% -10.357% -9.3284%] (p = 0.00 <
0.05)
Performance has improved.
Found 8 outliers among 100 measurements (8.00%)
3 (3.00%) high mild
5 (5.00%) high severe
array_slice 512 time: [158.77 ns 161.03 ns 163.62 ns]
change: [-25.922% -24.956% -23.945%] (p = 0.00 <
0.05)
Performance has improved.
Found 13 outliers among 100 measurements (13.00%)
5 (5.00%) high mild
8 (8.00%) high severe
array_slice 2048 time: [170.17 ns 171.24 ns 172.60 ns]
change: [-50.388% -49.865% -49.375%] (p = 0.00 <
0.05)
Performance has improved.
Found 6 outliers among 100 measurements (6.00%)
2 (2.00%) high mild
4 (4.00%) high severe
```
```
Running
/home/jhorstmann/Source/github/apache/arrow/rust/target/release/deps/array_from_vec-8a972c208e7a6334
Gnuplot not found, using plotters backend
array_from_vec 128 time: [750.62 ns 751.69 ns 752.85 ns]
change: [-8.2512% -7.3658% -6.5917%] (p = 0.00 <
0.05)
Performance has improved.
Found 6 outliers among 100 measurements (6.00%)
3 (3.00%) high mild
3 (3.00%) high severe
array_from_vec 256 time: [1.2501 us 1.2580 us 1.2676 us]
change: [-0.9517% +0.1364% +1.1850%] (p = 0.81 >
0.05)
No change in performance detected.
Found 11 outliers among 100 measurements (11.00%)
7 (7.00%) high mild
4 (4.00%) high severe
array_from_vec 512 time: [2.1603 us 2.1643 us 2.1690 us]
change: [-2.8122% -2.1984% -1.6379%] (p = 0.00 <
0.05)
Performance has improved.
Found 13 outliers among 100 measurements (13.00%)
10 (10.00%) high mild
3 (3.00%) high severe
array_string_from_vec 128
time: [3.2196 us 3.2288 us 3.2395 us]
change: [+1.4107% +1.9839% +2.5419%] (p = 0.00 <
0.05)
Performance has regressed.
Found 12 outliers among 100 measurements (12.00%)
7 (7.00%) high mild
5 (5.00%) high severe
array_string_from_vec 256
time: [4.8112 us 4.8352 us 4.8685 us]
change: [-5.2564% -3.8312% -2.7672%] (p = 0.00 <
0.05)
Performance has improved.
Found 14 outliers among 100 measurements (14.00%)
4 (4.00%) high mild
10 (10.00%) high severe
array_string_from_vec 512
time: [8.2588 us 8.2802 us 8.3049 us]
change: [-0.5974% -0.2991% +0.0027%] (p = 0.05 >
0.05)
No change in performance detected.
Found 7 outliers among 100 measurements (7.00%)
3 (3.00%) high mild
4 (4.00%) high severe
struct_array_from_vec 128
time: [4.5771 us 4.5947 us 4.6206 us]
change: [-6.6706% -5.3873% -4.2340%] (p = 0.00 <
0.05)
Performance has improved.
Found 15 outliers among 100 measurements (15.00%)
5 (5.00%) high mild
10 (10.00%) high severe
struct_array_from_vec 256
time: [6.7617 us 6.7887 us 6.8182 us]
change: [-4.9262% -4.2419% -3.6528%] (p = 0.00 <
0.05)
Performance has improved.
Found 2 outliers among 100 measurements (2.00%)
2 (2.00%) high mild
struct_array_from_vec 512
time: [9.8926 us 9.9639 us 10.052 us]
change: [-5.4057% -3.9331% -2.5242%] (p = 0.00 <
0.05)
Performance has improved.
Found 7 outliers among 100 measurements (7.00%)
2 (2.00%) high mild
5 (5.00%) high severe
struct_array_from_vec 1024
time: [15.812 us 15.862 us 15.922 us]
change: [-6.1966% -5.6634% -5.0338%] (p = 0.00 <
0.05)
Performance has improved.
Found 11 outliers among 100 measurements (11.00%)
3 (3.00%) high mild
8 (8.00%) high severe
```
```
Running
/home/jhorstmann/Source/github/apache/arrow/rust/target/release/deps/builder-26beb41d8ad91167
Gnuplot not found, using plotters backend
bench_primitive time: [4.8277 ms 4.8879 ms 4.9558 ms]
thrpt: [807.14 MiB/s 818.35 MiB/s 828.55 MiB/s]
change:
time: [-5.3415% -3.4401% -1.3973%] (p = 0.00 <
0.05)
thrpt: [+1.4171% +3.5626% +5.6429%]
Performance has improved.
Found 7 outliers among 100 measurements (7.00%)
4 (4.00%) high mild
3 (3.00%) high severe
bench_bool time: [2.5176 ms 2.5230 ms 2.5292 ms]
thrpt: [197.69 MiB/s 198.18 MiB/s 198.60 MiB/s]
change:
time: [-19.412% -19.203% -18.988%] (p = 0.00 <
0.05)
thrpt: [+23.438% +23.767% +24.088%]
Performance has improved.
Found 8 outliers among 100 measurements (8.00%)
3 (3.00%) high mild
5 (5.00%) high severe
```
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]