vertexclique edited a comment on pull request #8664:
URL: https://github.com/apache/arrow/pull/8664#issuecomment-733276442
OK, the crucial operations are improved and I've updated the benchmarks.
Benchmarks with 512 elements fit in most caches, which makes their results
unstable. Kernels can only get better once this PR is merged and they are
rewritten with parallel iterators. Feel free to benchmark this PR.
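To make the cache remark concrete, here is a rough back-of-the-envelope sketch of the working set of a binary Float32 comparison. It is an illustration only, not code from this PR: the names `working_set_bytes` and `fits_in_cache` are hypothetical, validity bitmaps are ignored, and the 32 KiB L1 data cache used in the note below is an assumed typical size, not a measured one.

```rust
use std::mem::size_of;

/// Approximate bytes touched by one binary comparison over `len` Float32
/// values: two input arrays plus a bit-packed boolean output.
/// Assumption: validity (null) bitmaps are ignored.
pub fn working_set_bytes(len: usize) -> usize {
    let inputs = 2 * len * size_of::<f32>();
    let output_bitmap = (len + 7) / 8; // one bit per result, rounded up to bytes
    inputs + output_bitmap
}

/// True when the whole working set fits in a cache of `cache_bytes`.
pub fn fits_in_cache(len: usize, cache_bytes: usize) -> bool {
    working_set_bytes(len) <= cache_bytes
}
```

With 512 elements the working set is about 4 KiB, so `fits_in_cache(512, 32 * 1024)` holds, while an input of `1 << 20` elements would not fit under the same assumption.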
Things to do after this PR:
* Other kernels can be improved in separate PRs.
* Some code can be removed, e.g. `mask_from_u64`. It is not needed, and
removing it will improve performance.
* Most of the operations do not run on larger data, so the benchmarks
are not very reliable; this can also be changed in another PR.
* We can start writing parallel code.
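As a rough idea of what a bit-packed comparison looks like, and why it lends itself to parallel code, here is a minimal standalone sketch. It is not the implementation in this PR and does not use Arrow's buffer types; `eq_packed` is a hypothetical name, and null handling is left out.

```rust
/// Compare two equal-length f32 slices element-wise and pack the results
/// into u64 words, least-significant bit first (bit i of word w is the
/// result for element w * 64 + i).
pub fn eq_packed(left: &[f32], right: &[f32]) -> Vec<u64> {
    assert_eq!(left.len(), right.len(), "inputs must have equal length");
    left.chunks(64)
        .zip(right.chunks(64))
        .map(|(l, r)| {
            // Each 64-lane chunk produces exactly one output word.
            l.iter()
                .zip(r.iter())
                .enumerate()
                .fold(0u64, |word, (i, (a, b))| word | (((a == b) as u64) << i))
        })
        .collect()
}
```

Each chunk is independent of the others, so the outer `chunks` loop is the natural place to switch to a parallel iterator later (for example rayon's `par_chunks`, assuming that crate is used).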
@nevi-me @jorgecarleitao @alamb
Because the performance of bit operations has improved, these benchmarks have
improved significantly. The rest of the improvements are in the PR description:
```
Running /home/vertexclique/projects/arrow/rust/target/release/deps/comparison_kernels-f8939ec12975f45e
eq Float32 time: [36.413 us 36.429 us 36.447 us]
change: [-95.357% -95.353% -95.349%] (p = 0.00 < 0.05)
Performance has improved.
Found 1 outliers among 100 measurements (1.00%)
1 (1.00%) high mild
eq scalar Float32 time: [33.540 us 33.551 us 33.562 us]
change: [-94.698% -94.690% -94.684%] (p = 0.00 < 0.05)
Performance has improved.
Found 1 outliers among 100 measurements (1.00%)
1 (1.00%) high severe
neq Float32 time: [36.756 us 36.768 us 36.781 us]
change: [-94.049% -94.047% -94.045%] (p = 0.00 < 0.05)
Performance has improved.
Found 6 outliers among 100 measurements (6.00%)
4 (4.00%) high mild
2 (2.00%) high severe
neq scalar Float32 time: [30.706 us 30.718 us 30.734 us]
change: [-95.092% -95.089% -95.086%] (p = 0.00 < 0.05)
Performance has improved.
lt Float32 time: [36.489 us 36.498 us 36.509 us]
change: [-94.514% -94.504% -94.495%] (p = 0.00 < 0.05)
Performance has improved.
Found 4 outliers among 100 measurements (4.00%)
3 (3.00%) high mild
1 (1.00%) high severe
lt scalar Float32 time: [30.855 us 30.871 us 30.892 us]
change: [-94.996% -94.993% -94.990%] (p = 0.00 < 0.05)
Performance has improved.
Found 1 outliers among 100 measurements (1.00%)
1 (1.00%) high mild
lt_eq Float32 time: [36.478 us 36.492 us 36.508 us]
change: [-94.700% -94.697% -94.695%] (p = 0.00 < 0.05)
Performance has improved.
Found 2 outliers among 100 measurements (2.00%)
1 (1.00%) high mild
1 (1.00%) high severe
lt_eq scalar Float32 time: [32.641 us 32.653 us 32.668 us]
change: [-95.305% -95.299% -95.292%] (p = 0.00 < 0.05)
Performance has improved.
Found 2 outliers among 100 measurements (2.00%)
1 (1.00%) high mild
1 (1.00%) high severe
gt Float32 time: [36.647 us 36.658 us 36.672 us]
change: [-94.121% -94.119% -94.116%] (p = 0.00 < 0.05)
Performance has improved.
Found 3 outliers among 100 measurements (3.00%)
2 (2.00%) high mild
1 (1.00%) high severe
gt scalar Float32 time: [36.541 us 36.562 us 36.583 us]
change: [-94.288% -94.280% -94.273%] (p = 0.00 < 0.05)
Performance has improved.
Found 1 outliers among 100 measurements (1.00%)
1 (1.00%) low mild
gt_eq Float32 time: [36.510 us 36.524 us 36.540 us]
change: [-95.402% -95.396% -95.390%] (p = 0.00 < 0.05)
Performance has improved.
Found 1 outliers among 100 measurements (1.00%)
1 (1.00%) high mild
gt_eq scalar Float32 time: [32.121 us 32.141 us 32.163 us]
change: [-94.877% -94.875% -94.873%] (p = 0.00 < 0.05)
Performance has improved.
Found 13 outliers among 100 measurements (13.00%)
12 (12.00%) low mild
1 (1.00%) high mild
```