vertexclique edited a comment on pull request #8664: URL: https://github.com/apache/arrow/pull/8664#issuecomment-733276442
Ok, so the crucial operations are improved. I've updated the benchmarks. Because benchmarks with 512 elements fit in most caches, they produce unstable results. Kernels can only get better after this PR is merged and they are rewritten with parallel iterators. Feel free to benchmark this PR.

Things to do after this PR:
* Other kernels can be improved in separate PRs.
* Some code can be removed, e.g. `mask_from_u64`. Such code is not needed, and removing it will improve performance.
* Most of the benchmarks do not operate on larger data, so they are not very reliable; this can also be changed in yet another PR.
* We can start writing parallel code.

@nevi-me @jorgecarleitao @alamb Because bit ops performance has improved, these benchmarks have improved significantly. The rest of the improvements are in the PR description:

```
Running /home/vertexclique/projects/arrow/rust/target/release/deps/comparison_kernels-f8939ec12975f45e
eq Float32              time:   [36.413 us 36.429 us 36.447 us]
                        change: [-95.357% -95.353% -95.349%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild

eq scalar Float32       time:   [33.540 us 33.551 us 33.562 us]
                        change: [-94.698% -94.690% -94.684%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high severe

neq Float32             time:   [36.756 us 36.768 us 36.781 us]
                        change: [-94.049% -94.047% -94.045%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 6 outliers among 100 measurements (6.00%)
  4 (4.00%) high mild
  2 (2.00%) high severe

neq scalar Float32      time:   [30.706 us 30.718 us 30.734 us]
                        change: [-95.092% -95.089% -95.086%] (p = 0.00 < 0.05)
                        Performance has improved.

lt Float32              time:   [36.489 us 36.498 us 36.509 us]
                        change: [-94.514% -94.504% -94.495%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 4 outliers among 100 measurements (4.00%)
  3 (3.00%) high mild
  1 (1.00%) high severe

lt scalar Float32       time:   [30.855 us 30.871 us 30.892 us]
                        change: [-94.996% -94.993% -94.990%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild

lt_eq Float32           time:   [36.478 us 36.492 us 36.508 us]
                        change: [-94.700% -94.697% -94.695%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 2 outliers among 100 measurements (2.00%)
  1 (1.00%) high mild
  1 (1.00%) high severe

lt_eq scalar Float32    time:   [32.641 us 32.653 us 32.668 us]
                        change: [-95.305% -95.299% -95.292%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 2 outliers among 100 measurements (2.00%)
  1 (1.00%) high mild
  1 (1.00%) high severe

gt Float32              time:   [36.647 us 36.658 us 36.672 us]
                        change: [-94.121% -94.119% -94.116%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 3 outliers among 100 measurements (3.00%)
  2 (2.00%) high mild
  1 (1.00%) high severe

gt scalar Float32       time:   [36.541 us 36.562 us 36.583 us]
                        change: [-94.288% -94.280% -94.273%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) low mild

gt_eq Float32           time:   [36.510 us 36.524 us 36.540 us]
                        change: [-95.402% -95.396% -95.390%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild

gt_eq scalar Float32    time:   [32.121 us 32.141 us 32.163 us]
                        change: [-94.877% -94.875% -94.873%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 13 outliers among 100 measurements (13.00%)
  12 (12.00%) low mild
  1 (1.00%) high mild
```

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at: us...@infra.apache.org
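
As a reading aid for the benchmark output above, here is a minimal sketch of how the `change` figures translate into speedup factors, and of how small a 512-element Float32 input is. It assumes `change` is the relative change in time (new / old - 1). The helper names `speedup_from_change` and `input_bytes` are hypothetical and are not part of the PR or of Arrow.

```rust
/// Converts a relative change in time (in percent, negative = faster)
/// into a speedup factor, i.e. old_time / new_time.
/// Assumption: change = (new_time / old_time - 1) * 100.
fn speedup_from_change(change_percent: f64) -> f64 {
    1.0 / (1.0 + change_percent / 100.0)
}

/// Bytes of value data read by a binary comparison over two primitive
/// arrays of `len` elements. Validity and output bitmaps are ignored.
fn input_bytes(len: usize, elem_size: usize) -> usize {
    2 * len * elem_size
}

fn main() {
    // Point estimates copied from the benchmark output above.
    let samples = [("eq Float32", -95.353_f64), ("neq Float32", -94.047_f64)];
    for (name, change) in samples {
        println!("{}: about {:.1}x faster", name, speedup_from_change(change));
    }

    // Two 512-element f32 inputs, as used by the current benchmarks.
    let bytes = input_bytes(512, std::mem::size_of::<f32>());
    println!("512-element Float32 inputs: {} bytes", bytes);
}
```

Under that assumption, a change of about -95% corresponds to a bit over a 20x speedup. The two 512-element Float32 inputs together occupy 4096 bytes, well below the tens of KiB of a typical L1 data cache (an assumption about the hardware), which is consistent with the remark that these benchmarks stay in cache.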