delamarch3 commented on PR #6691: URL: https://github.com/apache/arrow-rs/pull/6691#issuecomment-2466246617
I've run the `filter_kernel` benchmark I added for the run array in https://github.com/apache/arrow-rs/pull/6706 with the different approaches. Here are the results I get:

```rust
for pred in filter_values
    .iter()
    .skip(start as usize)
    .take((end - start) as usize)
{
    count += R::Native::from(pred);
    keep |= pred
}
```

```text
Benchmarking filter run array (kept 1/2): Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 52.1s, or reduce sample count to 10.
filter run array (kept 1/2) time: [542.98 ms 549.50 ms 556.59 ms]
Found 10 outliers among 100 measurements (10.00%)
  7 (7.00%) high mild
  3 (3.00%) high severe
Benchmarking filter run array high selectivity (kept 1023/1024): Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 54.3s, or reduce sample count to 10.
Benchmarking filter run array high selectivity (kept 1023/1024): Collecting 100 samples in estimated 54.256 s (100 iterations
filter run array high selectivity (kept 1023/1024) time: [550.25 ms 555.80 ms 561.74 ms]
Found 4 outliers among 100 measurements (4.00%)
  4 (4.00%) high mild
Benchmarking filter run array low selectivity (kept 1/1024): Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 53.5s, or reduce sample count to 10.
filter run array low selectivity (kept 1/1024) time: [536.14 ms 540.44 ms 545.14 ms]
Found 11 outliers among 100 measurements (11.00%)
  6 (6.00%) high mild
  5 (5.00%) high severe
```

```rust
for _ in start..end {
    if let Some(pred) = preds.next() {
        count += R::Native::from(pred);
        keep |= pred
    }
}
```

```text
filter run array (kept 1/2) time: [598.70 µs 601.93 µs 605.25 µs]
    change: [-99.892% -99.890% -99.889%] (p = 0.00 < 0.05)
    Performance has improved.
Found 3 outliers among 100 measurements (3.00%)
  3 (3.00%) high severe
Benchmarking filter run array high selectivity (kept 1023/1024): Collecting 100 samples in estimated 6.0573 s (15k iterations
filter run array high selectivity (kept 1023/1024) time: [386.55 µs 388.17 µs 389.91 µs]
    change: [-99.931% -99.930% -99.929%] (p = 0.00 < 0.05)
    Performance has improved.
Found 4 outliers among 100 measurements (4.00%)
  2 (2.00%) high mild
  2 (2.00%) high severe
filter run array low selectivity (kept 1/1024) time: [239.93 µs 240.46 µs 241.04 µs]
    change: [-99.956% -99.955% -99.955%] (p = 0.00 < 0.05)
    Performance has improved.
Found 12 outliers among 100 measurements (12.00%)
  6 (6.00%) high mild
  6 (6.00%) high severe
```

These two are similar, but after running them a few times the low selectivity benchmark seems slightly faster with this one:

```rust
end -= end.saturating_sub(filter_values.len() as u64);
for pred in (start..end).map(|i| unsafe { filter_values.value_unchecked(i as usize) }) {
    count += R::Native::from(pred);
    keep |= pred
}
```

```text
filter run array (kept 1/2) time: [581.12 µs 584.01 µs 586.90 µs]
    change: [-2.5195% -1.1178% +0.1036%] (p = 0.11 > 0.05)
    No change in performance detected.
Found 5 outliers among 100 measurements (5.00%)
  3 (3.00%) high mild
  2 (2.00%) high severe
Benchmarking filter run array high selectivity (kept 1023/1024): Collecting 100 samples in estimated 5.5900 s (15k iterations
filter run array high selectivity (kept 1023/1024) time: [359.79 µs 361.40 µs 363.47 µs]
    change: [-7.7904% -5.5816% -3.1503%] (p = 0.00 < 0.05)
    Performance has improved.
Found 14 outliers among 100 measurements (14.00%)
  3 (3.00%) high mild
  11 (11.00%) high severe
filter run array low selectivity (kept 1/1024) time: [209.87 µs 210.45 µs 211.09 µs]
    change: [-13.950% -13.255% -12.616%] (p = 0.00 < 0.05)
    Performance has improved.
Found 6 outliers among 100 measurements (6.00%)
  3 (3.00%) high mild
  3 (3.00%) high severe
```
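
For anyone who wants to compare the three loop shapes outside of arrow-rs, here is a minimal sketch over plain slices. The function names, the `run_ends` slice, and the `(count, keep)` return type are hypothetical and not taken from the PR, and checked slice indexing stands in for `value_unchecked`. It only checks that the three shapes produce the same per-run results. It is not expected to reproduce the timings above, because `std` slice iterators skip in constant time; the large gap measured above would be consistent with an iterator whose `skip` advances element by element, but that is an assumption rather than something the benchmark output shows.

```rust
/// Per-run result: (number of `true` predicates in the run, whether any was `true`).
type RunResult = (usize, bool);

/// Shape 1: restart the iterator and `skip` to the run start for every run.
fn count_with_skip(run_ends: &[usize], preds: &[bool]) -> Vec<RunResult> {
    let mut out = Vec::with_capacity(run_ends.len());
    let mut start = 0;
    for &end in run_ends {
        let (mut count, mut keep) = (0usize, false);
        for &pred in preds.iter().skip(start).take(end - start) {
            count += usize::from(pred);
            keep |= pred;
        }
        out.push((count, keep));
        start = end;
    }
    out
}

/// Shape 2: one iterator shared across all runs, advanced with `next`.
fn count_with_shared_iter(run_ends: &[usize], preds: &[bool]) -> Vec<RunResult> {
    let mut out = Vec::with_capacity(run_ends.len());
    let mut iter = preds.iter();
    let mut start = 0;
    for &end in run_ends {
        let (mut count, mut keep) = (0usize, false);
        for _ in start..end {
            if let Some(&pred) = iter.next() {
                count += usize::from(pred);
                keep |= pred;
            }
        }
        out.push((count, keep));
        start = end;
    }
    out
}

/// Shape 3: clamp the run end to the predicate length, then index directly.
/// (Checked indexing here; the snippet in the comment uses an unchecked read.)
fn count_with_index(run_ends: &[usize], preds: &[bool]) -> Vec<RunResult> {
    let mut out = Vec::with_capacity(run_ends.len());
    let mut start = 0;
    for &end in run_ends {
        let clamped = end.min(preds.len());
        let (mut count, mut keep) = (0usize, false);
        // If `start` is already past `clamped`, the range is empty.
        for i in start..clamped {
            let pred = preds[i];
            count += usize::from(pred);
            keep |= pred;
        }
        out.push((count, keep));
        start = end;
    }
    out
}
```

Calling each function with `run_ends = [2, 5, 6]` and six predicates gives one `(count, keep)` pair per run, which makes it easy to confirm the shapes agree before timing them against a real `BooleanArray`.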
