haohuaijin commented on issue #10140: URL: https://github.com/apache/arrow-rs/issues/10140#issuecomment-4748672861
Thanks @hhhizzz, I added a targeted benchmark in [e642eb6](https://github.com/apache/arrow-rs/pull/10141/commits/e642eb6823f80c8c400e4ea8cea57190b747c74b): `parquet/benches/row_selector_boolean_buffer.rs`

It uses 3M rows and covers:
- selectivity: 1%, 10%, 33%, 80%
- patterns:
  - `fragmented`: short periodic runs
  - `random`: deterministic random row mask
  - `clustered`: long runs around 8K rows, with deterministic jitter in 4K..12K

### Construction

`from_boolean_buffer` is around 4ns in all cases. `from_filters` scales with the number of runs because it scans the boolean array and materializes selector/RLE state.

| selectivity | pattern | `from_filters` | `from_boolean_buffer` |
|---|---|---:|---:|
| 1% | fragmented | 191.17 us | 3.96 ns |
| 1% | clustered+jitter | 27.76 us | 3.93 ns |
| 1% | random | 311.42 us | 3.93 ns |
| 10% | fragmented | 2.45 ms | 3.96 ns |
| 10% | clustered+jitter | 28.88 us | 3.97 ns |
| 10% | random | 2.52 ms | 3.89 ns |
| 33% | fragmented | 7.33 ms | 3.99 ns |
| 33% | clustered+jitter | 30.73 us | 3.89 ns |
| 33% | random | 5.31 ms | 3.89 ns |
| 80% | fragmented | 5.08 ms | 4.01 ns |
| 80% | clustered+jitter | 30.78 us | 4.02 ns |
| 80% | random | 4.15 ms | 4.00 ns |

### Reader

| selectivity | pattern | `from_filters` | `from_boolean_buffer` | result |
|---|---|---:|---:|---|
| 1% | fragmented | 1.29 ms | 2.91 ms | selectors 2.3x faster |
| 1% | clustered+jitter | 146.48 us | 2.37 ms | selectors 16.2x faster |
| 1% | random | 1.40 ms | 3.10 ms | selectors 2.2x faster |
| 10% | fragmented | 7.89 ms | 3.31 ms | bitmap 2.4x faster |
| 10% | clustered+jitter | 317.79 us | 3.07 ms | selectors 9.7x faster |
| 10% | random | 9.72 ms | 3.91 ms | bitmap 2.5x faster |
| 33% | fragmented | 18.93 ms | 4.11 ms | bitmap 4.6x faster |
| 33% | clustered+jitter | 694.92 us | 4.01 ms | selectors 5.8x faster |
| 33% | random | 16.74 ms | 5.06 ms | bitmap 3.3x faster |
| 80% | fragmented | 16.55 ms | 8.06 ms | bitmap 2.1x faster |
| 80% | clustered+jitter | 1.44 ms | 5.53 ms | selectors 3.8x faster |
| 80% | random | 17.89 ms | 9.95 ms | bitmap 1.8x faster |

<details>
<summary> The original result for the benchmark </summary>

```
row_selector_boolean_buffer/construction/from_filters/fragmented/select01
                        time:   [190.54 µs 191.17 µs 191.88 µs]
                        change: [−2.9658% −2.6073% −2.2105%] (p = 0.00 < 0.05)
                        Performance has improved.
row_selector_boolean_buffer/construction/from_boolean_buffer/fragmented/select01
                        time:   [3.9501 ns 3.9604 ns 3.9713 ns]
                        change: [−1.1354% −0.7429% −0.3207%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild
row_selector_boolean_buffer/construction/from_filters/clustered/select01
                        time:   [27.613 µs 27.761 µs 27.906 µs]
                        change: [−3.2608% −2.8172% −2.3708%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 4 outliers among 100 measurements (4.00%)
  3 (3.00%) high mild
  1 (1.00%) high severe
row_selector_boolean_buffer/construction/from_boolean_buffer/clustered/select01
                        time:   [3.9146 ns 3.9297 ns 3.9459 ns]
                        change: [−1.2690% −0.8815% −0.4872%] (p = 0.00 < 0.05)
                        Change within noise threshold.
row_selector_boolean_buffer/construction/from_filters/random/select01
                        time:   [310.16 µs 311.42 µs 312.67 µs]
                        change: [−3.3229% −2.6121% −1.6825%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high severe
row_selector_boolean_buffer/construction/from_boolean_buffer/random/select01
                        time:   [3.9072 ns 3.9329 ns 3.9707 ns]
                        change: [−1.0385% −0.5240% +0.0919%] (p = 0.10 > 0.05)
                        No change in performance detected.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high severe
row_selector_boolean_buffer/construction/from_filters/fragmented/select10
                        time:   [2.4434 ms 2.4526 ms 2.4631 ms]
                        change: [−3.2737% −2.6016% −1.9520%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 3 outliers among 100 measurements (3.00%)
  1 (1.00%) low mild
  2 (2.00%) high severe
row_selector_boolean_buffer/construction/from_boolean_buffer/fragmented/select10
                        time:   [3.9378 ns 3.9607 ns 3.9877 ns]
                        change: [−2.0970% −1.6122% −1.1208%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high severe
row_selector_boolean_buffer/construction/from_filters/clustered/select10
                        time:   [28.803 µs 28.876 µs 28.943 µs]
                        change: [−3.0172% −2.7290% −2.4039%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild
row_selector_boolean_buffer/construction/from_boolean_buffer/clustered/select10
                        time:   [3.9358 ns 3.9704 ns 4.0063 ns]
                        change: [−1.7244% −1.0824% −0.4112%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 5 outliers among 100 measurements (5.00%)
  5 (5.00%) high mild
row_selector_boolean_buffer/construction/from_filters/random/select10
                        time:   [2.5058 ms 2.5156 ms 2.5258 ms]
                        change: [−4.1974% −3.4730% −2.7921%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 3 outliers among 100 measurements (3.00%)
  3 (3.00%) high mild
row_selector_boolean_buffer/construction/from_boolean_buffer/random/select10
                        time:   [3.8772 ns 3.8889 ns 3.9010 ns]
                        change: [−2.7339% −2.2406% −1.7470%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 4 outliers among 100 measurements (4.00%)
  4 (4.00%) high mild
row_selector_boolean_buffer/construction/from_filters/fragmented/select33
                        time:   [7.2446 ms 7.3308 ms 7.4183 ms]
                        change: [−0.3035% +1.1215% +2.5042%] (p = 0.14 > 0.05)
                        No change in performance detected.
row_selector_boolean_buffer/construction/from_boolean_buffer/fragmented/select33
                        time:   [3.9699 ns 3.9851 ns 4.0015 ns]
                        change: [−1.8229% −1.0262% −0.1737%] (p = 0.01 < 0.05)
                        Change within noise threshold.
Found 2 outliers among 100 measurements (2.00%)
  1 (1.00%) high mild
  1 (1.00%) high severe
row_selector_boolean_buffer/construction/from_filters/clustered/select33
                        time:   [30.631 µs 30.731 µs 30.834 µs]
                        change: [−2.0580% −1.5520% −1.0902%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 3 outliers among 100 measurements (3.00%)
  3 (3.00%) high mild
row_selector_boolean_buffer/construction/from_boolean_buffer/clustered/select33
                        time:   [3.8806 ns 3.8904 ns 3.9008 ns]
                        change: [−0.4908% −0.0420% +0.3914%] (p = 0.86 > 0.05)
                        No change in performance detected.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high severe
row_selector_boolean_buffer/construction/from_filters/random/select33
                        time:   [5.2892 ms 5.3114 ms 5.3341 ms]
                        change: [−1.1525% −0.4341% +0.2174%] (p = 0.23 > 0.05)
                        No change in performance detected.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild
row_selector_boolean_buffer/construction/from_boolean_buffer/random/select33
                        time:   [3.8813 ns 3.8932 ns 3.9066 ns]
                        change: [−3.6729% −2.3905% −1.3280%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 3 outliers among 100 measurements (3.00%)
  3 (3.00%) high mild
row_selector_boolean_buffer/construction/from_filters/fragmented/select80
                        time:   [4.9838 ms 5.0808 ms 5.2282 ms]
                        change: [−1.6439% +0.5402% +3.3851%] (p = 0.74 > 0.05)
                        No change in performance detected.
Found 9 outliers among 100 measurements (9.00%)
  3 (3.00%) high mild
  6 (6.00%) high severe
row_selector_boolean_buffer/construction/from_boolean_buffer/fragmented/select80
                        time:   [3.9881 ns 4.0102 ns 4.0390 ns]
                        change: [−0.4165% +0.1343% +0.6696%] (p = 0.64 > 0.05)
                        No change in performance detected.
Found 3 outliers among 100 measurements (3.00%)
  1 (1.00%) high mild
  2 (2.00%) high severe
row_selector_boolean_buffer/construction/from_filters/clustered/select80
                        time:   [30.575 µs 30.780 µs 31.060 µs]
                        change: [+1.0644% +2.6503% +5.7064%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 10 outliers among 100 measurements (10.00%)
  3 (3.00%) low mild
  3 (3.00%) high mild
  4 (4.00%) high severe
row_selector_boolean_buffer/construction/from_boolean_buffer/clustered/select80
                        time:   [4.0055 ns 4.0219 ns 4.0415 ns]
                        change: [+2.2265% +2.7364% +3.2296%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 2 outliers among 100 measurements (2.00%)
  1 (1.00%) low mild
  1 (1.00%) high severe
row_selector_boolean_buffer/construction/from_filters/random/select80
                        time:   [4.1190 ms 4.1492 ms 4.1807 ms]
                        change: [+1.7691% +2.7282% +3.5787%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 2 outliers among 100 measurements (2.00%)
  2 (2.00%) high mild
row_selector_boolean_buffer/construction/from_boolean_buffer/random/select80
                        time:   [3.9825 ns 4.0047 ns 4.0330 ns]
                        change: [+1.1842% +1.5778% +1.9859%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 2 outliers among 100 measurements (2.00%)
  1 (1.00%) high mild
  1 (1.00%) high severe
row_selector_boolean_buffer/reader/from_filters/fragmented/select01
                        time:   [1.2847 ms 1.2908 ms 1.2981 ms]
                        change: [−19.686% −15.243% −10.674%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 12 outliers among 100 measurements (12.00%)
  10 (10.00%) high mild
  2 (2.00%) high severe
row_selector_boolean_buffer/reader/from_boolean_buffer/fragmented/select01
                        time:   [2.8959 ms 2.9095 ms 2.9249 ms]
                        change: [+0.8037% +1.3091% +1.9010%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 9 outliers among 100 measurements (9.00%)
  2 (2.00%) high mild
  7 (7.00%) high severe
row_selector_boolean_buffer/reader/from_filters/clustered/select01
                        time:   [145.77 µs 146.48 µs 147.49 µs]
                        change: [−0.0438% +0.5181% +1.2044%] (p = 0.13 > 0.05)
                        No change in performance detected.
Found 5 outliers among 100 measurements (5.00%)
  3 (3.00%) high mild
  2 (2.00%) high severe
row_selector_boolean_buffer/reader/from_boolean_buffer/clustered/select01
                        time:   [2.3698 ms 2.3727 ms 2.3759 ms]
                        change: [−0.0678% +0.1666% +0.3712%] (p = 0.15 > 0.05)
                        No change in performance detected.
Found 5 outliers among 100 measurements (5.00%)
  5 (5.00%) high mild
row_selector_boolean_buffer/reader/from_filters/random/select01
                        time:   [1.3941 ms 1.3980 ms 1.4025 ms]
                        change: [−17.258% −13.147% −8.9652%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 3 outliers among 100 measurements (3.00%)
  2 (2.00%) high mild
  1 (1.00%) high severe
row_selector_boolean_buffer/reader/from_boolean_buffer/random/select01
                        time:   [3.0979 ms 3.1027 ms 3.1079 ms]
                        change: [−2.6101% −2.1617% −1.7131%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 5 outliers among 100 measurements (5.00%)
  4 (4.00%) high mild
  1 (1.00%) high severe
row_selector_boolean_buffer/reader/from_filters/fragmented/select10
                        time:   [7.8637 ms 7.8870 ms 7.9131 ms]
                        change: [−10.553% −9.3454% −8.2330%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 6 outliers among 100 measurements (6.00%)
  4 (4.00%) high mild
  2 (2.00%) high severe
row_selector_boolean_buffer/reader/from_boolean_buffer/fragmented/select10
                        time:   [3.2929 ms 3.3083 ms 3.3261 ms]
                        change: [−1.7388% −1.1558% −0.5789%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 11 outliers among 100 measurements (11.00%)
  4 (4.00%) high mild
  7 (7.00%) high severe
row_selector_boolean_buffer/reader/from_filters/clustered/select10
                        time:   [316.49 µs 317.79 µs 319.30 µs]
                        change: [−2.5147% −1.5374% −0.6017%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 5 outliers among 100 measurements (5.00%)
  4 (4.00%) high mild
  1 (1.00%) high severe
row_selector_boolean_buffer/reader/from_boolean_buffer/clustered/select10
                        time:   [3.0691 ms 3.0748 ms 3.0811 ms]
                        change: [−2.0962% −1.8598% −1.6143%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 6 outliers among 100 measurements (6.00%)
  4 (4.00%) high mild
  2 (2.00%) high severe
row_selector_boolean_buffer/reader/from_filters/random/select10
                        time:   [9.6930 ms 9.7172 ms 9.7419 ms]
                        change: [−7.3733% −6.8121% −6.2995%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 2 outliers among 100 measurements (2.00%)
  2 (2.00%) high mild
row_selector_boolean_buffer/reader/from_boolean_buffer/random/select10
                        time:   [3.9018 ms 3.9136 ms 3.9262 ms]
                        change: [−1.9018% −1.5041% −1.1211%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 9 outliers among 100 measurements (9.00%)
  8 (8.00%) high mild
  1 (1.00%) high severe
row_selector_boolean_buffer/reader/from_filters/fragmented/select33
                        time:   [18.840 ms 18.925 ms 19.009 ms]
                        change: [−8.4817% −7.8775% −7.2978%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild
row_selector_boolean_buffer/reader/from_boolean_buffer/fragmented/select33
                        time:   [4.0999 ms 4.1066 ms 4.1135 ms]
                        change: [−3.6129% −3.3282% −3.0523%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild
row_selector_boolean_buffer/reader/from_filters/clustered/select33
                        time:   [693.36 µs 694.92 µs 697.03 µs]
                        change: [−6.0658% −5.2853% −4.5163%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 7 outliers among 100 measurements (7.00%)
  1 (1.00%) low mild
  2 (2.00%) high mild
  4 (4.00%) high severe
row_selector_boolean_buffer/reader/from_boolean_buffer/clustered/select33
                        time:   [4.0031 ms 4.0084 ms 4.0139 ms]
                        change: [−4.0767% −3.5339% −3.0759%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 3 outliers among 100 measurements (3.00%)
  3 (3.00%) high mild
row_selector_boolean_buffer/reader/from_filters/random/select33
                        time:   [16.693 ms 16.737 ms 16.783 ms]
                        change: [−10.415% −10.019% −9.6103%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 4 outliers among 100 measurements (4.00%)
  2 (2.00%) low mild
  1 (1.00%) high mild
  1 (1.00%) high severe
row_selector_boolean_buffer/reader/from_boolean_buffer/random/select33
                        time:   [5.0512 ms 5.0618 ms 5.0761 ms]
                        change: [−3.9491% −3.6185% −3.2660%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 12 outliers among 100 measurements (12.00%)
  1 (1.00%) low mild
  10 (10.00%) high mild
  1 (1.00%) high severe
row_selector_boolean_buffer/reader/from_filters/fragmented/select80
                        time:   [16.493 ms 16.551 ms 16.619 ms]
                        change: [−11.935% −11.511% −11.035%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 2 outliers among 100 measurements (2.00%)
  1 (1.00%) high mild
  1 (1.00%) high severe
row_selector_boolean_buffer/reader/from_boolean_buffer/fragmented/select80
                        time:   [8.0392 ms 8.0578 ms 8.0831 ms]
                        change: [−2.8885% −2.6173% −2.2712%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 4 outliers among 100 measurements (4.00%)
  3 (3.00%) high mild
  1 (1.00%) high severe
row_selector_boolean_buffer/reader/from_filters/clustered/select80
                        time:   [1.4364 ms 1.4387 ms 1.4418 ms]
                        change: [−5.7202% −4.9168% −4.1071%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 10 outliers among 100 measurements (10.00%)
  4 (4.00%) high mild
  6 (6.00%) high severe
row_selector_boolean_buffer/reader/from_boolean_buffer/clustered/select80
                        time:   [5.5214 ms 5.5307 ms 5.5406 ms]
                        change: [−4.6287% −4.2874% −3.9778%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 4 outliers among 100 measurements (4.00%)
  3 (3.00%) high mild
  1 (1.00%) high severe
row_selector_boolean_buffer/reader/from_filters/random/select80
                        time:   [17.856 ms 17.892 ms 17.929 ms]
                        change: [−2.5559% −2.2630% −1.9673%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 2 outliers among 100 measurements (2.00%)
  2 (2.00%) high mild
row_selector_boolean_buffer/reader/from_boolean_buffer/random/select80
                        time:   [9.9296 ms 9.9450 ms 9.9603 ms]
                        change: [−4.4802% −3.9625% −3.5710%] (p = 0.00 < 0.05)
                        Performance has improved.
```
</details>

The results are shape-dependent:
- selectors are better for clustered long-run selections;
- bitmap is better for fragmented/random masks once selector count gets large;
- very low selectivity can still favor selectors because skipping most rows is cheap.
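To make the run-count effect concrete, here is a minimal std-only sketch that builds two masks of similar selectivity and counts their runs. The helpers `count_runs`, `fragmented_mask`, and `clustered_mask` are hypothetical names introduced for illustration; they are not the generators in `row_selector_boolean_buffer.rs`, and they leave out the jitter and the random pattern.

```rust
// Hypothetical stand-ins for two of the benchmark's mask shapes. These are NOT
// the generators used in the actual benchmark; jitter and the random pattern
// are left out to keep the sketch dependency-free.

/// Counts maximal runs of identical values in a row mask.
fn count_runs(mask: &[bool]) -> usize {
    if mask.is_empty() {
        return 0;
    }
    // Every change of value between neighbours starts a new run.
    1 + mask.windows(2).filter(|w| w[0] != w[1]).count()
}

/// Periodic mask: one selected row at the start of every `period` rows.
fn fragmented_mask(len: usize, period: usize) -> Vec<bool> {
    (0..len).map(|i| i % period == 0).collect()
}

/// Blocky mask: rows are grouped into blocks of `run_len`, and every
/// `every`-th block is selected in full.
fn clustered_mask(len: usize, run_len: usize, every: usize) -> Vec<bool> {
    (0..len).map(|i| (i / run_len) % every == 0).collect()
}

fn main() {
    let rows = 3_000_000;
    let fragmented = fragmented_mask(rows, 10);
    let clustered = clustered_mask(rows, 8_192, 10);
    for (name, mask) in [("fragmented", &fragmented), ("clustered", &clustered)] {
        let selected = mask.iter().filter(|&&b| b).count();
        println!("{name}: {selected} rows selected, {} runs", count_runs(mask));
    }
}
```

With these parameters both masks select roughly 10% of the 3M rows, yet the periodic mask has 600,000 runs while the blocky one has 74. That is in line with the gap between the `fragmented` and `clustered+jitter` rows in the construction table, assuming selector construction cost grows with the number of runs.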
