zhuqi-lucas commented on PR #7461: URL: https://github.com/apache/arrow-rs/pull/7461#issuecomment-2844151896
> Unfortunately, even after adjusting the benchmark on this branch I still don't see major changes in #7428.
>
> I will look more deeply tomorrow
>
> ```shell
> cargo bench --all-features --bench arrow_reader_row_filter -- Utf8ViewNonEmpty
> ```
>
> Main compared to #7428
>
> ```
> arrow_reader_row_filter/Utf8ViewNonEmpty/all_columns/async
>                         time:   [4.1253 ms 4.1553 ms 4.1881 ms]
>                         change: [-4.8097% -3.9190% -3.0939%] (p = 0.00 < 0.05)
>                         Performance has improved.
> Found 19 outliers among 100 measurements (19.00%)
>   14 (14.00%) high mild
>   5 (5.00%) high severe
> arrow_reader_row_filter/Utf8ViewNonEmpty/all_columns/sync
>                         time:   [4.2269 ms 4.2340 ms 4.2419 ms]
>                         change: [-1.5246% -1.1130% -0.7616%] (p = 0.00 < 0.05)
>                         Change within noise threshold.
> Found 5 outliers among 100 measurements (5.00%)
>   3 (3.00%) high mild
>   2 (2.00%) high severe
> arrow_reader_row_filter/Utf8ViewNonEmpty/exclude_filter_column/async
>                         time:   [3.0754 ms 3.0802 ms 3.0857 ms]
>                         change: [-3.2754% -2.7568% -2.2574%] (p = 0.00 < 0.05)
>                         Performance has improved.
> Found 11 outliers among 100 measurements (11.00%)
>   6 (6.00%) high mild
>   5 (5.00%) high severe
> arrow_reader_row_filter/Utf8ViewNonEmpty/exclude_filter_column/sync
>                         time:   [3.0774 ms 3.0839 ms 3.0909 ms]
>                         change: [-1.4528% -1.1133% -0.7921%] (p = 0.00 < 0.05)
>                         Change within noise threshold.
> Found 7 outliers among 100 measurements (7.00%)
>   6 (6.00%) high mild
>   1 (1.00%) high severe
> ```

Thank you @alamb for this work. It looks like we still need to investigate further. Would it be possible to take a 10% sample of hit.parquet and run some benchmarks on the arrow-rs side?
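For reference, here is a rough sketch of how the verdicts in the quoted output follow from the `change` intervals. It assumes a 1% noise threshold, which the output does not state but which is consistent with all four results. `Verdict` and `classify` are hypothetical names for illustration, not Criterion's API, and the `p < 0.05` significance check is left out.

```rust
/// Verdict for a relative change, mirroring the labels in the report above.
#[derive(Debug, PartialEq)]
enum Verdict {
    Improved,
    Regressed,
    WithinNoise,
}

/// Classify a change from the bounds of its confidence interval.
/// `lower` and `upper` are fractions (-0.048 means -4.8%).
/// `noise` is an assumed threshold; 0.01 is consistent with the numbers above.
fn classify(lower: f64, upper: f64, noise: f64) -> Verdict {
    if lower < -noise && upper < -noise {
        // The whole interval lies below -noise: faster than the baseline.
        Verdict::Improved
    } else if lower > noise && upper > noise {
        // The whole interval lies above +noise: slower than the baseline.
        Verdict::Regressed
    } else {
        // The interval overlaps the [-noise, +noise] band.
        Verdict::WithinNoise
    }
}

fn main() {
    let noise = 0.01; // assumption, not taken from the output
    // (benchmark, lower bound, upper bound) copied from the `change` lines.
    let results = [
        ("all_columns/async", -0.048097, -0.030939),
        ("all_columns/sync", -0.015246, -0.007616),
        ("exclude_filter_column/async", -0.032754, -0.022574),
        ("exclude_filter_column/sync", -0.014528, -0.007921),
    ];
    for (name, lower, upper) in results {
        println!("{name}: {:?}", classify(lower, upper, noise));
    }
}
```

Under that assumption, both async cases have intervals entirely below -1% and count as improved, while both sync cases have upper bounds above -1% and stay within noise.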