alamb commented on issue #10140: URL: https://github.com/apache/arrow-rs/issues/10140#issuecomment-4783367298
> However, after checking the current DataFusion paths, I think this is hard to exercise with the existing TPC-DS / ClickBench SQL benchmarks. Those benchmarks can exercise Parquet row filtering, but they do not appear to construct a bitmap-backed `RowSelection`. The row-filter path still goes through `RowSelection::from_filters`, and page/access-plan selections generally use `RowSelection::from(Vec<RowSelector>)`. So I was not able to make TPC-DS or ClickBench naturally hit the core `RowSelection::from_boolean_buffer` / `RowSelectionInner::Mask -> new_mask_from_buffer` path.

I think this PR could help improve performance with filter pushdown enabled (specifically, by avoiding having to re-create row selections): https://github.com/apache/arrow-rs/issues/8844
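
To make the "re-create row selections" cost concrete, here is a minimal, self-contained sketch of converting between a per-row boolean mask and a run-length selection. The `Run` type and both functions are hypothetical stand-ins written for illustration; they are not the `parquet` crate's `RowSelector` / `RowSelection` types, and they do not reflect the actual implementation in the PR.

```rust
/// Hypothetical run-length selector: `row_count` consecutive rows that are
/// either all skipped or all selected. Illustration only, not the parquet
/// crate's `RowSelector`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Run {
    pub row_count: usize,
    pub skip: bool,
}

/// Convert a per-row boolean mask (true = row selected) into run-length form.
/// This walks every row, which is the kind of work a mask-backed selection
/// could avoid when the filter result is already a bitmap.
pub fn runs_from_mask(mask: &[bool]) -> Vec<Run> {
    let mut runs: Vec<Run> = Vec::new();
    for &selected in mask {
        let skip = !selected;
        // Extend the current run if this row has the same skip/select state.
        if let Some(last) = runs.last_mut() {
            if last.skip == skip {
                last.row_count += 1;
                continue;
            }
        }
        // Otherwise start a new run of length one.
        runs.push(Run { row_count: 1, skip });
    }
    runs
}

/// Expand run-length form back into a per-row boolean mask.
pub fn mask_from_runs(runs: &[Run]) -> Vec<bool> {
    let total: usize = runs.iter().map(|r| r.row_count).sum();
    let mut mask = Vec::with_capacity(total);
    for run in runs {
        mask.extend(std::iter::repeat(!run.skip).take(run.row_count));
    }
    mask
}
```

A filter that alternates between selected and skipped rows produces one run per row in this representation, so repeatedly converting between the two forms is where a bitmap-backed selection would plausibly save work.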
