alamb commented on issue #1652: URL: https://github.com/apache/arrow-datafusion/issues/1652#issuecomment-1023608032
DataFusion does implement row group pruning based on statistics (which arrow-rs creates).

arrow-rs creates statistics. In versions after 8.0.0 it will preserve dictionary encoding (meaning the output of a dictionary-encoded column will also be dictionary encoded), which should be a large win for low-cardinality columns.

It does not currently create or use column indexes or bloom filters.

I believe that @tustvold has a plan for implementing more sophisticated predicate pushdown (i.e. a filter on one column could be used to avoid decoding swaths of others), but I am not sure what the timeline on that is.
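
To illustrate what statistics-based row group pruning means, here is a minimal sketch. It is not DataFusion's or arrow-rs's actual code: `ColumnStats`, `Predicate`, `may_match` and `prune_row_groups` are hypothetical names, and it assumes a single integer column with min/max statistics recorded per row group.

```rust
/// Hypothetical min/max statistics for one column in one row group.
#[derive(Debug, Clone, Copy)]
struct ColumnStats {
    min: i64,
    max: i64,
}

/// Hypothetical filter predicate on that single integer column.
#[derive(Debug, Clone, Copy)]
enum Predicate {
    Eq(i64),
    Gt(i64),
    Lt(i64),
}

/// Returns true if the row group might contain a matching row.
/// A false result proves no row can match, so the group can be skipped.
fn may_match(stats: ColumnStats, pred: Predicate) -> bool {
    match pred {
        Predicate::Eq(v) => stats.min <= v && v <= stats.max,
        Predicate::Gt(v) => stats.max > v,
        Predicate::Lt(v) => stats.min < v,
    }
}

/// Returns the indexes of the row groups that still have to be read.
fn prune_row_groups(row_groups: &[ColumnStats], pred: Predicate) -> Vec<usize> {
    row_groups
        .iter()
        .enumerate()
        .filter(|(_, stats)| may_match(**stats, pred))
        .map(|(idx, _)| idx)
        .collect()
}

fn main() {
    let row_groups = [
        ColumnStats { min: 0, max: 99 },
        ColumnStats { min: 100, max: 199 },
        ColumnStats { min: 200, max: 299 },
    ];
    // For `col > 150` the first row group cannot match and is skipped.
    let keep = prune_row_groups(&row_groups, Predicate::Gt(150));
    println!("{:?}", keep); // prints [1, 2]
}
```

Pruning of this kind is conservative: a row group is only skipped when its statistics prove that no row can match, so the filter still has to be evaluated on the rows of the groups that are kept.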