mingmwang commented on issue #2962: URL: https://github.com/apache/arrow-datafusion/issues/2962#issuecomment-1197770582
@alamb Regarding Parquet row group pruning: the current pruning logic covers statistics-based pruning, which is common to any columnar storage format that provides statistics, so it can be reused. The Parquet format also has format-specific pruning, such as dictionary pruning and bloom filter pruning, and neither of those is implemented yet. Maybe those two types of pruning should be part of the parquet arrow project. Also, in the current Parquet reader implementation, I cannot find a method that reads the dictionary page out so that it can be used to construct a Set for filtering.
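To make the dictionary pruning idea concrete, here is a rough sketch of what I have in mind, assuming the dictionary page values could somehow be read out. Everything in it is hypothetical: `DictionaryValues`, `may_contain_eq`, and `prune_row_groups` are illustrative names, not existing APIs in the parquet crate or DataFusion, and it only handles an equality predicate on a string column.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for the distinct values decoded from the dictionary
/// page of one column chunk. This is NOT a type from the parquet crate.
struct DictionaryValues {
    /// Distinct values found in the dictionary page.
    values: HashSet<String>,
    /// True only if every data page in the column chunk is dictionary-encoded.
    /// Writers may fall back to plain encoding when the dictionary grows too
    /// large, in which case the dictionary does not list every value.
    fully_dictionary_encoded: bool,
}

/// Returns true if the row group may contain rows matching `col = literal`.
fn may_contain_eq(dict: &DictionaryValues, literal: &str) -> bool {
    if !dict.fully_dictionary_encoded {
        // The dictionary is incomplete, so keep the row group to stay correct.
        return true;
    }
    dict.values.contains(literal)
}

/// Returns the indices of the row groups that still have to be read.
fn prune_row_groups(row_groups: &[DictionaryValues], literal: &str) -> Vec<usize> {
    row_groups
        .iter()
        .enumerate()
        .filter(|(_, dict)| may_contain_eq(dict, literal))
        .map(|(index, _)| index)
        .collect()
}
```

A row group is skipped only when its column chunk is fully dictionary-encoded and the literal is absent from the dictionary. In every other case it is kept, so the check can never drop matching rows.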
