Dandandan commented on issue #1652: URL: https://github.com/apache/arrow-datafusion/issues/1652#issuecomment-1323896497
> > But maybe page pruning works in some cases, that would be great.
>
> FWIW we support [late materialization](https://docs.cloudera.com/cdw-runtime/cloud/impala-reference/topics/impala-lazy-materialization.html) now, in addition to row group and page pruning, and it doesn't rely on aggregate statistics. It instead evaluates predicates on the subset of the columns needed by the predicate, and then uses the result to conditionally materialize the other projected columns. This isn't always advantageous, e.g. for predicates that don't eliminate large numbers of rows; we need to develop better heuristics / bailing logic, but where it works the benefits can be huge.

Yes, I realized that, but having looked at some TPC-H generated data and filters, I would be (pleasantly) surprised if it can prune out many pages.
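For readers unfamiliar with the late materialization idea quoted above, here is a minimal sketch of the general technique. It is not DataFusion's or the parquet reader's actual implementation: the `late_materialize` helper, the in-memory `Columns` layout, and the sample values are all hypothetical and chosen only for illustration (the column names are borrowed from the TPC-H `lineitem` table).

```python
from typing import Callable, Dict, List, Sequence

# Hypothetical in-memory stand-in for decoded column data: name -> values.
# A real Parquet reader decodes pages lazily; this only illustrates the flow.
Columns = Dict[str, List[object]]


def late_materialize(
    columns: Columns,
    predicate_cols: Sequence[str],
    predicate: Callable[..., bool],
    projection: Sequence[str],
) -> Columns:
    """Evaluate `predicate` on `predicate_cols` only, then materialize
    the projected columns for just the rows that passed."""
    # Step 1: touch only the columns the predicate needs.
    inputs = [columns[name] for name in predicate_cols]
    # Step 2: build a row selection (indices of the rows that pass).
    selected = [i for i, row in enumerate(zip(*inputs)) if predicate(*row)]
    # Step 3: materialize the projected columns only for the selected rows.
    return {name: [columns[name][i] for i in selected] for name in projection}


# Made-up sample values, not real TPC-H output.
data: Columns = {
    "l_quantity": [17, 36, 8, 28],
    "l_shipdate": ["1996-03-13", "1996-04-12", "1996-01-29", "1996-04-21"],
    "l_comment": ["a", "b", "c", "d"],
}

result = late_materialize(
    data,
    predicate_cols=["l_quantity"],
    predicate=lambda quantity: quantity < 20,
    projection=["l_shipdate", "l_comment"],
)
print(result)
```

In this sketch the saving comes from step 3 touching only the selected rows. If the predicate keeps most rows, step 3 does nearly the same work as a plain scan plus the extra cost of building the selection, which is the case where the quoted comment says better heuristics are needed.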
