sunchao commented on issue #1191: URL: https://github.com/apache/arrow-rs/issues/1191#issuecomment-1023597525
We could also explore lazy materialization: decode and decompress a Parquet page (or even pay its IO cost) only when the page is actually needed. I think this is especially useful when many columns are selected and only a few highly selective predicates are applied to some of them. In a sense it is similar to page skipping based on the column index, but more powerful. It would also be very nice to implement row-group-level and page-level skipping in arrow-rs, so that engines don't have to duplicate the work.
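
To make the idea concrete, here is a minimal sketch of lazy materialization in plain Rust. It is not the arrow-rs or parquet crate API: `Page`, `Column`, `filter`, and `take` are hypothetical names invented for illustration, and "decoding" is simulated by a counter. The predicate column is decoded in full, while any other column decodes only the pages that contain at least one selected row.

```rust
/// Hypothetical stand-in for an encoded Parquet page covering rows
/// `first_row .. first_row + values.len()`.
struct Page {
    first_row: usize,
    values: Vec<i64>, // imagine these are still compressed and encoded
}

/// Hypothetical column chunk that counts how many pages it had to decode.
struct Column {
    pages: Vec<Page>,
    decoded_pages: usize,
}

impl Column {
    /// Split `values` into fixed-size pages (assumption: pages align on row count).
    fn new(values: &[i64], page_size: usize) -> Self {
        let pages = values
            .chunks(page_size)
            .enumerate()
            .map(|(i, chunk)| Page {
                first_row: i * page_size,
                values: chunk.to_vec(),
            })
            .collect();
        Column { pages, decoded_pages: 0 }
    }

    /// Evaluate a predicate eagerly: every page is decoded, and the indices
    /// of matching rows are returned in ascending order.
    fn filter(&mut self, pred: impl Fn(i64) -> bool) -> Vec<usize> {
        let mut selected = Vec::new();
        for page in &self.pages {
            self.decoded_pages += 1;
            for (i, v) in page.values.iter().enumerate() {
                if pred(*v) {
                    selected.push(page.first_row + i);
                }
            }
        }
        selected
    }

    /// Materialize lazily: `rows` must be sorted ascending. A page is decoded
    /// only if it holds at least one selected row; all others are skipped.
    fn take(&mut self, rows: &[usize]) -> Vec<i64> {
        let mut out = Vec::with_capacity(rows.len());
        let mut next = 0; // index of the next selected row to emit
        for page in &self.pages {
            if next == rows.len() {
                break; // nothing left to materialize
            }
            let end = page.first_row + page.values.len();
            if rows[next] >= end {
                continue; // no selected row falls in this page: skip decoding
            }
            self.decoded_pages += 1;
            while next < rows.len() && rows[next] < end {
                out.push(page.values[rows[next] - page.first_row]);
                next += 1;
            }
        }
        out
    }
}
```

For example, with 1000 rows split into pages of 100 and a predicate that matches only rows 42 and 950, `filter` decodes all 10 pages of the predicate column, but `take` on a second column decodes just 2 pages. A real implementation would also need to handle pages that do not align across columns, which this sketch assumes away.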
