tustvold commented on code in PR #2435:
URL: https://github.com/apache/arrow-rs/pull/2435#discussion_r944579710
##########
parquet/src/arrow/arrow_reader/mod.rs:
##########
@@ -84,10 +204,14 @@ pub trait ArrowReader {
) -> Result<Self::RecordReader>;
}
+/// Options that control how metadata is read for a parquet file
+///
+/// See [`ArrowReaderBuilder`] for how to configure how the column data
+/// is then read from the file, including projection and filter pushdown
#[derive(Debug, Clone, Default)]
pub struct ArrowReaderOptions {
skip_arrow_metadata: bool,
- selection: Option<RowSelection>,
+ page_index: bool,
Review Comment:
There is a detail worth highlighting here: this forces decoding of the page
index for all row groups, because the row group selection isn't known at the
point the metadata is read. I experimented with APIs that would allow decoding
only the selected row groups, but they were very clunky. Ultimately the index
information should be relatively small and cheap to decode, so I didn't think
it was worth it.
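
To make the trade-off concrete, here is a minimal toy sketch of the shape of the problem. It is not the parquet crate's code: `ReaderOptions`, `RowGroupMeta`, `load_metadata`, and `decode_page_index` are hypothetical stand-ins, and the "100 rows per page" rule is invented purely for illustration. The point is only that the metadata step sees a single boolean and no selection, so it can decode the index for every row group or for none.

```rust
// Toy model only: every type and function here is a hypothetical stand-in,
// not the parquet crate's API.

/// Mirrors the `page_index` flag added in the diff.
#[derive(Debug, Clone, Default)]
struct ReaderOptions {
    page_index: bool,
}

impl ReaderOptions {
    /// Builder-style setter, assumed for illustration.
    fn with_page_index(mut self, page_index: bool) -> Self {
        self.page_index = page_index;
        self
    }
}

/// Hypothetical per-row-group metadata; `page_index` stays `None` unless decoded.
#[derive(Debug)]
struct RowGroupMeta {
    num_rows: usize,
    /// Stand-in for a decoded index: the number of rows in each page.
    page_index: Option<Vec<usize>>,
}

/// Metadata is loaded before any row group selection exists, so the only
/// choice available here is all row groups or none.
fn load_metadata(row_group_rows: &[usize], options: &ReaderOptions) -> Vec<RowGroupMeta> {
    row_group_rows
        .iter()
        .map(|&num_rows| RowGroupMeta {
            num_rows,
            page_index: options.page_index.then(|| decode_page_index(num_rows)),
        })
        .collect()
}

/// Stand-in decoder: pretends each page holds at most 100 rows.
fn decode_page_index(num_rows: usize) -> Vec<usize> {
    let mut pages = Vec::new();
    let mut remaining = num_rows;
    while remaining > 0 {
        let rows_in_page = remaining.min(100);
        pages.push(rows_in_page);
        remaining -= rows_in_page;
    }
    pages
}
```

In this sketch, calling `load_metadata(&[250, 100, 30], &ReaderOptions::default().with_page_index(true))` decodes an index for all three row groups, even if a later selection only touches one of them. Avoiding that would mean threading the selection into the metadata step, which is the kind of API I found clunky.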