Ted-Jiang commented on code in PR #2335:
URL: https://github.com/apache/arrow-rs/pull/2335#discussion_r939545227
##########
parquet/src/arrow/async_reader.rs:
##########
@@ -478,9 +550,56 @@ struct InMemoryRowGroup {
row_count: usize,
}
+impl InMemoryRowGroup {
+ /// Fetches the necessary column data into memory
+ async fn fetch<T: AsyncFileReader + Send>(
+ &mut self,
+ input: &mut T,
+ metadata: &RowGroupMetaData,
+ projection: &ProjectionMask,
+ _selection: Option<&RowSelection>,
+ ) -> Result<()> {
+ // TODO: Use OffsetIndex and selection to prune pages
Review Comment:
👍 This avoids a large amount of IO in some situations and makes the page index more useful!
I think it will take a lot of testing to decide when to use random skip reads.
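
To make the trade-off concrete, here is a rough sketch of what pruning with the offset index could look like. It does not use the crate's API: `PageLoc` and `ranges_to_fetch` are hypothetical names, the selection is modelled as plain half-open row ranges rather than `RowSelection`, and pages are assumed to be sorted by their first row.

```rust
use std::ops::Range;

/// Hypothetical stand-in for one entry of a column chunk's offset index.
#[derive(Debug, Clone, Copy)]
struct PageLoc {
    /// Byte offset of the page in the file.
    offset: u64,
    /// Length of the page in bytes.
    compressed_size: u64,
    /// Index of the first row of the page within the row group.
    first_row: usize,
}

/// Returns the byte ranges of the pages that hold at least one selected row.
///
/// Assumes `pages` is sorted by `first_row` and that `selected` contains
/// half-open row ranges within `0..row_count`.
fn ranges_to_fetch(
    pages: &[PageLoc],
    row_count: usize,
    selected: &[Range<usize>],
) -> Vec<Range<u64>> {
    let mut out: Vec<Range<u64>> = Vec::new();
    for (i, page) in pages.iter().enumerate() {
        // A page ends where the next one starts, or at the end of the row group.
        let page_end = pages.get(i + 1).map_or(row_count, |next| next.first_row);
        let needed = selected
            .iter()
            .any(|s| !s.is_empty() && s.start < page_end && page.first_row < s.end);
        if !needed {
            continue;
        }
        let bytes = page.offset..page.offset + page.compressed_size;
        match out.last_mut() {
            // Merge with the previous range when the pages are adjacent on disk.
            Some(last) if last.end == bytes.start => last.end = bytes.end,
            _ => out.push(bytes),
        }
    }
    out
}
```

This version only merges pages that are exactly adjacent. Deciding when it is cheaper to also read through a small gap instead of issuing a separate request is the part that would need the testing mentioned above.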