Ted-Jiang commented on code in PR #2478:
URL: https://github.com/apache/arrow-rs/pull/2478#discussion_r948663929
##########
parquet/src/file/serialized_reader.rs:
##########
@@ -623,26 +627,13 @@ impl<R: ChunkReader> PageReader for SerializedPageReader<R> {
let page_len = front.compressed_page_size as usize;
- // TODO: Add ChunkReader get_bytes to potentially avoid copy
- let mut buffer = Vec::with_capacity(page_len);
- let read = self
- .reader
- .get_read(front.offset as u64, page_len)?
- .read_to_end(&mut buffer)?;
-
- if read != page_len {
- return Err(eof_err!(
- "Expected to read {} bytes of page, read only {}",
- page_len,
- read
- ));
- }
+ let buffer = self.reader.get_bytes(front.offset as u64, page_len)?;
Review Comment:
If a page is small enough, say 1 MB, I guess there is no downside to eagerly
fetching the entire column. Looking forward to the investigation result! 👍
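
For anyone skimming the hunk, here is a minimal std-only sketch of the two read paths being compared. `InMemoryChunk`, `SharedBytes`, and `read_page_copy` are hypothetical names made up for illustration; they are not the parquet crate's `ChunkReader` or its byte type, and the real `get_bytes` may behave differently. The sketch only assumes that the old path copies the page into a fresh `Vec` and checks the length afterwards, while the new path hands back a view of an already-loaded buffer.

```rust
use std::io::{self, Read};
use std::sync::Arc;

/// Cheaply cloneable view into a shared buffer (hypothetical stand-in for a
/// reference-counted byte slice type).
#[derive(Clone, Debug)]
struct SharedBytes {
    data: Arc<Vec<u8>>,
    start: usize,
    len: usize,
}

impl SharedBytes {
    fn as_slice(&self) -> &[u8] {
        &self.data[self.start..self.start + self.len]
    }
}

/// Hypothetical in-memory chunk source; NOT the parquet crate's ChunkReader.
struct InMemoryChunk {
    data: Arc<Vec<u8>>,
}

impl InMemoryChunk {
    fn new(data: Vec<u8>) -> Self {
        Self { data: Arc::new(data) }
    }

    /// Old path: expose a `Read` over at most `length` bytes from `start`.
    fn get_read(&self, start: u64, length: usize) -> io::Result<impl Read + '_> {
        let start = (start as usize).min(self.data.len());
        Ok(Read::take(&self.data[start..], length as u64))
    }

    /// New path: return a view of the range, failing if it runs past the end.
    fn get_bytes(&self, start: u64, length: usize) -> io::Result<SharedBytes> {
        let start = start as usize;
        start
            .checked_add(length)
            .filter(|&end| end <= self.data.len())
            .ok_or_else(|| {
                io::Error::new(io::ErrorKind::UnexpectedEof, "range past end of chunk")
            })?;
        Ok(SharedBytes { data: Arc::clone(&self.data), start, len: length })
    }
}

/// Mirrors the removed lines: copy into a new Vec, then verify the length.
fn read_page_copy(chunk: &InMemoryChunk, offset: u64, page_len: usize) -> io::Result<Vec<u8>> {
    let mut buffer = Vec::with_capacity(page_len);
    let read = chunk.get_read(offset, page_len)?.read_to_end(&mut buffer)?;
    if read != page_len {
        return Err(io::Error::new(
            io::ErrorKind::UnexpectedEof,
            format!("Expected to read {} bytes of page, read only {}", page_len, read),
        ));
    }
    Ok(buffer)
}
```

In this sketch a short read surfaces as an error on both paths, but only the copy path allocates per page; the view path just bumps a reference count on the buffer that was fetched up front.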