[GitHub] [arrow-rs] tustvold commented on issue #1955: Support multi diskRanges for ChunkReader

GitBox Tue, 28 Jun 2022 01:36:47 -0700


tustvold commented on issue #1955:
URL: https://github.com/apache/arrow-rs/issues/1955#issuecomment-1168412426


   Why not just call `get_read` for each page instead of for the entire column 
chunk? There is no requirement for `get_read` to delimit column chunks; after 
all, the same trait is also used to read the footer, etc.
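
   As a rough illustration of the per-page idea, here is a minimal sketch. It uses a simplified stand-in for the `ChunkReader` trait rather than the real parquet API (the real trait has more methods and its own error type), and `MemFile`, `PageLocation` and `read_selected_pages` are hypothetical names made up for this example. It assumes the page offsets and lengths are already known, e.g. from a page index.

```rust
use std::io::{Cursor, Error, ErrorKind, Read, Result};

/// Simplified stand-in for the parquet `ChunkReader` trait (assumption:
/// only `get_read` is modelled, and it returns `std::io::Result`).
trait ChunkReader {
    type T: Read;
    fn get_read(&self, start: u64, length: usize) -> Result<Self::T>;
}

/// Hypothetical in-memory "file", used only to make the sketch runnable.
struct MemFile(Vec<u8>);

impl ChunkReader for MemFile {
    type T = Cursor<Vec<u8>>;

    fn get_read(&self, start: u64, length: usize) -> Result<Self::T> {
        let start = start as usize;
        // Reject ranges that run past the end of the buffer.
        let end = start
            .checked_add(length)
            .filter(|end| *end <= self.0.len())
            .ok_or_else(|| Error::new(ErrorKind::UnexpectedEof, "range out of bounds"))?;
        Ok(Cursor::new(self.0[start..end].to_vec()))
    }
}

/// Hypothetical page location: byte offset in the file and length in bytes.
struct PageLocation {
    offset: u64,
    length: usize,
}

/// Issues one `get_read` call per selected page instead of a single call
/// spanning the whole column chunk.
fn read_selected_pages<R: ChunkReader>(
    reader: &R,
    pages: &[PageLocation],
) -> Result<Vec<Vec<u8>>> {
    let mut out = Vec::with_capacity(pages.len());
    for page in pages {
        let mut buf = Vec::with_capacity(page.length);
        reader.get_read(page.offset, page.length)?.read_to_end(&mut buf)?;
        out.push(buf);
    }
    Ok(out)
}
```

   Nothing in the trait signature ties a call to column chunk boundaries, so skipped pages are simply never requested.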
   
   Somewhat related, but something to keep in mind is how this will all work 
with `ParquetRecordBatchStream`. It does not make use of `ChunkReader`, and 
is instead push-based, needing to know the ranges to fetch up-front. It should 
just be a case of making `InMemoryColumnChunk` sparse and teaching 
`InMemoryColumnChunkReader` to read it correctly, but it is probably worth 
thinking through how this will work.
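
   To make "sparse" a bit more concrete, here is one possible shape, sketched under assumptions: `SparseColumnChunk` is a hypothetical type, not the crate's `InMemoryColumnChunk`, and it assumes the fetched ranges do not overlap and that each page lies entirely inside one fetched range.

```rust
use std::ops::Range;

/// Hypothetical sparse buffer: only the byte ranges fetched up-front are
/// present, each stored with its starting offset in the file.
struct SparseColumnChunk {
    /// (file offset, bytes) pairs, sorted by offset; assumed non-overlapping.
    parts: Vec<(u64, Vec<u8>)>,
}

impl SparseColumnChunk {
    fn new(mut parts: Vec<(u64, Vec<u8>)>) -> Self {
        parts.sort_by_key(|(offset, _)| *offset);
        Self { parts }
    }

    /// Returns the bytes for `range` (in file offsets) if they lie entirely
    /// inside one fetched part, and `None` if any of them were not fetched.
    fn get(&self, range: Range<u64>) -> Option<&[u8]> {
        // Number of parts starting at or before `range.start`; the last of
        // those is the only part that can contain the range.
        let idx = self.parts.partition_point(|(offset, _)| *offset <= range.start);
        let (offset, data) = self.parts.get(idx.checked_sub(1)?)?;
        let start = (range.start - offset) as usize;
        let end = range.end.checked_sub(*offset)? as usize;
        data.get(start..end)
    }
}
```

   A reader over such a buffer would then translate each page's file offset into a lookup like `get`, and treat `None` as a page that was deliberately skipped or as an error, depending on the caller.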


