pitrou commented on issue #43057:
URL: https://github.com/apache/arrow/issues/43057#issuecomment-2497980432

   > It looks like the `PageReader` API doesn't provide a way to parallelise 
reading though, you can only iterate over data pages sequentially, so I don't 
think this is a concern:
   
   It may not be obvious how to use it with the other Parquet C++ APIs, but the 
OffsetIndex conceptually allows direct access to individual pages. So, ideally 
at least, and hopefully in the future, it will be possible to access the 
individual data pages of a column non-sequentially. (cc @wgtmac @mapleFU)
   
https://github.com/apache/arrow/blob/71389f845ef5f2e71dfa566f0ab4bb2988f88a8f/cpp/src/parquet/page_index.h#L119-L132
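
To illustrate what "direct access" could mean here, below is a minimal sketch that does not use the Arrow classes. `PageLocationSketch` and `FindPageForRow` are hypothetical names. The three fields mirror the page location entries described by the Parquet format (file offset, compressed size, first row index). Given such a list, the page holding a given row can be found with a binary search, and its byte range can then be read on its own:

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <iterator>
#include <vector>

// Hypothetical stand-in for one OffsetIndex entry (not the Arrow class).
struct PageLocationSketch {
  int64_t offset;                // file offset of the page
  int32_t compressed_page_size;  // bytes to read, page header included
  int64_t first_row_index;       // first row of the page within the row group
};

// Returns the index of the page that would contain `row`, or -1 if there is
// none. Assumes `pages` is sorted by first_row_index. The caller still has to
// check `row` against the row group's row count, which is not stored here.
inline std::ptrdiff_t FindPageForRow(
    const std::vector<PageLocationSketch>& pages, int64_t row) {
  // First page whose first_row_index is strictly greater than `row`.
  auto it = std::upper_bound(
      pages.begin(), pages.end(), row,
      [](int64_t r, const PageLocationSketch& p) {
        return r < p.first_row_index;
      });
  if (it == pages.begin()) return -1;  // empty list, or row before first page
  return std::distance(pages.begin(), it) - 1;
}
```

Once the index is known, `pages[i].offset` and `pages[i].compressed_page_size` give the byte range to fetch, so several pages could in principle be read independently of one another.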
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.