jp0317 commented on code in PR #36510:
URL: https://github.com/apache/arrow/pull/36510#discussion_r1264734613

##########
cpp/src/parquet/file_reader.h:
##########
@@ -44,7 +44,8 @@ class PARQUET_EXPORT RowGroupReader {
   // An implementation of the Contents class is defined in the .cc file
   struct Contents {
     virtual ~Contents() {}
-    virtual std::unique_ptr<PageReader> GetColumnPageReader(int i) = 0;
+    virtual std::unique_ptr<PageReader> GetColumnPageReader(

Review Comment:
   Thanks for the review.

   Regarding `GetColumnChunkRange`, is there any concern about exposing it? IIUC, users can currently rely only on `total_compressed_size`, which reveals no offset information and may not reflect the actual chunk size.

   For the `ColumnReaderProperties`, given that the reader APIs are all index based, maybe we can just use the index (as mapleFU suggested) without involving column paths, and in particular without a map keyed on path strings? Initially I was trying to avoid keeping such a map in `ReaderProperties`. More importantly, I feel it makes sense to implement this customized buffer size as "column chunk specific": different column chunks from the same column can have different buffer sizes.
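
   To make the "column chunk specific" idea concrete, here is a minimal sketch of an index-based call that takes a per-call override instead of looking up a path-keyed map. Every name in it (`ChunkReadOptions`, `FakeRowGroupReader`, `BufferSizeForColumn`, `kUseDefaultBufferSize`) is hypothetical and made up for illustration; none of them is the signature proposed in this PR or part of the existing Parquet C++ API, and the default size used in the example is an arbitrary value.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>

// All names below are hypothetical and exist only for this sketch.

// Sentinel meaning "no per-chunk override was requested".
constexpr std::int64_t kUseDefaultBufferSize = -1;

// Options supplied per call, so they apply to exactly one column chunk.
struct ChunkReadOptions {
  std::int64_t buffer_size = kUseDefaultBufferSize;
};

// Stand-in for a row group reader that only knows the reader-wide default.
class FakeRowGroupReader {
 public:
  FakeRowGroupReader(int num_columns, std::int64_t default_buffer_size)
      : num_columns_(num_columns), default_buffer_size_(default_buffer_size) {
    assert(default_buffer_size_ > 0);
  }

  // Returns the buffer size that would be used to read column chunk `i`.
  // The per-call override wins; otherwise the reader-wide default applies.
  std::int64_t BufferSizeForColumn(
      int i, const ChunkReadOptions& options = ChunkReadOptions()) const {
    if (i < 0 || i >= num_columns_) {
      throw std::out_of_range("column index out of range");
    }
    return options.buffer_size == kUseDefaultBufferSize ? default_buffer_size_
                                                        : options.buffer_size;
  }

 private:
  int num_columns_;
  std::int64_t default_buffer_size_;
};
```

   With two such readers standing in for two row groups, calling `BufferSizeForColumn(0)` on the first and `BufferSizeForColumn(0, options)` on the second with a larger `options.buffer_size` gives column 0 a different buffer size in each row group, and no state keyed on column paths has to be kept in the reader properties.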