wgtmac commented on code in PR #36510:
URL: https://github.com/apache/arrow/pull/36510#discussion_r1267003756
##########
cpp/src/parquet/file_reader.h:
##########
@@ -44,7 +44,8 @@ class PARQUET_EXPORT RowGroupReader {
// An implementation of the Contents class is defined in the .cc file
struct Contents {
virtual ~Contents() {}
- virtual std::unique_ptr<PageReader> GetColumnPageReader(int i) = 0;
+ virtual std::unique_ptr<PageReader> GetColumnPageReader(
Review Comment:
Prebuffer() only needs to read dictionary and data pages. It does not need to
read ColumnMetaData and cannot read any custom index pages. Under that
assumption, the current implementation meets its needs. However, if we expose
this logic to external users, it is hard to state clearly what the total byte
range of a column chunk is (even the spec does not provide a reliable way to
determine it). I understand you need that information as a hint for estimating
the read boundary of a column chunk. Could you simply copy the code into your
own business logic? It would be much easier to maintain on your side. WDYT?
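
For illustration, a caller-side helper along these lines might be enough for a read-boundary hint. This is only a rough sketch, not the code in this PR: `ChunkMeta`, `ByteRange`, and `EstimateChunkRange` are hypothetical names, and the struct merely stands in for the few column chunk metadata fields involved. It covers dictionary and data pages only, so it says nothing about ColumnMetaData or index pages.

```cpp
#include <cstdint>

// Hypothetical stand-in for the column chunk metadata fields used below.
struct ChunkMeta {
  int64_t data_page_offset;
  int64_t dictionary_page_offset;  // only meaningful if has_dictionary_page
  bool has_dictionary_page;
  int64_t total_compressed_size;
};

// Hypothetical result type: a byte range within the file.
struct ByteRange {
  int64_t offset;
  int64_t length;
};

// Estimate the byte range spanning the dictionary and data pages of one
// column chunk. Returns {0, 0} if the metadata is obviously inconsistent.
ByteRange EstimateChunkRange(const ChunkMeta& meta, int64_t file_size) {
  int64_t start = meta.data_page_offset;
  // Start at the dictionary page when one exists and precedes the data pages.
  if (meta.has_dictionary_page && meta.dictionary_page_offset > 0 &&
      meta.dictionary_page_offset < start) {
    start = meta.dictionary_page_offset;
  }
  int64_t length = meta.total_compressed_size;
  if (start < 0 || length < 0 || start > file_size) {
    return {0, 0};
  }
  // Clamp to the end of the file; this is a hint, not an exact boundary.
  if (length > file_size - start) {
    length = file_size - start;
  }
  return {start, length};
}
```

Since the result is only an estimate, the caller would still need to tolerate reads that fall short of or extend past the actual pages.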