wgtmac commented on code in PR #36510:
URL: https://github.com/apache/arrow/pull/36510#discussion_r1264696705


##########
cpp/src/parquet/file_reader.h:
##########
@@ -44,7 +44,8 @@ class PARQUET_EXPORT RowGroupReader {
   // An implementation of the Contents class is defined in the .cc file
   struct Contents {
     virtual ~Contents() {}
-    virtual std::unique_ptr<PageReader> GetColumnPageReader(int i) = 0;
+    virtual std::unique_ptr<PageReader> GetColumnPageReader(

Review Comment:
   Adding a new ColumnReaderProperties parameter here is still not good 
practice. We should get all reader properties (whether file-level or 
column-level) via the `properties_` object. Could you copy the pattern used by 
`WriterProperties` and `ColumnProperties`, and do something similar for 
`ReaderProperties` and `ColumnReaderProperties`? The gap is that 
`ReaderProperties` does not provide a builder pattern and instead implements 
`set_xxx()` functions directly. We can add 
`ReaderProperties::set_buffer_size(const std::string& path, int64_t size)` (and 
the `schema::ColumnPath` variant), and put the size value into the internal 
`std::map<std::string, ColumnReaderProperties>`.
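
   A minimal sketch of the shape suggested above. It uses hypothetical 
stand-in classes (`ReaderPropertiesSketch`, `ColumnReaderPropertiesSketch`) 
rather than the real parquet classes. The fallback to the file-level value when 
a column has no override, and the 16384 default, are assumptions made only for 
illustration:

```cpp
#include <cstdint>
#include <map>
#include <string>

// Hypothetical stand-in for per-column reader settings; not the real class.
struct ColumnReaderPropertiesSketch {
  int64_t buffer_size = 0;
};

// Hypothetical stand-in for the file-level reader properties object.
class ReaderPropertiesSketch {
 public:
  // File-level setter, applied to every column without an override.
  void set_buffer_size(int64_t size) { buffer_size_ = size; }

  // Column-level setter, keyed by the dotted column path.
  void set_buffer_size(const std::string& path, int64_t size) {
    column_properties_[path].buffer_size = size;
  }

  int64_t buffer_size() const { return buffer_size_; }

  // Returns the column override if one was set, else the file-level value
  // (assumed lookup behavior).
  int64_t buffer_size(const std::string& path) const {
    auto it = column_properties_.find(path);
    return it != column_properties_.end() ? it->second.buffer_size
                                          : buffer_size_;
  }

 private:
  int64_t buffer_size_ = 16384;  // illustrative default
  std::map<std::string, ColumnReaderPropertiesSketch> column_properties_;
};
```

   With this shape, `GetColumnPageReader(int i)` could keep its current 
signature and look up the per-column value through `properties_` instead of 
taking an extra parameter.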



##########
cpp/src/parquet/file_reader.h:
##########
@@ -189,6 +190,9 @@ class PARQUET_EXPORT ParquetFileReader {
   ::arrow::Future<> WhenBuffered(const std::vector<int>& row_groups,
                                  const std::vector<int>& column_indices) const;
 
+  /// Return the range of the specified column chunk.
+  ::arrow::io::ReadRange GetColumnChunkRange(int row_group_index, int column_index);

Review Comment:
   I'd rather not expose this function externally.


