[GitHub] [arrow] wjones127 commented on a diff in pull request #15184: GH-15185: [C++][Parquet] Improve documentation for Parquet Reader column_indices

GitBox Wed, 04 Jan 2023 12:50:41 -0800


wjones127 commented on code in PR #15184:
URL: https://github.com/apache/arrow/pull/15184#discussion_r1061870977



##########
cpp/src/parquet/arrow/reader.h:
##########
@@ -225,7 +226,19 @@ class PARQUET_EXPORT FileReader {
 
   /// \brief Read the given columns into a Table
   ///
-  /// The indicated column indices are relative to the schema
+  /// The indicated column indices are relative to the internal representation
+  /// of the parquet table. For instance :
+  /// 0 foo.bar
+  ///       foo.bar.baz           0
+  ///       foo.bar.baz2          1
+  ///   foo.qux                   2
+  /// 1 foo2                      3
+  /// 2 foo3                      4
+  ///
+  /// i=0 will read foo.bar.baz, i=1 will read only foo.bar.baz2 and so on.
+  /// To retrive all the indices corresponding to the upper level arrow schema,
+  /// one can use manifest().schema_fields and get all the indices of the leaves
+  /// for each particular upper level field index.

Review Comment:
   What do you think of this?
   
   ```suggestion
     /// Only leaf fields have indices; foo itself doesn't have an index.
     /// To get the index for a particular leaf field, one can use
     /// manifest().schema_fields to get the top level fields, and then walk the
     /// tree to identify the relevant leaf fields and access their column_index.
     /// To get the total number of leaf fields, use FileMetaData::num_columns().
   ```
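
For illustration, the tree walk described in the suggestion could look roughly like the sketch below. It is not Arrow code: `Node` is a hypothetical stand-in that only assumes the shape of `parquet::arrow::SchemaField` (a list of children plus a `column_index` that is -1 for non-leaf fields), and `CollectLeafIndices`, `LeafIndices` and `MakeExampleFields` are made-up helper names. The example tree reproduces the `foo` / `foo2` / `foo3` layout from the doc comment.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a schema field node: a node is a leaf when
// column_index != -1; otherwise it only groups its children.
struct Node {
  std::string name;
  std::vector<Node> children;
  int column_index = -1;
  bool is_leaf() const { return column_index != -1; }
};

// Depth-first walk that appends the column index of every leaf under `node`.
void CollectLeafIndices(const Node& node, std::vector<int>* out) {
  if (node.is_leaf()) {
    assert(node.children.empty());  // leaves are assumed to have no children
    out->push_back(node.column_index);
    return;
  }
  for (const Node& child : node.children) {
    CollectLeafIndices(child, out);
  }
}

// Convenience wrapper: all leaf column indices below one top-level field.
std::vector<int> LeafIndices(const Node& top_level_field) {
  std::vector<int> indices;
  CollectLeafIndices(top_level_field, &indices);
  return indices;
}

// Builds the example layout from the doc comment:
//   foo { bar { baz = 0, baz2 = 1 }, qux = 2 }, foo2 = 3, foo3 = 4
std::vector<Node> MakeExampleFields() {
  Node bar{"bar", {Node{"baz", {}, 0}, Node{"baz2", {}, 1}}, -1};
  Node foo{"foo", {bar, Node{"qux", {}, 2}}, -1};
  return {foo, Node{"foo2", {}, 3}, Node{"foo3", {}, 4}};
}
```

With this layout, top-level field 0 (`foo`) maps to leaf indices 0, 1 and 2, while `foo2` and `foo3` map to 3 and 4. The top-level field index and the leaf column index are different numbering schemes, which is the distinction the doc comment is trying to convey.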



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
