[GitHub] [arrow] LouisClt commented on a diff in pull request #15184: GH-15185: [C++][Parquet] Improve documentation for Parquet Reader column_indices

GitBox Thu, 05 Jan 2023 00:47:49 -0800


LouisClt commented on code in PR #15184:
URL: https://github.com/apache/arrow/pull/15184#discussion_r1062233463



##########
cpp/src/parquet/arrow/reader.h:
##########
@@ -225,7 +226,19 @@ class PARQUET_EXPORT FileReader {
 
   /// \brief Read the given columns into a Table
   ///
-  /// The indicated column indices are relative to the schema
+  /// The indicated column indices are relative to the internal representation
+  /// of the parquet table. For instance :
+  /// 0 foo.bar
+  ///       foo.bar.baz           0
+  ///       foo.bar.baz2          1
+  ///   foo.qux                   2
+  /// 1 foo2                      3
+  /// 2 foo3                      4
+  ///
+  /// i=0 will read foo.bar.baz, i=1 will read only foo.bar.baz2 and so on.
+  /// To retrieve all the indices corresponding to the upper level arrow schema,
+  /// one can use manifest().schema_fields and get all the indices of the leaves
+  /// for each particular upper level field index.

Review Comment:
   Yes, your version seems to be more readable than mine.
   I also updated the comments for the "num_columns()" method.
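
   To make the leaf-index mapping in the doc comment concrete, here is a
   minimal, self-contained sketch. It does not use the Arrow headers:
   `SketchField`, `Leaf`, `Group`, `BuildExampleSchema` and
   `LeafIndicesForTopLevelField` are hypothetical names introduced only for
   illustration. `SketchField` is assumed to mirror the parts of
   `parquet::arrow::SchemaField` (`column_index`, `children`) reachable through
   `manifest().schema_fields`, and the tree is my reading of the example in
   the comment (top-level fields foo, foo2, foo3).

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a schema field: a leaf carries a column index
// >= 0, a nested (group) field carries -1 and a list of children.
struct SketchField {
  std::string name;
  int column_index = -1;
  std::vector<SketchField> children;
};

// Helper (illustration only): build a leaf field with a given column index.
SketchField Leaf(const std::string& name, int column_index) {
  SketchField f;
  f.name = name;
  f.column_index = column_index;
  return f;
}

// Helper (illustration only): build a nested field from its children.
SketchField Group(const std::string& name, std::vector<SketchField> children) {
  SketchField f;
  f.name = name;
  f.children = std::move(children);
  return f;
}

// Append every leaf column index found under `field`, depth-first.
void CollectLeafIndices(const SketchField& field, std::vector<int>* out) {
  if (field.column_index >= 0) {
    out->push_back(field.column_index);
    return;
  }
  for (const SketchField& child : field.children) {
    CollectLeafIndices(child, out);
  }
}

// Leaf column indices belonging to top-level field `i`.
std::vector<int> LeafIndicesForTopLevelField(
    const std::vector<SketchField>& schema_fields, int i) {
  std::vector<int> out;
  CollectLeafIndices(schema_fields.at(i), &out);
  return out;
}

// The example from the doc comment:
//   foo.bar.baz -> 0, foo.bar.baz2 -> 1, foo.qux -> 2, foo2 -> 3, foo3 -> 4
std::vector<SketchField> BuildExampleSchema() {
  SketchField bar = Group("bar", {Leaf("baz", 0), Leaf("baz2", 1)});
  SketchField foo = Group("foo", {bar, Leaf("qux", 2)});
  return {foo, Leaf("foo2", 3), Leaf("foo3", 4)};
}
```

   With this layout, asking for top-level field 0 yields the leaf indices
   0, 1 and 2, which are the values one would then pass as column indices
   to read all of foo.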



