vustef commented on code in PR #7307:
URL: https://github.com/apache/arrow-rs/pull/7307#discussion_r2444641611


##########
parquet/src/arrow/array_reader/builder.rs:
##########


Review Comment:
   Thrilled to see this!
   Please let me know if I can help in any way. I can make it my top priority 
to work on this, as we need to make use of it in the next few weeks.
   Our use-case is to leverage this from iceberg-rust, which uses 
`ParquetRecordBatchStreamBuilder`. The API seems to work for that, but I 
understand from other comments that it may not be the most desirable one - 
happy to help either with research/proposal or with the implementation of the 
chosen option.



##########
parquet/src/file/metadata/mod.rs:
##########
@@ -700,7 +708,11 @@ impl RowGroupMetaData {
     }
 
     /// Method to convert from Thrift.
-    pub fn from_thrift(schema_descr: SchemaDescPtr, mut rg: RowGroup) -> 
Result<RowGroupMetaData> {
+    pub fn from_thrift(
+        schema_descr: SchemaDescPtr,
+        mut rg: RowGroup,
+        first_row_number: Option<i64>,

Review Comment:
   @etseidl seems to have refactored quite a lot of metadata / thrift 
functionality, but I'm wondering, since this function was used directly by 
delta-rs, and is thus considered an API, shouldn't we avoid changing it, and 
rather factor out the code that does the core of conversion, and add another 
entry-point that has `first_row_number`?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to