Re: [PR] Core: Interface based DataFile reader and writer API - PoC [iceberg]

via GitHub Wed, 14 Jan 2026 15:41:22 -0800


rdblue commented on code in PR #12298:
URL: https://github.com/apache/iceberg/pull/12298#discussion_r2692451782



##########
parquet/src/main/java/org/apache/iceberg/parquet/Parquet.java:
##########
@@ -1241,6 +1240,50 @@ public ReaderFunction withSchema(Schema expectedSchema) {
       }
     }
 
+    public interface BatchReaderFunction {
+      Function<MessageType, VectorizedReader<?>> apply();
+
+      default BatchReaderFunction withSchema(Schema schema) {
+        return this;
+      }
+    }
+
+    private static class UnaryBatchReaderFunction implements BatchReaderFunction {
+      private final Function<MessageType, VectorizedReader<?>> readerFunc;
+
+      UnaryBatchReaderFunction(Function<MessageType, VectorizedReader<?>> readerFunc) {
+        this.readerFunc = readerFunc;
+      }
+
+      @Override
+      public Function<MessageType, VectorizedReader<?>> apply() {
+        return readerFunc;
+      }
+    }
+
+    private static class BinaryBatchReaderFunction implements BatchReaderFunction {
+      private final BiFunction<Schema, MessageType, VectorizedReader<?>> readerFuncWithSchema;
+      private Schema schema;
+
+      BinaryBatchReaderFunction(
+          BiFunction<Schema, MessageType, VectorizedReader<?>> readerFuncWithSchema) {
+        this.readerFuncWithSchema = readerFuncWithSchema;
+      }
+
+      @Override
+      public Function<MessageType, VectorizedReader<?>> apply() {
+        Preconditions.checkArgument(
+            schema != null, "Schema must be set for 2-argument reader function");

Review Comment:
   Ah, I see that this is actually just copying what was already there for 
row-based reads. I think that's fine, but we should probably avoid copying the 
practice in the batch read path.
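
   For illustration only: assuming "the practice" refers to the mutable
   schema that is set through withSchema and only validated when apply() is
   called, one stateless alternative might look like the sketch below. The
   class BatchReaderSketch, the factory names unary and binary, and the
   stand-in Schema, MessageType, and VectorizedReader types are all
   hypothetical, used so the example compiles without Iceberg or Parquet on
   the classpath. This is not the API proposed in the PR.

```java
import java.util.Objects;
import java.util.function.BiFunction;
import java.util.function.Function;

final class BatchReaderSketch {
  // Hypothetical stand-ins for Iceberg's Schema, Parquet's MessageType,
  // and VectorizedReader<?>; they only carry a label for demonstration.
  static final class Schema {
    final String name;

    Schema(String name) {
      this.name = name;
    }
  }

  static final class MessageType {
    final String name;

    MessageType(String name) {
      this.name = name;
    }
  }

  static final class VectorizedReader {
    final String description;

    VectorizedReader(String description) {
      this.description = description;
    }
  }

  // Assumption: the expected schema is passed on every call, so no
  // implementation needs a mutable field that must be set before apply().
  interface BatchReaderFunction {
    VectorizedReader apply(Schema expectedSchema, MessageType fileSchema);

    // Adapts a 1-argument reader function; the expected schema is ignored.
    static BatchReaderFunction unary(Function<MessageType, VectorizedReader> readerFunc) {
      Objects.requireNonNull(readerFunc, "readerFunc");
      return (expectedSchema, fileSchema) -> readerFunc.apply(fileSchema);
    }

    // Adapts a 2-argument reader function; the schema is checked at the call site.
    static BatchReaderFunction binary(
        BiFunction<Schema, MessageType, VectorizedReader> readerFunc) {
      Objects.requireNonNull(readerFunc, "readerFunc");
      return (expectedSchema, fileSchema) -> {
        Objects.requireNonNull(
            expectedSchema, "Schema must be set for 2-argument reader function");
        return readerFunc.apply(expectedSchema, fileSchema);
      };
    }
  }

  public static void main(String[] args) {
    BatchReaderFunction func =
        BatchReaderFunction.binary(
            (schema, file) -> new VectorizedReader(schema.name + "/" + file.name));
    VectorizedReader reader = func.apply(new Schema("expected"), new MessageType("file"));
    System.out.println(reader.description);
  }
}
```

   In this sketch both adapters are immutable, so a function instance could be
   shared across builders without one caller's schema leaking into another.
   The trade-off is that every caller has to supply the schema at apply time.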



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

