rdblue commented on code in PR #12298:
URL: https://github.com/apache/iceberg/pull/12298#discussion_r2692461709


##########
parquet/src/main/java/org/apache/iceberg/parquet/Parquet.java:
##########
@@ -1442,11 +1498,13 @@ public <D> CloseableIterable<D> build() {
         }
 
         if (batchedReaderFunc != null) {
+          Function<MessageType, VectorizedReader<?>> readBuilder =
+              batchedReaderFunc.withSchema(schema).apply();

Review Comment:
   I don't see the benefit of adding `BatchReaderFunction` and the two 
implementations that aren't exposed (although the interface is public). 
Instead, this could track the `BiFunction` and then pass a new `readerFunc` 
created here:
   
   ```java
             Function<MessageType, VectorizedReader<?>> readerFunc =
                 messageType -> batchedReaderFunc.apply(schema, messageType);
   ```
   
   This is a temporary fix, though. Because of the `Precondition` I pointed out 
above, I think the final solution is to ensure that a valid schema is always 
passed. That means passing the `BiFunction` itself into 
`VectorizedParquetReader`, rather than only a function that takes `MessageType`.
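
   To make the adapter idea concrete, here is a minimal, self-contained sketch 
of binding the first argument of a `BiFunction` to get a `Function`. The class 
name `ReaderFuncSketch` and the helper `bindSchema` are hypothetical, and plain 
`String` values stand in for `Schema`, `MessageType`, and `VectorizedReader<?>`; 
none of this is the actual Iceberg code.

```java
import java.util.function.BiFunction;
import java.util.function.Function;

// Hypothetical sketch: not part of Iceberg. Strings stand in for the real
// Schema, MessageType, and VectorizedReader<?> types.
class ReaderFuncSketch {

  // Fixes the expected schema up front and returns a function that only
  // needs the file schema, mirroring the lambda suggested in the comment.
  static <S, M, R> Function<M, R> bindSchema(BiFunction<S, M, R> batchedReaderFunc, S schema) {
    return messageType -> batchedReaderFunc.apply(schema, messageType);
  }

  public static void main(String[] args) {
    // Assumed stand-in for a reader factory taking (expected schema, file schema).
    BiFunction<String, String, String> batchedReaderFunc =
        (schema, fileSchema) -> schema + "/" + fileSchema;

    Function<String, String> readerFunc = bindSchema(batchedReaderFunc, "expected");

    // The bound function supplies the expected schema on every call.
    System.out.println(readerFunc.apply("file"));
  }
}
```

The longer-term option described above would skip this adapter and hand the 
two-argument function to the reader directly, so the schema is always supplied 
at the call site.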



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

