rdblue commented on code in PR #12298:
URL: https://github.com/apache/iceberg/pull/12298#discussion_r2692451782
##########
parquet/src/main/java/org/apache/iceberg/parquet/Parquet.java:
##########
@@ -1241,6 +1240,50 @@ public ReaderFunction withSchema(Schema expectedSchema) {
}
}
+  public interface BatchReaderFunction {
+    Function<MessageType, VectorizedReader<?>> apply();
+
+    default BatchReaderFunction withSchema(Schema schema) {
+      return this;
+    }
+  }
+
+  private static class UnaryBatchReaderFunction implements BatchReaderFunction {
+    private final Function<MessageType, VectorizedReader<?>> readerFunc;
+
+    UnaryBatchReaderFunction(Function<MessageType, VectorizedReader<?>> readerFunc) {
+      this.readerFunc = readerFunc;
+    }
+
+    @Override
+    public Function<MessageType, VectorizedReader<?>> apply() {
+      return readerFunc;
+    }
+  }
+
+  private static class BinaryBatchReaderFunction implements BatchReaderFunction {
+    private final BiFunction<Schema, MessageType, VectorizedReader<?>> readerFuncWithSchema;
+    private Schema schema;
+
+    BinaryBatchReaderFunction(
+        BiFunction<Schema, MessageType, VectorizedReader<?>> readerFuncWithSchema) {
+      this.readerFuncWithSchema = readerFuncWithSchema;
+    }
+
+    @Override
+    public Function<MessageType, VectorizedReader<?>> apply() {
+      Preconditions.checkArgument(
+          schema != null, "Schema must be set for 2-argument reader function");
Review Comment:
Ah, I see that this is actually just copying what was already there for
row-based reads. I think that's fine, but we should probably avoid copying the
practice in the batch read path.
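
The comment does not spell out an alternative. One possible reading, assuming the practice in question is setting the schema on the instance after construction and checking it at `apply()` time, is to have `withSchema` return a new instance so the field can be `final`. The sketch below is not from the PR: `BatchReaderSketch` is a hypothetical wrapper class, and `Schema`, `MessageType`, and `VectorizedReader` are placeholder stand-ins for the real Iceberg and Parquet types so that the example compiles on its own.

```java
import java.util.function.BiFunction;
import java.util.function.Function;

public class BatchReaderSketch {
  // Hypothetical stand-ins for org.apache.iceberg.Schema,
  // org.apache.parquet.schema.MessageType and VectorizedReader<?>.
  static final class Schema {
    final String name;
    Schema(String name) { this.name = name; }
  }

  static final class MessageType {
    final String name;
    MessageType(String name) { this.name = name; }
  }

  static final class VectorizedReader {
    final String description;
    VectorizedReader(String description) { this.description = description; }
  }

  interface BatchReaderFunction {
    Function<MessageType, VectorizedReader> apply();

    default BatchReaderFunction withSchema(Schema schema) {
      return this;
    }
  }

  static final class BinaryBatchReaderFunction implements BatchReaderFunction {
    private final BiFunction<Schema, MessageType, VectorizedReader> readerFuncWithSchema;
    private final Schema schema; // null until a schema has been bound

    BinaryBatchReaderFunction(
        BiFunction<Schema, MessageType, VectorizedReader> readerFuncWithSchema) {
      this(readerFuncWithSchema, null);
    }

    private BinaryBatchReaderFunction(
        BiFunction<Schema, MessageType, VectorizedReader> readerFuncWithSchema, Schema schema) {
      this.readerFuncWithSchema = readerFuncWithSchema;
      this.schema = schema;
    }

    // Returns a new, bound instance instead of mutating this one.
    @Override
    public BatchReaderFunction withSchema(Schema expectedSchema) {
      return new BinaryBatchReaderFunction(readerFuncWithSchema, expectedSchema);
    }

    @Override
    public Function<MessageType, VectorizedReader> apply() {
      if (schema == null) {
        throw new IllegalArgumentException("Schema must be set for 2-argument reader function");
      }
      return fileType -> readerFuncWithSchema.apply(schema, fileType);
    }
  }

  public static void main(String[] args) {
    BatchReaderFunction unbound =
        new BinaryBatchReaderFunction((s, t) -> new VectorizedReader(s.name + "/" + t.name));
    BatchReaderFunction bound = unbound.withSchema(new Schema("expected"));
    System.out.println(bound.apply().apply(new MessageType("file")).description);
  }
}
```

With this shape the unbound instance is never modified, so sharing a single function object across reads cannot leak a schema from one read into another. The null check in `apply()` remains, so this only addresses mutability, not the deferred validation.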
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]