JingsongLi commented on a change in pull request #1346:
URL: https://github.com/apache/iceberg/pull/1346#discussion_r472026626
##########
File path: flink/src/main/java/org/apache/iceberg/flink/FlinkSchemaUtil.java
##########
@@ -98,4 +102,22 @@ public static TableSchema toSchema(RowType rowType) {
}
return builder.build();
}
+
+ /**
+ * Prune columns from a {@link Schema} using a list of projected fields.
+ *
+ * @param schema a Schema
+ * @param projectedFields projected fields from Flink
+ * @return a Schema corresponding to the Flink projection
+ * @throws IllegalArgumentException if the Flink type does not match the Schema
+ */
+ public static Schema pruneWithoutReordering(Schema schema, List<String> projectedFields) {
+ if (projectedFields == null) {
+ return schema;
+ }
+
+ Map<String, Integer> indexByName = TypeUtil.indexByName(schema.asStruct());
+ Set<Integer> projectedIds = projectedFields.stream().map(indexByName::get).collect(Collectors.toSet());
+ return TypeUtil.select(schema, projectedIds);
Review comment:
> I'd prefer to use the correct projected schema to read the target RowData if possible

This is what I want too, but I'm afraid the current format readers do not have this capability. Take a look at `AvroSchemaWithTypeVisitor`: the readers are ordered according to the file schema rather than the Flink projected/expected schema.
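
To illustrate the ordering point, here is a minimal sketch, assuming a schema can be modelled as an ordered list of column names. `PruneOrderSketch` and its method are hypothetical stand-ins for illustration, not Iceberg or Flink APIs. Because the projection is collapsed into a set before selecting, the pruned result keeps the file-schema order and the order requested by Flink is lost:

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical model: a schema is just an ordered list of column names.
final class PruneOrderSketch {
  private PruneOrderSketch() {
  }

  // Keeps only the requested columns, iterating in file-schema order.
  // The projection is turned into a set, so its own ordering is discarded.
  static List<String> pruneWithoutReordering(List<String> fileColumns, List<String> projected) {
    Set<String> wanted = new HashSet<>(projected);
    List<String> pruned = new ArrayList<>();
    for (String column : fileColumns) {
      if (wanted.contains(column)) {
        pruned.add(column);
      }
    }
    return pruned;
  }
}
```

With file columns `id, data, ts` and a projection of `ts, id`, this sketch yields `id, ts`, so whatever reads the rows still has to map them back to the order Flink asked for.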
> Because this is in the critical read path and an extra RowData transformation will cost more resources

The performance is OK because we only apply a lazy projection in `ProjectionRowData`, and unnecessary projections are omitted.
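
As a rough sketch of what a lazy projection means here, assuming rows can be modelled as `Object[]`: the wrapper below copies nothing and only remaps the field position on access. `LazyProjectionSketch` and all of its members are hypothetical names for illustration; this is not the `ProjectionRowData` code from the PR.

```java
import java.util.List;

// Hypothetical sketch of a lazy projection view over a row.
final class LazyProjectionSketch {
  // positions[i] is the index in the underlying row of projected field i.
  private final int[] positions;
  private Object[] row;

  LazyProjectionSketch(int[] positions) {
    this.positions = positions.clone();
  }

  // Computes, once per read, where each projected field sits in the row as read.
  static int[] positionsFor(List<String> readOrder, List<String> projected) {
    int[] result = new int[projected.size()];
    for (int i = 0; i < result.length; i++) {
      int index = readOrder.indexOf(projected.get(i));
      if (index < 0) {
        throw new IllegalArgumentException("Unknown field: " + projected.get(i));
      }
      result[i] = index;
    }
    return result;
  }

  // True when the projection would change nothing, so the wrapper can be skipped.
  static boolean isIdentity(int[] positions, int rowArity) {
    if (positions.length != rowArity) {
      return false;
    }
    for (int i = 0; i < positions.length; i++) {
      if (positions[i] != i) {
        return false;
      }
    }
    return true;
  }

  // Re-points the view at another row without allocating or copying fields.
  LazyProjectionSketch wrap(Object[] newRow) {
    this.row = newRow;
    return this;
  }

  int arity() {
    return positions.length;
  }

  // The only per-access cost is one extra array lookup.
  Object get(int pos) {
    return row[positions[pos]];
  }
}
```

In this sketch a caller would check `isIdentity` once and hand back the original row when it returns true, so the remapping cost is only paid when the projected order actually differs from the order in which the row was read.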