JingsongLi commented on a change in pull request #1346:
URL: https://github.com/apache/iceberg/pull/1346#discussion_r472026626



##########
File path: flink/src/main/java/org/apache/iceberg/flink/FlinkSchemaUtil.java
##########
@@ -98,4 +102,22 @@ public static TableSchema toSchema(RowType rowType) {
     }
     return builder.build();
   }
+
+  /**
+   * Prune columns from a {@link Schema} using a list of projected fields.
+   *
+   * @param schema a Schema
+   * @param projectedFields projected fields from Flink
+   * @return a Schema corresponding to the Flink projection
+   * @throws IllegalArgumentException if the Flink type does not match the Schema
+   */
+  public static Schema pruneWithoutReordering(Schema schema, List<String> projectedFields) {
+    if (projectedFields == null) {
+      return schema;
+    }
+
+    Map<String, Integer> indexByName = TypeUtil.indexByName(schema.asStruct());
+    Set<Integer> projectedIds = projectedFields.stream().map(indexByName::get).collect(Collectors.toSet());
+    return TypeUtil.select(schema, projectedIds);

Review comment:
   > I'd prefer to use the correct projected schema to read the target RowData if possible
   
   This is what I want too, but I'm afraid the current format readers do not
have this capability. Take a look at `AvroSchemaWithTypeVisitor`: the readers
are ordered according to the file schema rather than the Flink
projected/expected schema.
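
To illustrate the ordering point, here is a minimal sketch that uses plain column names instead of Iceberg types. `PruneOrderSketch` and its method are hypothetical stand-ins, not `TypeUtil` or `FlinkSchemaUtil`; the only assumption is that selecting by a set of fields keeps the file schema's order, which is the behavior described above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for illustration; not Iceberg's TypeUtil or FlinkSchemaUtil.
final class PruneOrderSketch {
  private PruneOrderSketch() {
  }

  // Keeps only the columns named in `projected`, in file-schema order.
  // The order of `projected` itself is ignored because it is turned into a set.
  static List<String> pruneWithoutReordering(List<String> fileColumns, List<String> projected) {
    if (projected == null) {
      return fileColumns;
    }
    Set<String> wanted = new HashSet<>(projected);
    List<String> result = new ArrayList<>();
    for (String column : fileColumns) {
      if (wanted.contains(column)) {
        result.add(column);
      }
    }
    return result;
  }

  public static void main(String[] args) {
    List<String> fileColumns = Arrays.asList("id", "ts", "data");
    List<String> projected = Arrays.asList("data", "id");
    // Flink asked for (data, id), but the pruned schema is still in file order.
    System.out.println(pruneWithoutReordering(fileColumns, projected)); // prints [id, data]
  }
}
```

The pruned result follows the file order, so something downstream still has to map it to the order Flink expects.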
   
   > Because this is in the critical read path and an extra RowData 
transformation will cost more resources
   
   The performance is OK because `ProjectionRowData` only applies a lazy
projection, and unnecessary projections are omitted.
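
As a rough sketch of what "lazy projection" means here: the wrapper below copies no fields and only remaps positions on access. `LazyProjectedRow` is a hypothetical class over `Object[]` rows, not the `ProjectionRowData` from this PR and not Flink's `RowData` interface.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; not the PR's ProjectionRowData and not Flink's RowData.
final class LazyProjectedRow {
  // mapping[i] is the position in the underlying row of the i-th expected column.
  private final int[] mapping;
  private Object[] row;

  LazyProjectedRow(List<String> readColumns, List<String> expectedColumns) {
    mapping = new int[expectedColumns.size()];
    for (int i = 0; i < mapping.length; i++) {
      int pos = readColumns.indexOf(expectedColumns.get(i));
      if (pos < 0) {
        throw new IllegalArgumentException("Column not read: " + expectedColumns.get(i));
      }
      mapping[i] = pos;
    }
  }

  // True when the read order already matches the expected order,
  // in which case a caller could skip the wrapper entirely.
  boolean isIdentity() {
    for (int i = 0; i < mapping.length; i++) {
      if (mapping[i] != i) {
        return false;
      }
    }
    return true;
  }

  // Points the wrapper at a new underlying row; no fields are copied.
  LazyProjectedRow replaceRow(Object[] newRow) {
    this.row = newRow;
    return this;
  }

  // Resolves the projected position only when a field is actually read.
  Object getField(int pos) {
    return row[mapping[pos]];
  }

  public static void main(String[] args) {
    LazyProjectedRow view =
        new LazyProjectedRow(Arrays.asList("id", "data"), Arrays.asList("data", "id"));
    view.replaceRow(new Object[] {1L, "a"});
    System.out.println(view.getField(0)); // prints a
    System.out.println(view.getField(1)); // prints 1
  }
}
```

The per-row cost is one array lookup per accessed field, and when `isIdentity()` is true the reader's row can be returned as-is.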






