openinx commented on a change in pull request #1346:
URL: https://github.com/apache/iceberg/pull/1346#discussion_r472011890



##########
File path: flink/src/main/java/org/apache/iceberg/flink/FlinkSchemaUtil.java
##########
@@ -98,4 +102,22 @@ public static TableSchema toSchema(RowType rowType) {
     }
     return builder.build();
   }
+
+  /**
+   * Prune columns from a {@link Schema} using a projected fields.
+   *
+   * @param schema a Schema
+   * @param projectedFields projected fields from Flink
+   * @return a Schema corresponding to the Flink projection
+   * @throws IllegalArgumentException if the Flink type does not match the Schema
+   */
+  public static Schema pruneWithoutReordering(Schema schema, List<String> projectedFields) {
+    if (projectedFields == null) {
+      return schema;
+    }
+
+    Map<String, Integer> indexByName = TypeUtil.indexByName(schema.asStruct());
+    Set<Integer> projectedIds = projectedFields.stream().map(indexByName::get).collect(Collectors.toSet());
+    return TypeUtil.select(schema, projectedIds);

Review comment:
   Continuing the question from [here](https://github.com/apache/iceberg/pull/1293#discussion_r469938063): if this method produced a schema that is both projected and ordered (that is, if it were `pruneWithReordering`), then it seems we would not have to convert the read RowData to the correct order [here](https://github.com/apache/iceberg/pull/1346/files#diff-9e6ac35840fe0e9f8bacbe12c574c4eaR64)?
   
   I'd prefer to read the target RowData with the correctly ordered projected schema if possible, rather than reading RowData with an out-of-order schema and then reordering the fields in an iterator transformation. This is on the critical read path, so an extra RowData transformation costs more resources and also makes the code harder to follow.
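
   To make the difference concrete, here is a minimal sketch in plain Java. It does not use the Iceberg or Flink APIs: the class `ProjectionSketch`, the modelling of a schema as a list of column names, and the helper `reorderRow` are all hypothetical stand-ins for illustration. It only shows that collecting the projected fields into a set yields the table's column order, so each row needs an extra reordering step, while building the result in projection order avoids that step.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only: a "schema" here is just an ordered list of
// top-level column names, not an org.apache.iceberg.Schema.
final class ProjectionSketch {

  private ProjectionSketch() {
  }

  // Mirrors the set-based approach: the result keeps the table's column order,
  // regardless of the order in which the projection lists the fields.
  static List<String> pruneWithoutReordering(List<String> schemaColumns, List<String> projectedFields) {
    if (projectedFields == null) {
      return schemaColumns;
    }
    Set<String> wanted = new HashSet<>(projectedFields);
    List<String> result = new ArrayList<>();
    for (String column : schemaColumns) {
      if (wanted.contains(column)) {
        result.add(column);
      }
    }
    return result;
  }

  // The suggested alternative: the result follows the projection's own order,
  // so rows read with it need no further rearrangement.
  static List<String> pruneWithReordering(List<String> schemaColumns, List<String> projectedFields) {
    if (projectedFields == null) {
      return schemaColumns;
    }
    Set<String> known = new HashSet<>(schemaColumns);
    List<String> result = new ArrayList<>(projectedFields.size());
    for (String name : projectedFields) {
      if (!known.contains(name)) {
        throw new IllegalArgumentException("Unknown projected field: " + name);
      }
      result.add(name);
    }
    return result;
  }

  // The per-row step that is only needed when the read schema is not already
  // in projection order.
  static Object[] reorderRow(Object[] row, List<String> readColumns, List<String> projectedFields) {
    Object[] reordered = new Object[projectedFields.size()];
    for (int i = 0; i < projectedFields.size(); i++) {
      int pos = readColumns.indexOf(projectedFields.get(i));
      if (pos < 0) {
        throw new IllegalArgumentException("Column was not read: " + projectedFields.get(i));
      }
      reordered[i] = row[pos];
    }
    return reordered;
  }
}
```

   With table columns `id, data, ts` and a projection of `ts, id`, the first method returns `id, ts` and every row then has to go through `reorderRow`, whereas the second returns `ts, id` directly. Whether the real readers can consume a reordered Iceberg schema is the open question here, and the sketch does not answer it.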