cshuo commented on code in PR #18723:
URL: https://github.com/apache/hudi/pull/18723#discussion_r3309088871


##########
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/util/DataTypeUtils.java:
##########
@@ -120,6 +124,49 @@ public static int[] projectOrdinals(RowType rowType, RowType producedRowType) {
     return producedRowType.getFieldNames().stream().mapToInt(fieldNames::indexOf).toArray();
   }
 
+  /**
+   * Creates the hoodie required schema for a projected Flink row type.
+   *
+   * <p>When a requested field is a hoodie specific logical type in {@code tableSchema}, this method
+   * reuses the table schema field to preserve logical metadata that cannot be recovered from Flink
+   * {@link RowType}, for example VARIANT semantics or VECTOR element type and dimension. Other
+   * fields are taken from the schema converted from {@code requiredRowType}, so readers use the
+   * projected field schema and can still keep missing required columns in the requested schema for
+   * later schema-evolution/default-value handling.
+   *
+   * @param tableSchema     source table schema with hoodie logical type metadata
+   * @param requiredRowType projected Flink row type requested by the query
+   * @return required hoodie schema matching the projected field order
+   */
+  public static HoodieSchema createRequiredSchema(HoodieSchema tableSchema, RowType requiredRowType) {
+    HoodieSchema fallbackRequiredSchema = HoodieSchemaConverter.convertToSchema(requiredRowType);

Review Comment:
   I avoided `generateProjectionSchema` here because it would copy every 
projected field directly from `tableSchema`. For ordinary fields we still want 
the schema derived from `requiredRowType`, so that the reader keeps the 
projected query shape and can still apply schema-evolution/default-value 
handling. We only reuse `tableSchema` fields for Hoodie logical types whose 
metadata cannot be reconstructed from `RowType`, currently VECTOR.
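
To illustrate the per-field choice described above, here is a minimal sketch that does not use the Hudi or Flink APIs. `RequiredSchemaSketch`, `Field`, `hoodieLogicalType`, and `createRequiredFields` are hypothetical stand-ins invented for this example, and the type strings are placeholders. It only shows the selection rule: reuse the table field when it carries a Hoodie logical type, otherwise keep the field converted from the requested row type.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; none of these types exist in Hudi or Flink.
final class RequiredSchemaSketch {

  /** Simplified stand-in for a schema field. */
  static final class Field {
    final String name;
    final String type;
    // Assumption: true when the field has metadata a row type cannot express.
    final boolean hoodieLogicalType;

    Field(String name, String type, boolean hoodieLogicalType) {
      this.name = name;
      this.type = type;
      this.hoodieLogicalType = hoodieLogicalType;
    }
  }

  /**
   * Walks the fields converted from the requested row type, in projection order.
   * A field is replaced by its table-schema counterpart only when that counterpart
   * is a Hoodie logical type; every other field, including one missing from the
   * table schema, keeps the converted definition.
   */
  static List<Field> createRequiredFields(Map<String, Field> tableFields,
                                          List<Field> convertedRequiredFields) {
    List<Field> result = new ArrayList<>();
    for (Field converted : convertedRequiredFields) {
      Field tableField = tableFields.get(converted.name);
      if (tableField != null && tableField.hoodieLogicalType) {
        result.add(tableField);
      } else {
        result.add(converted);
      }
    }
    return result;
  }

  public static void main(String[] args) {
    Map<String, Field> table = new LinkedHashMap<>();
    table.put("id", new Field("id", "INT", false));
    table.put("embedding", new Field("embedding", "VECTOR(FLOAT, 128)", true));

    // The query projects three columns; new_col is not in the table schema.
    List<Field> required = Arrays.asList(
        new Field("embedding", "ARRAY<FLOAT>", false),
        new Field("new_col", "STRING", false),
        new Field("id", "BIGINT", false));

    for (Field f : createRequiredFields(table, required)) {
      System.out.println(f.name + " -> " + f.type);
    }
  }
}
```

In this sketch `embedding` comes back with the table's VECTOR definition, while `id` and `new_col` keep the types derived from the request, which is the behavior a plain copy from `tableSchema` would not give.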



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
