yihua commented on code in PR #17573:
URL: https://github.com/apache/hudi/pull/17573#discussion_r2628651565
##########
hudi-common/src/main/java/org/apache/hudi/common/schema/HoodieSchemaUtils.java:
##########
@@ -595,7 +674,29 @@ public static HoodieSchema getRecordKeySchema() {
* @param fields list of fields to project
* @return projected schema containing only specified fields
*/
+
public static HoodieSchema projectSchema(HoodieSchema fileSchema, List<String> fields) {
  return HoodieSchema.fromAvroSchema(HoodieAvroUtils.projectSchema(fileSchema.toAvroSchema(), fields));
}
+
+  /**
+   * Gets the fully-qualified Avro record name for a Hudi table.
+   * This is equivalent to {@link AvroSchemaUtils#getAvroRecordQualifiedName(String)}
+   * but provides a HoodieSchema-context API.
+   *
+   * <p>The qualified name follows the pattern: hoodie.{tableName}.{tableName}_record,
+   * where tableName is sanitized for Avro compatibility.</p>
+   *
+   * @param tableName the Hudi table name
+   * @return the fully-qualified Avro record name (e.g., "hoodie.my_table.my_table_record")
+   * @throws IllegalArgumentException if tableName is null or empty
+   * @since 1.2.0
+   */
+  public static String getRecordQualifiedName(String tableName) {
+    ValidationUtils.checkArgument(tableName != null && !tableName.trim().isEmpty(),
+        "Table name cannot be null or empty");
+
+    // Delegate to AvroSchemaUtils
+    return AvroSchemaUtils.getAvroRecordQualifiedName(tableName);
Review Comment:
Should we move the actual logic here, as the logic is not really Avro-specific?
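The suggestion above can be sketched as follows. This is only an illustration of pulling the pure string-building logic into `HoodieSchemaUtils` rather than delegating to the Avro-side class; the class name and the sanitization step here are placeholders, not Hudi's actual implementation (the real code sanitizes the name for Avro's identifier rules).

```java
// Hypothetical sketch: the qualified-name construction moved out of
// AvroSchemaUtils, since it is plain string manipulation with no Avro dependency.
public class QualifiedNameSketch {
  // Mirrors the documented pattern: hoodie.{tableName}.{tableName}_record
  public static String getRecordQualifiedName(String tableName) {
    if (tableName == null || tableName.trim().isEmpty()) {
      throw new IllegalArgumentException("Table name cannot be null or empty");
    }
    // Placeholder sanitization; the real utility enforces Avro name rules.
    String sanitized = tableName.replace(".", "_");
    return "hoodie." + sanitized + "." + sanitized + "_record";
  }
}
```

With this shape, the Avro-side helper could delegate to the schema-agnostic one instead of the other way around.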
##########
hudi-client/hudi-spark-client/src/main/scala/org/apache/hudi/HoodieSchemaConversionUtils.scala:
##########
@@ -18,10 +18,15 @@
package org.apache.hudi
Review Comment:
This class should be under `org.apache.hudi.spark.schema` ?
##########
hudi-client/hudi-spark-client/src/main/scala/org/apache/spark/sql/avro/HoodieSparkSchemaConverters.scala:
##########
@@ -102,7 +102,14 @@ object HoodieSparkSchemaConverters {
val fields = st.map { f =>
val fieldSchema = toHoodieType(f.dataType, f.nullable, f.name, childNameSpace)
val doc = f.getComment.orNull
- HoodieSchemaField.of(f.name, fieldSchema, doc)
Review Comment:
Should we get rid of this method, which does not take the `defaultVal` argument?
##########
hudi-common/src/main/java/org/apache/hudi/common/schema/HoodieSchemaUtils.java:
##########
@@ -307,6 +325,31 @@ public static HoodieSchemaField createNewSchemaField(String name, HoodieSchema s
return HoodieSchemaField.fromAvroField(avroField);
}
+  /**
+   * Creates a new schema field with the specified properties, including field order.
+   * This is equivalent to HoodieAvroUtils.createNewSchemaField() but returns HoodieSchemaField.
+   *
+   * @param name field name
+   * @param schema field schema
+   * @param doc field documentation (can be null)
+   * @param defaultValue default value (can be null)
+   * @param order field order for sorting
+   * @return new HoodieSchemaField instance
+   * @throws IllegalArgumentException if name, schema, or order is null/empty
+   * @since 1.2.0
+   */
+  public static HoodieSchemaField createNewSchemaField(String name, HoodieSchema schema,
+                                                       String doc, Object defaultValue, HoodieFieldOrder order) {
+    ValidationUtils.checkArgument(name != null && !name.isEmpty(), "Field name cannot be null or empty");
+    ValidationUtils.checkArgument(schema != null, "Field schema cannot be null");
+    ValidationUtils.checkArgument(order != null, "Field order cannot be null");
+
+    // Delegate to HoodieAvroUtils
+    Schema.Field avroField = HoodieAvroUtils.createNewSchemaField(
+        name, schema.toAvroSchema(), doc, defaultValue, order.toAvroOrder());
+    return HoodieSchemaField.fromAvroField(avroField);
+  }
Review Comment:
Can `createNewSchemaField(String name, HoodieSchema schema, String doc, Object defaultValue)` reuse the logic here?
##########
hudi-common/src/main/java/org/apache/hudi/common/schema/HoodieSchemaUtils.java:
##########
@@ -79,6 +79,24 @@ public static HoodieSchema createHoodieWriteSchema(String schema, boolean withOp
return HoodieSchema.fromAvroSchema(avroSchema);
}
+  /**
+   * Creates a write schema for Hudi operations, adding necessary metadata fields.
+   * This is equivalent to HoodieAvroUtils.createHoodieWriteSchema() but operates on HoodieSchema.
+   *
+   * @param schema the base HoodieSchema
+   * @param withOperationField whether to include the operation metadata field
+   * @return HoodieSchema configured for write operations with metadata fields added
+   * @throws IllegalArgumentException if schema is null
+   * @since 1.2.0
+   */
+  public static HoodieSchema createHoodieWriteSchema(HoodieSchema schema, boolean withOperationField) {
+    ValidationUtils.checkArgument(schema != null, "Schema cannot be null");
+
+    // Convert to Avro, delegate to existing utility, convert back
+    Schema avroSchema = HoodieAvroUtils.addMetadataFields(schema.toAvroSchema(), withOperationField);
Review Comment:
Should we have a util to add meta fields to `HoodieSchema`? Looks like `HoodieSchemaUtils#addMetadataFields` or `HoodieSchema#addMetadataFields` already does the same (can we also remove the duplicate method)?
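The consolidation being suggested could look roughly like this. The meta-column names below are Hudi's standard ones, but the list-of-strings representation is a deliberate simplification standing in for `HoodieSchema`, and the class/method names are hypothetical.

```java
// Hedged sketch: one addMetadataFields util as the single source of truth,
// with createHoodieWriteSchema delegating to it instead of duplicating the logic.
public class WriteSchemaSketch {
  static final java.util.List<String> META_FIELDS = java.util.Arrays.asList(
      "_hoodie_commit_time", "_hoodie_commit_seqno", "_hoodie_record_key",
      "_hoodie_partition_path", "_hoodie_file_name");

  // Prepends the meta columns; the only place that knows about them.
  public static java.util.List<String> addMetadataFields(
      java.util.List<String> dataFields, boolean withOperationField) {
    java.util.List<String> out = new java.util.ArrayList<>(META_FIELDS);
    if (withOperationField) {
      out.add("_hoodie_operation");
    }
    out.addAll(dataFields);
    return out;
  }

  // The write-schema entry point reuses the util rather than re-implementing it.
  public static java.util.List<String> createHoodieWriteSchema(
      java.util.List<String> dataFields, boolean withOperationField) {
    return addMetadataFields(dataFields, withOperationField);
  }
}
```

If `HoodieSchema#addMetadataFields` already exists as the review suggests, the duplicate path through `HoodieAvroUtils` could then be removed.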
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]