rahil-c commented on code in PR #18190:
URL: https://github.com/apache/hudi/pull/18190#discussion_r2893079235
##########
hudi-common/src/main/java/org/apache/hudi/common/schema/HoodieSchema.java:
##########
@@ -84,18 +86,118 @@
*/
public class HoodieSchema implements Serializable {
private static final long serialVersionUID = 1L;
+
/**
* Constant representing a null JSON value, equivalent to
JsonProperties.NULL_VALUE.
* This provides compatibility with Avro's JsonProperties while maintaining
Hudi's API.
*/
public static final Object NULL_VALUE = JsonProperties.NULL_VALUE;
public static final HoodieSchema NULL_SCHEMA =
HoodieSchema.create(HoodieSchemaType.NULL);
-
/**
* Constant to use when attaching type metadata to external schema systems
like Spark's StructType.
+ * Stores a parameterized type string for custom Hudi logical types such as
VECTOR and BLOB.
+ * Examples: "VECTOR(128)", "VECTOR(512, DOUBLE)", "BLOB".
*/
public static final String TYPE_METADATA_FIELD = "hudi_type";
+ /**
+ * Converts a HoodieSchema to its parameterized type string for custom Hudi
logical types
+ * such as VECTOR and BLOB. Only supports custom logical types — throws for
standard types.
+ * Parameterized types include positional parameters: "VECTOR(128)",
"VECTOR(128, DOUBLE)".
+ * Default parameters are omitted: VECTOR(dim) implies elementType=FLOAT.
+ */
+ public String toTypeString() {
+ HoodieSchemaType type = getType();
+ switch (type) {
+ case VECTOR:
+ Vector v = (Vector) this;
+ if (v.getVectorElementType() == Vector.VectorElementType.FLOAT) {
+ return "VECTOR(" + v.getDimension() + ")";
+ }
+ return "VECTOR(" + v.getDimension() + ", " + v.getVectorElementType() + ")";
+ case BLOB:
+ return "BLOB";
+ default:
+ throw new IllegalArgumentException(
+ "toTypeString only supports custom logical types, got: " + type);
+ }
+ }
Review Comment:
Originally I had this as a static method, but Tim's recommendation was to make
these instance methods:
https://github.com/apache/hudi/pull/18190#discussion_r2850316235
I agree that calling `toTypeString` on an `intHoodieSchema` feels
unintuitive, since it would throw an exception for the majority of types,
everything except vector and blob.
I am OK with moving this back to a static method for now.
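
For reference, a minimal sketch of what the static variant could look like. The types here (`TypeStringSketch`, `Schema`, `IntSchema`, `BlobSchema`, `VectorSchema`, `ElementType`) are hypothetical stand-ins so the snippet compiles on its own; they are not the real `HoodieSchema` classes, and the actual signature in the PR may differ.

```java
// Hypothetical stand-ins for HoodieSchema and its subtypes; NOT the real Hudi API.
final class TypeStringSketch {

  // Assumed element types; FLOAT is treated as the default, as in the Javadoc above.
  enum ElementType { FLOAT, DOUBLE }

  interface Schema {}

  static final class IntSchema implements Schema {}

  static final class BlobSchema implements Schema {}

  static final class VectorSchema implements Schema {
    final int dimension;
    final ElementType elementType;

    VectorSchema(int dimension, ElementType elementType) {
      this.dimension = dimension;
      this.elementType = elementType;
    }
  }

  private TypeStringSketch() {}

  // Static helper: the caller passes the schema explicitly, so the method
  // does not show up on every schema instance (e.g. an int schema).
  static String toTypeString(Schema schema) {
    if (schema instanceof VectorSchema) {
      VectorSchema v = (VectorSchema) schema;
      // The default element type is omitted from the type string.
      if (v.elementType == ElementType.FLOAT) {
        return "VECTOR(" + v.dimension + ")";
      }
      return "VECTOR(" + v.dimension + ", " + v.elementType + ")";
    }
    if (schema instanceof BlobSchema) {
      return "BLOB";
    }
    // Standard types are rejected, mirroring the behavior in the diff.
    throw new IllegalArgumentException(
        "toTypeString only supports custom logical types, got: "
            + schema.getClass().getSimpleName());
  }
}
```

With this shape, `TypeStringSketch.toTypeString(new TypeStringSketch.VectorSchema(128, TypeStringSketch.ElementType.FLOAT))` yields `VECTOR(128)`, and passing an `IntSchema` still throws, but the method is no longer discoverable on schemas where it can never succeed.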