Re: [PR] feat(schema): Phase 18 - HoodieAvroUtils removal (Part 1) [hudi]

2025-12-31 Thread via GitHub


voonhous merged PR #17599:
URL: https://github.com/apache/hudi/pull/17599


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



Re: [PR] feat(schema): Phase 18 - HoodieAvroUtils removal (Part 1) [hudi]

2025-12-31 Thread via GitHub


voonhous commented on PR #17599:
URL: https://github.com/apache/hudi/pull/17599#issuecomment-3701925766

   https://github.com/user-attachments/assets/27a7c5c0-e7a2-4670-9d1b-bcfeebf4e181
   
   CI green, merging this in.





Re: [PR] feat(schema): Phase 18 - HoodieAvroUtils removal (Part 1) [hudi]

2025-12-30 Thread via GitHub


hudi-bot commented on PR #17599:
URL: https://github.com/apache/hudi/pull/17599#issuecomment-3701541447

   
   ## CI report:
   
   * c94921b440173563513300ba1f7724a8374c5e49 Azure: 
[SUCCESS](https://dev.azure.com/apachehudi/a1a51da7-8592-47d4-88dc-fd67bed336bb/_build/results?buildId=10700)
 
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   





Re: [PR] feat(schema): Phase 18 - HoodieAvroUtils removal (Part 1) [hudi]

2025-12-30 Thread via GitHub


hudi-bot commented on PR #17599:
URL: https://github.com/apache/hudi/pull/17599#issuecomment-3701407694

   
   ## CI report:
   
   * 82240333bb0fee8d1135f2ad26e246ede63255b6 Azure: 
[CANCELED](https://dev.azure.com/apachehudi/a1a51da7-8592-47d4-88dc-fd67bed336bb/_build/results?buildId=10698)
 
   * c94921b440173563513300ba1f7724a8374c5e49 Azure: 
[PENDING](https://dev.azure.com/apachehudi/a1a51da7-8592-47d4-88dc-fd67bed336bb/_build/results?buildId=10700)
 
   
   



Re: [PR] feat(schema): Phase 18 - HoodieAvroUtils removal (Part 1) [hudi]

2025-12-30 Thread via GitHub


hudi-bot commented on PR #17599:
URL: https://github.com/apache/hudi/pull/17599#issuecomment-3701389228

   
   ## CI report:
   
   * 82240333bb0fee8d1135f2ad26e246ede63255b6 Azure: 
[CANCELED](https://dev.azure.com/apachehudi/a1a51da7-8592-47d4-88dc-fd67bed336bb/_build/results?buildId=10698)
 
   * c94921b440173563513300ba1f7724a8374c5e49 UNKNOWN
   
   



Re: [PR] feat(schema): Phase 18 - HoodieAvroUtils removal (Part 1) [hudi]

2025-12-30 Thread via GitHub


hudi-bot commented on PR #17599:
URL: https://github.com/apache/hudi/pull/17599#issuecomment-3701387916

   
   ## CI report:
   
   * a4939d12b7d916e8ca37010f77feead24ec85328 Azure: 
[CANCELED](https://dev.azure.com/apachehudi/a1a51da7-8592-47d4-88dc-fd67bed336bb/_build/results?buildId=10696)
 
   * 82240333bb0fee8d1135f2ad26e246ede63255b6 Azure: 
[PENDING](https://dev.azure.com/apachehudi/a1a51da7-8592-47d4-88dc-fd67bed336bb/_build/results?buildId=10698)
 
   * c94921b440173563513300ba1f7724a8374c5e49 UNKNOWN
   
   



Re: [PR] feat(schema): Phase 18 - HoodieAvroUtils removal (Part 1) [hudi]

2025-12-30 Thread via GitHub


hudi-bot commented on PR #17599:
URL: https://github.com/apache/hudi/pull/17599#issuecomment-3701333812

   
   ## CI report:
   
   * a4939d12b7d916e8ca37010f77feead24ec85328 Azure: 
[CANCELED](https://dev.azure.com/apachehudi/a1a51da7-8592-47d4-88dc-fd67bed336bb/_build/results?buildId=10696)
 
   * 82240333bb0fee8d1135f2ad26e246ede63255b6 Azure: 
[PENDING](https://dev.azure.com/apachehudi/a1a51da7-8592-47d4-88dc-fd67bed336bb/_build/results?buildId=10698)
 
   
   



Re: [PR] feat(schema): Phase 18 - HoodieAvroUtils removal (Part 1) [hudi]

2025-12-30 Thread via GitHub


hudi-bot commented on PR #17599:
URL: https://github.com/apache/hudi/pull/17599#issuecomment-3701284472

   
   ## CI report:
   
   * a4939d12b7d916e8ca37010f77feead24ec85328 Azure: 
[CANCELED](https://dev.azure.com/apachehudi/a1a51da7-8592-47d4-88dc-fd67bed336bb/_build/results?buildId=10696)
 
   * 82240333bb0fee8d1135f2ad26e246ede63255b6 UNKNOWN
   
   



Re: [PR] feat(schema): Phase 18 - HoodieAvroUtils removal (Part 1) [hudi]

2025-12-30 Thread via GitHub


voonhous commented on code in PR #17599:
URL: https://github.com/apache/hudi/pull/17599#discussion_r2654654521


##
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/index/HoodieIndexUtils.java:
##
@@ -150,50 +153,41 @@ static boolean 
validateDataTypeForSecondaryIndex(List sourceFields, Sche
* @param tableSchema  table schema
* @return true if each field's data types are supported, false otherwise
*/
-  public static boolean 
validateDataTypeForSecondaryOrExpressionIndex(List sourceFields, Schema 
tableSchema) {
+  public static boolean 
validateDataTypeForSecondaryOrExpressionIndex(List sourceFields, 
HoodieSchema tableSchema) {
 return sourceFields.stream().anyMatch(fieldToIndex -> {
-  Schema schema = getNestedFieldSchemaFromWriteSchema(tableSchema, 
fieldToIndex);
-  return schema.getType() != Schema.Type.RECORD && schema.getType() != 
Schema.Type.ARRAY && schema.getType() != Schema.Type.MAP;
+  Option> nestedFieldOpt = 
HoodieSchemaUtils.getNestedField(tableSchema, fieldToIndex);
+  if (nestedFieldOpt.isEmpty()) {
+throw new HoodieException("Failed to get schema. Not a valid field 
name: " + fieldToIndex);
+  }
+  HoodieSchema fieldSchema = nestedFieldOpt.get().getRight().schema();
+  return fieldSchema.getType() != HoodieSchemaType.RECORD && 
fieldSchema.getType() != HoodieSchemaType.ARRAY && fieldSchema.getType() != 
HoodieSchemaType.MAP;
 });
   }
 
   /**
* Check if the given schema type is supported for secondary index.
* Supported types are: String (including CHAR), Integer types (Int, BigInt, 
Long, Short), and timestamp
*/
-  private static boolean isSecondaryIndexSupportedType(Schema schema) {
+  private static boolean isSecondaryIndexSupportedType(HoodieSchema schema) {
 // Handle union types (nullable fields)
-if (schema.getType() == Schema.Type.UNION) {
+if (schema.getType() == HoodieSchemaType.UNION) {
   // For union types, check if any of the types is supported
   return schema.getTypes().stream()
-  .anyMatch(s -> s.getType() != Schema.Type.NULL && 
isSecondaryIndexSupportedType(s));
+  .anyMatch(s -> s.getType() != HoodieSchemaType.NULL && 
isSecondaryIndexSupportedType(s));
 }
 
 // Check basic types
 switch (schema.getType()) {
   case STRING:
-// STRING type can have UUID logical type which we don't support
-return schema.getLogicalType() == null; // UUID and other string-based 
logical types are not supported
-  // Regular STRING (includes CHAR)
   case INT:
-// INT type can represent regular integers or dates/times with logical 
types
-if (schema.getLogicalType() != null) {
-  // Support date and time-millis logical types
-  return schema.getLogicalType() == LogicalTypes.date()
-  || schema.getLogicalType() == LogicalTypes.timeMillis();
-}
-return true; // Regular INT
   case LONG:
-// LONG type can represent regular longs or timestamps with logical 
types
-if (schema.getLogicalType() != null) {
-  // Support timestamp logical types
-  return schema.getLogicalType() == LogicalTypes.timestampMillis()
-  || schema.getLogicalType() == LogicalTypes.timestampMicros()
-  || schema.getLogicalType() == LogicalTypes.timeMicros();
-}
-return true; // Regular LONG
   case DOUBLE:
-return true; // Support DOUBLE type
+  case DATE:

Review Comment:
   Sure, fixed the test and enabled Float support.
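
For illustration only, the requested change amounts to one more fall-through label in the type switch. The block below is a minimal, self-contained sketch and not the merged Hudi code: `SketchType` and `SketchSchema` are hypothetical stand-ins for `HoodieSchemaType` and `HoodieSchema`, and the supported list contains only the cases visible in the quoted hunk plus `FLOAT`.

```java
import java.util.Arrays;
import java.util.List;

public class SecondaryIndexTypeSketch {

  // Hypothetical stand-in for HoodieSchemaType; only the values needed here.
  enum SketchType { STRING, INT, LONG, FLOAT, DOUBLE, DATE, RECORD, ARRAY, MAP, NULL, UNION }

  // Hypothetical stand-in for HoodieSchema: a type plus optional union branches.
  static final class SketchSchema {
    final SketchType type;
    final List<SketchSchema> branches;

    SketchSchema(SketchType type, SketchSchema... branches) {
      this.type = type;
      this.branches = Arrays.asList(branches);
    }
  }

  static boolean isSupported(SketchSchema schema) {
    // Nullable fields arrive as unions: accept if any non-null branch is supported.
    if (schema.type == SketchType.UNION) {
      return schema.branches.stream()
          .anyMatch(s -> s.type != SketchType.NULL && isSupported(s));
    }
    // All supported types share one return via fall-through labels.
    switch (schema.type) {
      case STRING:
      case INT:
      case LONG:
      case FLOAT: // the case requested in review
      case DOUBLE:
      case DATE:
        return true;
      default:
        return false;
    }
  }

  public static void main(String[] args) {
    SketchSchema nullableFloat = new SketchSchema(SketchType.UNION,
        new SketchSchema(SketchType.NULL), new SketchSchema(SketchType.FLOAT));
    System.out.println("nullable float supported: " + isSupported(nullableFloat));
    System.out.println("map supported: " + isSupported(new SketchSchema(SketchType.MAP)));
  }
}
```

Because the logical types are modeled as their own enum values in this sketch, no per-case logical-type check is needed, which is why the labels can simply fall through to a single `return true`.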






Re: [PR] feat(schema): Phase 18 - HoodieAvroUtils removal (Part 1) [hudi]

2025-12-30 Thread via GitHub


hudi-bot commented on PR #17599:
URL: https://github.com/apache/hudi/pull/17599#issuecomment-3701273069

   
   ## CI report:
   
   * a88ae9de253ea06e1d440df4539934614cff2392 Azure: 
[FAILURE](https://dev.azure.com/apachehudi/a1a51da7-8592-47d4-88dc-fd67bed336bb/_build/results?buildId=10681)
 
   * a4939d12b7d916e8ca37010f77feead24ec85328 Azure: 
[PENDING](https://dev.azure.com/apachehudi/a1a51da7-8592-47d4-88dc-fd67bed336bb/_build/results?buildId=10696)
 
   * 82240333bb0fee8d1135f2ad26e246ede63255b6 UNKNOWN
   
   



Re: [PR] feat(schema): Phase 18 - HoodieAvroUtils removal (Part 1) [hudi]

2025-12-30 Thread via GitHub


hudi-bot commented on PR #17599:
URL: https://github.com/apache/hudi/pull/17599#issuecomment-3701269966

   
   ## CI report:
   
   * a88ae9de253ea06e1d440df4539934614cff2392 Azure: 
[FAILURE](https://dev.azure.com/apachehudi/a1a51da7-8592-47d4-88dc-fd67bed336bb/_build/results?buildId=10681)
 
   * a4939d12b7d916e8ca37010f77feead24ec85328 Azure: 
[PENDING](https://dev.azure.com/apachehudi/a1a51da7-8592-47d4-88dc-fd67bed336bb/_build/results?buildId=10696)
 
   
   



Re: [PR] feat(schema): Phase 18 - HoodieAvroUtils removal (Part 1) [hudi]

2025-12-30 Thread via GitHub


hudi-bot commented on PR #17599:
URL: https://github.com/apache/hudi/pull/17599#issuecomment-3701266971

   
   ## CI report:
   
   * a88ae9de253ea06e1d440df4539934614cff2392 Azure: 
[FAILURE](https://dev.azure.com/apachehudi/a1a51da7-8592-47d4-88dc-fd67bed336bb/_build/results?buildId=10681)
 
   * a4939d12b7d916e8ca37010f77feead24ec85328 UNKNOWN
   
   



Re: [PR] feat(schema): Phase 18 - HoodieAvroUtils removal (Part 1) [hudi]

2025-12-30 Thread via GitHub


hudi-bot commented on PR #17599:
URL: https://github.com/apache/hudi/pull/17599#issuecomment-378152

   
   ## CI report:
   
   * a88ae9de253ea06e1d440df4539934614cff2392 Azure: 
[FAILURE](https://dev.azure.com/apachehudi/a1a51da7-8592-47d4-88dc-fd67bed336bb/_build/results?buildId=10681)
 
   
   



Re: [PR] feat(schema): Phase 18 - HoodieAvroUtils removal (Part 1) [hudi]

2025-12-30 Thread via GitHub


hudi-bot commented on PR #17599:
URL: https://github.com/apache/hudi/pull/17599#issuecomment-3699982394

   
   ## CI report:
   
   * ab69600734f37d289522abf866b6077f3bd97132 Azure: 
[FAILURE](https://dev.azure.com/apachehudi/a1a51da7-8592-47d4-88dc-fd67bed336bb/_build/results?buildId=10678)
 
   * a88ae9de253ea06e1d440df4539934614cff2392 UNKNOWN
   
   



Re: [PR] feat(schema): Phase 18 - HoodieAvroUtils removal (Part 1) [hudi]

2025-12-30 Thread via GitHub


hudi-bot commented on PR #17599:
URL: https://github.com/apache/hudi/pull/17599#issuecomment-3699860887

   
   ## CI report:
   
   * 3bd409f28d01659e6537e967b1263867f7cbcb3a Azure: 
[FAILURE](https://dev.azure.com/apachehudi/a1a51da7-8592-47d4-88dc-fd67bed336bb/_build/results?buildId=10665)
 
   * ab69600734f37d289522abf866b6077f3bd97132 Azure: 
[PENDING](https://dev.azure.com/apachehudi/a1a51da7-8592-47d4-88dc-fd67bed336bb/_build/results?buildId=10678)
 
   * a88ae9de253ea06e1d440df4539934614cff2392 UNKNOWN
   
   



Re: [PR] feat(schema): Phase 18 - HoodieAvroUtils removal (Part 1) [hudi]

2025-12-30 Thread via GitHub


hudi-bot commented on PR #17599:
URL: https://github.com/apache/hudi/pull/17599#issuecomment-3699857714

   
   ## CI report:
   
   * 3bd409f28d01659e6537e967b1263867f7cbcb3a Azure: 
[FAILURE](https://dev.azure.com/apachehudi/a1a51da7-8592-47d4-88dc-fd67bed336bb/_build/results?buildId=10665)
 
   * ab69600734f37d289522abf866b6077f3bd97132 Azure: 
[PENDING](https://dev.azure.com/apachehudi/a1a51da7-8592-47d4-88dc-fd67bed336bb/_build/results?buildId=10678)
 
   
   



Re: [PR] feat(schema): Phase 18 - HoodieAvroUtils removal (Part 1) [hudi]

2025-12-30 Thread via GitHub


hudi-bot commented on PR #17599:
URL: https://github.com/apache/hudi/pull/17599#issuecomment-3699854611

   
   ## CI report:
   
   * 3bd409f28d01659e6537e967b1263867f7cbcb3a Azure: 
[FAILURE](https://dev.azure.com/apachehudi/a1a51da7-8592-47d4-88dc-fd67bed336bb/_build/results?buildId=10665)
 
   * ab69600734f37d289522abf866b6077f3bd97132 UNKNOWN
   
   



Re: [PR] feat(schema): Phase 18 - HoodieAvroUtils removal (Part 1) [hudi]

2025-12-30 Thread via GitHub


hudi-bot commented on PR #17599:
URL: https://github.com/apache/hudi/pull/17599#issuecomment-3699045289

   
   ## CI report:
   
   * 3bd409f28d01659e6537e967b1263867f7cbcb3a Azure: 
[FAILURE](https://dev.azure.com/apachehudi/a1a51da7-8592-47d4-88dc-fd67bed336bb/_build/results?buildId=10665)
 
   
   



Re: [PR] feat(schema): Phase 18 - HoodieAvroUtils removal (Part 1) [hudi]

2025-12-30 Thread via GitHub


hudi-bot commented on PR #17599:
URL: https://github.com/apache/hudi/pull/17599#issuecomment-3698995601

   
   ## CI report:
   
   * 3bd409f28d01659e6537e967b1263867f7cbcb3a Azure: 
[PENDING](https://dev.azure.com/apachehudi/a1a51da7-8592-47d4-88dc-fd67bed336bb/_build/results?buildId=10665)
 
   
   



Re: [PR] feat(schema): Phase 18 - HoodieAvroUtils removal (Part 1) [hudi]

2025-12-30 Thread via GitHub


hudi-bot commented on PR #17599:
URL: https://github.com/apache/hudi/pull/17599#issuecomment-3698991179

   
   ## CI report:
   
   * 3bd409f28d01659e6537e967b1263867f7cbcb3a Azure: 
[PENDING](https://dev.azure.com/apachehudi/a1a51da7-8592-47d4-88dc-fd67bed336bb/_build/results?buildId=10665)
 
   
   



Re: [PR] feat(schema): Phase 18 - HoodieAvroUtils removal (Part 1) [hudi]

2025-12-30 Thread via GitHub


hudi-bot commented on PR #17599:
URL: https://github.com/apache/hudi/pull/17599#issuecomment-3698987141

   
   ## CI report:
   
   * 3bd409f28d01659e6537e967b1263867f7cbcb3a UNKNOWN
   
   



Re: [PR] feat(schema): Phase 18 - HoodieAvroUtils removal (Part 1) [hudi]

2025-12-29 Thread via GitHub


linliu-code commented on code in PR #17599:
URL: https://github.com/apache/hudi/pull/17599#discussion_r2652402940


##
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/index/HoodieIndexUtils.java:
##
@@ -150,50 +153,41 @@ static boolean 
validateDataTypeForSecondaryIndex(List sourceFields, Sche
* @param tableSchema  table schema
* @return true if each field's data types are supported, false otherwise
*/
-  public static boolean 
validateDataTypeForSecondaryOrExpressionIndex(List sourceFields, Schema 
tableSchema) {
+  public static boolean 
validateDataTypeForSecondaryOrExpressionIndex(List sourceFields, 
HoodieSchema tableSchema) {
 return sourceFields.stream().anyMatch(fieldToIndex -> {
-  Schema schema = getNestedFieldSchemaFromWriteSchema(tableSchema, 
fieldToIndex);
-  return schema.getType() != Schema.Type.RECORD && schema.getType() != 
Schema.Type.ARRAY && schema.getType() != Schema.Type.MAP;
+  Option> nestedFieldOpt = 
HoodieSchemaUtils.getNestedField(tableSchema, fieldToIndex);
+  if (nestedFieldOpt.isEmpty()) {
+throw new HoodieException("Failed to get schema. Not a valid field 
name: " + fieldToIndex);
+  }
+  HoodieSchema fieldSchema = nestedFieldOpt.get().getRight().schema();
+  return fieldSchema.getType() != HoodieSchemaType.RECORD && 
fieldSchema.getType() != HoodieSchemaType.ARRAY && fieldSchema.getType() != 
HoodieSchemaType.MAP;
 });
   }
 
   /**
* Check if the given schema type is supported for secondary index.
* Supported types are: String (including CHAR), Integer types (Int, BigInt, 
Long, Short), and timestamp
*/
-  private static boolean isSecondaryIndexSupportedType(Schema schema) {
+  private static boolean isSecondaryIndexSupportedType(HoodieSchema schema) {
 // Handle union types (nullable fields)
-if (schema.getType() == Schema.Type.UNION) {
+if (schema.getType() == HoodieSchemaType.UNION) {
   // For union types, check if any of the types is supported
   return schema.getTypes().stream()
-  .anyMatch(s -> s.getType() != Schema.Type.NULL && 
isSecondaryIndexSupportedType(s));
+  .anyMatch(s -> s.getType() != HoodieSchemaType.NULL && 
isSecondaryIndexSupportedType(s));
 }
 
 // Check basic types
 switch (schema.getType()) {
   case STRING:
-// STRING type can have UUID logical type which we don't support
-return schema.getLogicalType() == null; // UUID and other string-based 
logical types are not supported
-  // Regular STRING (includes CHAR)
   case INT:
-// INT type can represent regular integers or dates/times with logical 
types
-if (schema.getLogicalType() != null) {
-  // Support date and time-millis logical types
-  return schema.getLogicalType() == LogicalTypes.date()
-  || schema.getLogicalType() == LogicalTypes.timeMillis();
-}
-return true; // Regular INT
   case LONG:
-// LONG type can represent regular longs or timestamps with logical 
types
-if (schema.getLogicalType() != null) {
-  // Support timestamp logical types
-  return schema.getLogicalType() == LogicalTypes.timestampMillis()
-  || schema.getLogicalType() == LogicalTypes.timestampMicros()
-  || schema.getLogicalType() == LogicalTypes.timeMicros();
-}
-return true; // Regular LONG
   case DOUBLE:
-return true; // Support DOUBLE type
+  case DATE:

Review Comment:
   @voonhous, I do not see any reason why Float cannot be supported. Please 
add a Float case.






Re: [PR] feat(schema): Phase 18 - HoodieAvroUtils removal (Part 1) [hudi]

2025-12-29 Thread via GitHub


hudi-bot commented on PR #17599:
URL: https://github.com/apache/hudi/pull/17599#issuecomment-3698346074

   
   ## CI report:
   
   * 9df0e72b27b6c2aad3ca2976cfbf703dd0ddb7ea UNKNOWN
   * 7cac8212f647bce559956f4dd2924b5456520f07 UNKNOWN
   * 3bd409f28d01659e6537e967b1263867f7cbcb3a Azure: 
[FAILURE](https://dev.azure.com/apachehudi/a1a51da7-8592-47d4-88dc-fd67bed336bb/_build/results?buildId=10665)
 
   
   



Re: [PR] feat(schema): Phase 18 - HoodieAvroUtils removal (Part 1) [hudi]

2025-12-29 Thread via GitHub


hudi-bot commented on PR #17599:
URL: https://github.com/apache/hudi/pull/17599#issuecomment-3698325731

   
   ## CI report:
   
   * 9df0e72b27b6c2aad3ca2976cfbf703dd0ddb7ea UNKNOWN
   * 7cac8212f647bce559956f4dd2924b5456520f07 UNKNOWN
   * c47715467f0be2e1216433fcc94981541a93ee7c Azure: 
[FAILURE](https://dev.azure.com/apachehudi/a1a51da7-8592-47d4-88dc-fd67bed336bb/_build/results?buildId=10663)
 
   * 3bd409f28d01659e6537e967b1263867f7cbcb3a Azure: 
[PENDING](https://dev.azure.com/apachehudi/a1a51da7-8592-47d4-88dc-fd67bed336bb/_build/results?buildId=10665)
 
   
   



Re: [PR] feat(schema): Phase 18 - HoodieAvroUtils removal (Part 1) [hudi]

2025-12-29 Thread via GitHub


hudi-bot commented on PR #17599:
URL: https://github.com/apache/hudi/pull/17599#issuecomment-3698295323

   
   ## CI report:
   
   * 9df0e72b27b6c2aad3ca2976cfbf703dd0ddb7ea UNKNOWN
   * 7cac8212f647bce559956f4dd2924b5456520f07 UNKNOWN
   * 37cd2425506e26ab045e2882dc9b88978c09bec2 Azure: 
[FAILURE](https://dev.azure.com/apachehudi/a1a51da7-8592-47d4-88dc-fd67bed336bb/_build/results?buildId=10639)
 
   * c47715467f0be2e1216433fcc94981541a93ee7c Azure: 
[PENDING](https://dev.azure.com/apachehudi/a1a51da7-8592-47d4-88dc-fd67bed336bb/_build/results?buildId=10663)
 
   * 3bd409f28d01659e6537e967b1263867f7cbcb3a Azure: 
[PENDING](https://dev.azure.com/apachehudi/a1a51da7-8592-47d4-88dc-fd67bed336bb/_build/results?buildId=10665)
 
   
   



Re: [PR] feat(schema): Phase 18 - HoodieAvroUtils removal (Part 1) [hudi]

2025-12-29 Thread via GitHub


hudi-bot commented on PR #17599:
URL: https://github.com/apache/hudi/pull/17599#issuecomment-3698258658

   
   ## CI report:
   
   * 9df0e72b27b6c2aad3ca2976cfbf703dd0ddb7ea UNKNOWN
   * 7cac8212f647bce559956f4dd2924b5456520f07 UNKNOWN
   * 37cd2425506e26ab045e2882dc9b88978c09bec2 Azure: 
[FAILURE](https://dev.azure.com/apachehudi/a1a51da7-8592-47d4-88dc-fd67bed336bb/_build/results?buildId=10639)
 
   * c47715467f0be2e1216433fcc94981541a93ee7c Azure: 
[PENDING](https://dev.azure.com/apachehudi/a1a51da7-8592-47d4-88dc-fd67bed336bb/_build/results?buildId=10663)
 
   * 3bd409f28d01659e6537e967b1263867f7cbcb3a UNKNOWN
   
   



Re: [PR] feat(schema): Phase 18 - HoodieAvroUtils removal (Part 1) [hudi]

2025-12-29 Thread via GitHub


hudi-bot commented on PR #17599:
URL: https://github.com/apache/hudi/pull/17599#issuecomment-3698205795

   
   ## CI report:
   
   * 9df0e72b27b6c2aad3ca2976cfbf703dd0ddb7ea UNKNOWN
   * 7cac8212f647bce559956f4dd2924b5456520f07 UNKNOWN
   * 37cd2425506e26ab045e2882dc9b88978c09bec2 Azure: 
[FAILURE](https://dev.azure.com/apachehudi/a1a51da7-8592-47d4-88dc-fd67bed336bb/_build/results?buildId=10639)
 
   * c47715467f0be2e1216433fcc94981541a93ee7c Azure: 
[PENDING](https://dev.azure.com/apachehudi/a1a51da7-8592-47d4-88dc-fd67bed336bb/_build/results?buildId=10663)
 
   
   



Re: [PR] feat(schema): Phase 18 - HoodieAvroUtils removal (Part 1) [hudi]

2025-12-29 Thread via GitHub


hudi-bot commented on PR #17599:
URL: https://github.com/apache/hudi/pull/17599#issuecomment-3698201830

   
   ## CI report:
   
   * 9df0e72b27b6c2aad3ca2976cfbf703dd0ddb7ea UNKNOWN
   * 7cac8212f647bce559956f4dd2924b5456520f07 UNKNOWN
   * 37cd2425506e26ab045e2882dc9b88978c09bec2 Azure: 
[FAILURE](https://dev.azure.com/apachehudi/a1a51da7-8592-47d4-88dc-fd67bed336bb/_build/results?buildId=10639)
 
   * c47715467f0be2e1216433fcc94981541a93ee7c UNKNOWN
   
   



Re: [PR] feat(schema): Phase 18 - HoodieAvroUtils removal (Part 1) [hudi]

2025-12-29 Thread via GitHub


voonhous commented on code in PR #17599:
URL: https://github.com/apache/hudi/pull/17599#discussion_r2652133424


##
hudi-client/hudi-client-common/src/test/java/org/apache/hudi/index/TestHoodieIndexUtils.java:
##
@@ -227,32 +218,31 @@ public void testValidateDataTypeForSecondaryIndex() {
   @Test
   public void testValidateDataTypeForSecondaryIndexWithLogicalTypes() {
 // Supported logical types
-Schema timestampMillis = LogicalTypes.timestampMillis().addToSchema(Schema.create(Schema.Type.LONG));
-Schema timestampMicros = LogicalTypes.timestampMicros().addToSchema(Schema.create(Schema.Type.LONG));
-Schema date = LogicalTypes.date().addToSchema(Schema.create(Schema.Type.INT));
-Schema timeMillis = LogicalTypes.timeMillis().addToSchema(Schema.create(Schema.Type.INT));
-Schema timeMicros = LogicalTypes.timeMicros().addToSchema(Schema.create(Schema.Type.LONG));
-
+HoodieSchema timestampMillis = HoodieSchema.createTimestampMillis();
+HoodieSchema timestampMicros = HoodieSchema.createTimestampMicros();
+HoodieSchema date = HoodieSchema.createDate();
+HoodieSchema timeMillis = HoodieSchema.createTimeMillis();
+HoodieSchema timeMicros = HoodieSchema.createTimeMicros();
+
 // Unsupported logical types
-Schema decimal = LogicalTypes.decimal(10, 2).addToSchema(Schema.create(Schema.Type.BYTES));
-Schema uuid = LogicalTypes.uuid().addToSchema(Schema.create(Schema.Type.STRING));
-Schema localTimestampMillis = LogicalTypes.localTimestampMillis().addToSchema(Schema.create(Schema.Type.LONG));
-Schema localTimestampMicros = LogicalTypes.localTimestampMicros().addToSchema(Schema.create(Schema.Type.LONG));
-
-Schema schemaWithLogicalTypes = SchemaBuilder.record("TestRecord")
-.fields()
+HoodieSchema decimal = HoodieSchema.createDecimal(10, 2);
+HoodieSchema uuid = HoodieSchema.createUUID();
+HoodieSchema localTimestampMillis = HoodieSchema.createLocalTimestampMillis();
+HoodieSchema localTimestampMicros = HoodieSchema.createLocalTimestampMicros();
+
+HoodieSchema schemaWithLogicalTypes = HoodieSchema.createRecord("TestRecord", null, null, Arrays.asList(
 // Supported logical types
-.name("timestampMillisField").type(timestampMillis).noDefault()
-.name("timestampMicrosField").type(timestampMicros).noDefault()
-.name("dateField").type(date).noDefault()
-.name("timeMillisField").type(timeMillis).noDefault()
-.name("timeMicrosField").type(timeMicros).noDefault()
+HoodieSchemaField.of("timestampMillisField", timestampMillis),
+HoodieSchemaField.of("timestampMicrosField", timestampMicros),
+HoodieSchemaField.of("dateField", date),
+HoodieSchemaField.of("timeMillisField", timeMillis),
+HoodieSchemaField.of("timeMicrosField", timeMicros),
 // Unsupported logical types
-.name("decimalField").type(decimal).noDefault()
-.name("uuidField").type(uuid).noDefault()
-.name("localTimestampMillisField").type(localTimestampMillis).noDefault()
-.name("localTimestampMicrosField").type(localTimestampMicros).noDefault()
-.endRecord();
+HoodieSchemaField.of("decimalField", decimal),

Review Comment:
   Done






Re: [PR] feat(schema): Phase 18 - HoodieAvroUtils removal (Part 1) [hudi]

2025-12-29 Thread via GitHub


voonhous commented on code in PR #17599:
URL: https://github.com/apache/hudi/pull/17599#discussion_r2652132379


##
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/index/HoodieIndexUtils.java:
##
@@ -150,50 +153,41 @@ static boolean validateDataTypeForSecondaryIndex(List sourceFields, Sche
* @param tableSchema  table schema
* @return true if each field's data types are supported, false otherwise
*/
-  public static boolean validateDataTypeForSecondaryOrExpressionIndex(List sourceFields, Schema tableSchema) {
+  public static boolean validateDataTypeForSecondaryOrExpressionIndex(List sourceFields, HoodieSchema tableSchema) {
 return sourceFields.stream().anyMatch(fieldToIndex -> {
-  Schema schema = getNestedFieldSchemaFromWriteSchema(tableSchema, fieldToIndex);
-  return schema.getType() != Schema.Type.RECORD && schema.getType() != Schema.Type.ARRAY && schema.getType() != Schema.Type.MAP;
+  Option> nestedFieldOpt = HoodieSchemaUtils.getNestedField(tableSchema, fieldToIndex);
+  if (nestedFieldOpt.isEmpty()) {
+throw new HoodieException("Failed to get schema. Not a valid field name: " + fieldToIndex);
+  }
+  HoodieSchema fieldSchema = nestedFieldOpt.get().getRight().schema();
+  return fieldSchema.getType() != HoodieSchemaType.RECORD && fieldSchema.getType() != HoodieSchemaType.ARRAY && fieldSchema.getType() != HoodieSchemaType.MAP;
 });
   }
 
   /**
* Check if the given schema type is supported for secondary index.
* Supported types are: String (including CHAR), Integer types (Int, BigInt, Long, Short), and timestamp
*/
-  private static boolean isSecondaryIndexSupportedType(Schema schema) {
+  private static boolean isSecondaryIndexSupportedType(HoodieSchema schema) {
 // Handle union types (nullable fields)
-if (schema.getType() == Schema.Type.UNION) {
+if (schema.getType() == HoodieSchemaType.UNION) {
   // For union types, check if any of the types is supported
   return schema.getTypes().stream()
-  .anyMatch(s -> s.getType() != Schema.Type.NULL && isSecondaryIndexSupportedType(s));
+  .anyMatch(s -> s.getType() != HoodieSchemaType.NULL && isSecondaryIndexSupportedType(s));
 }
 
 // Check basic types
 switch (schema.getType()) {
   case STRING:
-// STRING type can have UUID logical type which we don't support
-return schema.getLogicalType() == null; // UUID and other string-based logical types are not supported
-  // Regular STRING (includes CHAR)
   case INT:
-// INT type can represent regular integers or dates/times with logical types
-if (schema.getLogicalType() != null) {
-  // Support date and time-millis logical types
-  return schema.getLogicalType() == LogicalTypes.date()
-  || schema.getLogicalType() == LogicalTypes.timeMillis();
-}
-return true; // Regular INT
   case LONG:
-// LONG type can represent regular longs or timestamps with logical types
-if (schema.getLogicalType() != null) {
-  // Support timestamp logical types
-  return schema.getLogicalType() == LogicalTypes.timestampMillis()
-  || schema.getLogicalType() == LogicalTypes.timestampMicros()
-  || schema.getLogicalType() == LogicalTypes.timeMicros();
-}
-return true; // Regular LONG
   case DOUBLE:
-return true; // Support DOUBLE type
+  case DATE:

Review Comment:
   Since the comment and configs suggest that **Float** is supported, I will add a `case FLOAT`.
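
   A rough sketch of what such a change could look like, not the actual Hudi code: `SketchType` and `isSupported` are hypothetical stand-ins for `HoodieSchemaType` and `isSecondaryIndexSupportedType`, and the list of cases is only assumed from the quoted hunk. The grouped labels fall through to a single `return true`, so supporting another type is one extra `case` line.

```java
// Hypothetical stand-in for illustration only; this is not Hudi's API.
class SecondaryIndexTypeSketch {

  // Assumed subset of schema types, named after the cases visible in the quoted hunk.
  enum SketchType { STRING, INT, LONG, FLOAT, DOUBLE, DATE, BOOLEAN, RECORD }

  // Grouped case labels share one return statement via fall-through.
  static boolean isSupported(SketchType type) {
    switch (type) {
      case STRING:
      case INT:
      case LONG:
      case FLOAT: // the proposed addition
      case DOUBLE:
      case DATE:
        return true;
      default:
        return false;
    }
  }

  public static void main(String[] args) {
    System.out.println("FLOAT supported: " + isSupported(SketchType.FLOAT));
    System.out.println("RECORD supported: " + isSupported(SketchType.RECORD));
  }
}
```

   Keeping every supported type in one group of labels means the support matrix lives in a single place, which makes it easier to keep the Javadoc and the tests in step with it.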






Re: [PR] feat(schema): Phase 18 - HoodieAvroUtils removal (Part 1) [hudi]

2025-12-29 Thread via GitHub


voonhous commented on code in PR #17599:
URL: https://github.com/apache/hudi/pull/17599#discussion_r2652130593


##
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/index/HoodieIndexUtils.java:
##
@@ -150,50 +153,41 @@ static boolean validateDataTypeForSecondaryIndex(List sourceFields, Sche
* @param tableSchema  table schema
* @return true if each field's data types are supported, false otherwise
*/
-  public static boolean validateDataTypeForSecondaryOrExpressionIndex(List sourceFields, Schema tableSchema) {
+  public static boolean validateDataTypeForSecondaryOrExpressionIndex(List sourceFields, HoodieSchema tableSchema) {
 return sourceFields.stream().anyMatch(fieldToIndex -> {
-  Schema schema = getNestedFieldSchemaFromWriteSchema(tableSchema, fieldToIndex);
-  return schema.getType() != Schema.Type.RECORD && schema.getType() != Schema.Type.ARRAY && schema.getType() != Schema.Type.MAP;
+  Option> nestedFieldOpt = HoodieSchemaUtils.getNestedField(tableSchema, fieldToIndex);
+  if (nestedFieldOpt.isEmpty()) {
+throw new HoodieException("Failed to get schema. Not a valid field name: " + fieldToIndex);
+  }
+  HoodieSchema fieldSchema = nestedFieldOpt.get().getRight().schema();
+  return fieldSchema.getType() != HoodieSchemaType.RECORD && fieldSchema.getType() != HoodieSchemaType.ARRAY && fieldSchema.getType() != HoodieSchemaType.MAP;
 });
   }
 
   /**
* Check if the given schema type is supported for secondary index.
* Supported types are: String (including CHAR), Integer types (Int, BigInt, Long, Short), and timestamp
*/
-  private static boolean isSecondaryIndexSupportedType(Schema schema) {
+  private static boolean isSecondaryIndexSupportedType(HoodieSchema schema) {
 // Handle union types (nullable fields)
-if (schema.getType() == Schema.Type.UNION) {
+if (schema.getType() == HoodieSchemaType.UNION) {
   // For union types, check if any of the types is supported
   return schema.getTypes().stream()
-  .anyMatch(s -> s.getType() != Schema.Type.NULL && isSecondaryIndexSupportedType(s));
+  .anyMatch(s -> s.getType() != HoodieSchemaType.NULL && isSecondaryIndexSupportedType(s));
 }
 
 // Check basic types
 switch (schema.getType()) {
   case STRING:
-// STRING type can have UUID logical type which we don't support
-return schema.getLogicalType() == null; // UUID and other string-based logical types are not supported
-  // Regular STRING (includes CHAR)
   case INT:
-// INT type can represent regular integers or dates/times with logical types
-if (schema.getLogicalType() != null) {
-  // Support date and time-millis logical types
-  return schema.getLogicalType() == LogicalTypes.date()
-  || schema.getLogicalType() == LogicalTypes.timeMillis();
-}
-return true; // Regular INT
   case LONG:
-// LONG type can represent regular longs or timestamps with logical types
-if (schema.getLogicalType() != null) {
-  // Support timestamp logical types
-  return schema.getLogicalType() == LogicalTypes.timestampMillis()
-  || schema.getLogicalType() == LogicalTypes.timestampMicros()
-  || schema.getLogicalType() == LogicalTypes.timeMicros();
-}
-return true; // Regular LONG
   case DOUBLE:
-return true; // Support DOUBLE type
+  case DATE:

Review Comment:
   I'll update the comments to remove **FLOAT** from the support matrix, as I can't find any supporting docs or examples showing that we support float.
   
   Please correct me if I'm wrong.






Re: [PR] feat(schema): Phase 18 - HoodieAvroUtils removal (Part 1) [hudi]

2025-12-29 Thread via GitHub


voonhous commented on code in PR #17599:
URL: https://github.com/apache/hudi/pull/17599#discussion_r2652123521


##
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/index/HoodieIndexUtils.java:
##
@@ -150,50 +153,41 @@ static boolean validateDataTypeForSecondaryIndex(List sourceFields, Sche
* @param tableSchema  table schema
* @return true if each field's data types are supported, false otherwise
*/
-  public static boolean validateDataTypeForSecondaryOrExpressionIndex(List sourceFields, Schema tableSchema) {
+  public static boolean validateDataTypeForSecondaryOrExpressionIndex(List sourceFields, HoodieSchema tableSchema) {
 return sourceFields.stream().anyMatch(fieldToIndex -> {
-  Schema schema = getNestedFieldSchemaFromWriteSchema(tableSchema, fieldToIndex);
-  return schema.getType() != Schema.Type.RECORD && schema.getType() != Schema.Type.ARRAY && schema.getType() != Schema.Type.MAP;
+  Option> nestedFieldOpt = HoodieSchemaUtils.getNestedField(tableSchema, fieldToIndex);
+  if (nestedFieldOpt.isEmpty()) {
+throw new HoodieException("Failed to get schema. Not a valid field name: " + fieldToIndex);
+  }
+  HoodieSchema fieldSchema = nestedFieldOpt.get().getRight().schema();
+  return fieldSchema.getType() != HoodieSchemaType.RECORD && fieldSchema.getType() != HoodieSchemaType.ARRAY && fieldSchema.getType() != HoodieSchemaType.MAP;
 });
   }
 
   /**
* Check if the given schema type is supported for secondary index.
* Supported types are: String (including CHAR), Integer types (Int, BigInt, Long, Short), and timestamp
*/
-  private static boolean isSecondaryIndexSupportedType(Schema schema) {
+  private static boolean isSecondaryIndexSupportedType(HoodieSchema schema) {
 // Handle union types (nullable fields)
-if (schema.getType() == Schema.Type.UNION) {
+if (schema.getType() == HoodieSchemaType.UNION) {
   // For union types, check if any of the types is supported
   return schema.getTypes().stream()
-  .anyMatch(s -> s.getType() != Schema.Type.NULL && isSecondaryIndexSupportedType(s));
+  .anyMatch(s -> s.getType() != HoodieSchemaType.NULL && isSecondaryIndexSupportedType(s));
 }
 
 // Check basic types
 switch (schema.getType()) {
   case STRING:
-// STRING type can have UUID logical type which we don't support
-return schema.getLogicalType() == null; // UUID and other string-based logical types are not supported
-  // Regular STRING (includes CHAR)
   case INT:
-// INT type can represent regular integers or dates/times with logical types
-if (schema.getLogicalType() != null) {
-  // Support date and time-millis logical types
-  return schema.getLogicalType() == LogicalTypes.date()
-  || schema.getLogicalType() == LogicalTypes.timeMillis();
-}
-return true; // Regular INT
   case LONG:
-// LONG type can represent regular longs or timestamps with logical types
-if (schema.getLogicalType() != null) {
-  // Support timestamp logical types
-  return schema.getLogicalType() == LogicalTypes.timestampMillis()
-  || schema.getLogicalType() == LogicalTypes.timestampMicros()
-  || schema.getLogicalType() == LogicalTypes.timeMicros();
-}
-return true; // Regular LONG
   case DOUBLE:
-return true; // Support DOUBLE type
+  case DATE:

Review Comment:
   Let me verify this; the comment is a little confusing. It says `Float and Double are now supported`, but the test for **Float** uses `assertThrows`, while the test for **Double** uses `assertDoesNotThrow`. I might need to check separately with @linliu-code what's going on here.






Re: [PR] feat(schema): Phase 18 - HoodieAvroUtils removal (Part 1) [hudi]

2025-12-29 Thread via GitHub


voonhous commented on code in PR #17599:
URL: https://github.com/apache/hudi/pull/17599#discussion_r2652118691


##
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/index/HoodieIndexUtils.java:
##
@@ -150,50 +153,41 @@ static boolean validateDataTypeForSecondaryIndex(List sourceFields, Sche
* @param tableSchema  table schema
* @return true if each field's data types are supported, false otherwise
*/
-  public static boolean validateDataTypeForSecondaryOrExpressionIndex(List sourceFields, Schema tableSchema) {
+  public static boolean validateDataTypeForSecondaryOrExpressionIndex(List sourceFields, HoodieSchema tableSchema) {

Review Comment:
   Sure, will add a sub task for this.
   
   https://github.com/apache/hudi/issues/17750
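
   For context, the quoted Javadoc describes the return value as true if each field's data types are supported, while the body of this method, as quoted elsewhere in this thread, uses `anyMatch`. Whether the sub task addresses exactly this is not stated here. A minimal sketch of how the two stream operations differ, using a hypothetical `IS_FLAT` predicate and made-up field names instead of Hudi's schema lookup:

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical illustration only; none of these names exist in Hudi.
class MatchSemanticsSketch {

  // Assumption: names starting with "nested_" stand in for RECORD/ARRAY/MAP fields.
  static final Predicate<String> IS_FLAT = name -> !name.startsWith("nested_");

  // True as soon as one field passes the predicate.
  static boolean anyFlat(List<String> fields) {
    return fields.stream().anyMatch(IS_FLAT);
  }

  // True only when every field passes the predicate.
  static boolean allFlat(List<String> fields) {
    return fields.stream().allMatch(IS_FLAT);
  }

  public static void main(String[] args) {
    List<String> mixed = Arrays.asList("id", "nested_address");
    System.out.println(anyFlat(mixed)); // prints true: "id" passes
    System.out.println(allFlat(mixed)); // prints false: "nested_address" fails
  }
}
```

   With a mixed list, `anyMatch` accepts the input and `allMatch` rejects it, so the choice determines whether a single unsupported field fails validation.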






Re: [PR] feat(schema): Phase 18 - HoodieAvroUtils removal (Part 1) [hudi]

2025-12-29 Thread via GitHub


voonhous commented on code in PR #17599:
URL: https://github.com/apache/hudi/pull/17599#discussion_r2652115337


##
hudi-client/hudi-client-common/src/test/java/org/apache/hudi/client/transaction/TestConcurrentSchemaEvolutionTableSchemaGetter.java:
##
@@ -383,7 +383,7 @@ private static Stream schemaTestParams() {
 
   @ParameterizedTest
   @MethodSource("schemaTestParams")
-  void testGetTableAvroSchema(Schema inputSchema, boolean includeMetadataFields, Schema expectedSchema) throws Exception {
+  void testGetTableAvroSchema(HoodieSchema inputSchema, boolean includeMetadataFields, HoodieSchema expectedSchema) throws Exception {

Review Comment:
   Done!






Re: [PR] feat(schema): Phase 18 - HoodieAvroUtils removal (Part 1) [hudi]

2025-12-29 Thread via GitHub


voonhous commented on code in PR #17599:
URL: https://github.com/apache/hudi/pull/17599#discussion_r2652115123


##
hudi-common/src/main/java/org/apache/hudi/common/schema/HoodieSchemaComparatorForSchemaEvolution.java:
##
@@ -0,0 +1,346 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *  http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hudi.common.schema;
+
+import lombok.AccessLevel;
+import lombok.NoArgsConstructor;
+
+import java.util.List;
+import java.util.Set;
+import java.util.stream.Collectors;
+
+/**
+ * Defines equality comparison rules for HoodieSchema schemas for schema evolution purposes.
+ *
+ * This class provides schema comparison logic that focuses only on attributes that affect
+ * data readers/writers, ignoring metadata like documentation, namespace, and aliases which
+ * don't impact schema evolution compatibility.
+ *
+ * Common Rules Across All Types
+ * Included in equality check:
+ * 
+ *   Name/identifier
+ *   Type including primitive type, complex type (see below), and logical type
+ * 
+ * Excluded from equality check:
+ * 
+ *   Namespace
+ *   Documentation
+ *   Aliases
+ *   Custom properties
+ * 
+ *
+ * Type-Specific Rules
+ *
+ * Record
+ * Included:
+ * 
+ *   Field names
+ *   Field types
+ *   Field order attribute
+ *   Default values
+ * 
+ * Excluded:
+ * 
+ *   Field documentation
+ *   Field aliases
+ * 
+ *
+ * Enum
+ * Included:
+ * 
+ *   Name
+ *   Symbol order
+ *   Symbol value
+ * 
+ * Excluded:
+ * 
+ *   Custom properties
+ * 
+ *
+ * Array
+ * Included:
+ * 
+ *   Items schema
+ * 
+ * Excluded:
+ * 
+ *   Documentation
+ *   Custom properties
+ * 
+ *
+ * Map
+ * Included:
+ * 
+ *   Values schema
+ * 
+ * Excluded:
+ * 
+ *   Documentation
+ *   Custom properties
+ * 
+ *
+ * Fixed
+ * Included:
+ * 
+ *   Size
+ *   Name
+ * 
+ * Excluded:
+ * 
+ *   Namespace
+ *   Aliases
+ * 
+ *
+ * Union
+ * Included:
+ * 
+ *   Member types
+ * 
+ * Excluded:
+ * 
+ *   Member order
+ * 
+ *
+ * Logical Types
+ * Included:
+ * 
+ *   Logical type name (via schema subclass)
+ *   Underlying primitive type
+ *   Decimal precision/scale (if applicable)
+ *   Timestamp/Time precision (if applicable)
+ * 
+ * Excluded:
+ * 
+ *   Documentation
+ *   Custom properties
+ * 
+ */
+@NoArgsConstructor(access = AccessLevel.PRIVATE)
+public class HoodieSchemaComparatorForSchemaEvolution {
+
+  private static final HoodieSchemaComparatorForSchemaEvolution VALIDATOR = new HoodieSchemaComparatorForSchemaEvolution();
+
+  public static boolean schemaEquals(HoodieSchema s1, HoodieSchema s2) {
+return VALIDATOR.schemaEqualsInternal(s1, s2);
+  }
+
+  protected boolean schemaEqualsInternal(HoodieSchema s1, HoodieSchema s2) {
+if (s1 == s2) {
+  return true;
+}
+if (s1 == null || s2 == null) {
+  return false;
+}
+if (s1.getType() != s2.getType()) {
+  return false;
+}
+
+switch (s1.getType()) {
+  case RECORD:
+return recordSchemaEquals(s1, s2);
+  case ENUM:
+return enumSchemaEquals(s1, s2);
+  case ARRAY:
+return arraySchemaEquals(s1, s2);
+  case MAP:
+return mapSchemaEquals(s1, s2);
+  case FIXED:
+return fixedSchemaEquals(s1, s2);
+  case UNION:
+return unionSchemaEquals(s1, s2);
+  case STRING:
+  case BYTES:
+  case INT:
+  case LONG:
+  case FLOAT:
+  case DOUBLE:
+  case BOOLEAN:
+  case NULL:
+  case DATE:
+// DATE is INT with date logical type (no additional properties)
+  case UUID:
+// UUID is STRING with uuid logical type (no additional properties)
+return true;
+  case DECIMAL:
+return decimalSchemaEquals(s1, s2);
+  case TIME:
+return timeSchemaEquals(s1, s2);
+  case TIMESTAMP:
+return timestampSchemaEquals(s1, s2);
+  default:
+throw new IllegalArgumentException("Unknown schema type: " + s1.getType());
+}
+  }
+
+  protected boolean validateRecord(HoodieSchema s1, HoodieSchema s2) {
+return s1.isError() == s2.isError();
+  }
+
+  private boolean recordSchemaEquals(HoodieSchema s1, HoodieSchema s2) {
+if (!validateRecord(s1, s2)) {
+  return false;
+}
+
+List 

Re: [PR] feat(schema): Phase 18 - HoodieAvroUtils removal (Part 1) [hudi]

2025-12-29 Thread via GitHub


voonhous commented on code in PR #17599:
URL: https://github.com/apache/hudi/pull/17599#discussion_r2652114301


##
hudi-common/src/main/java/org/apache/hudi/common/schema/HoodieSchemaComparatorForSchemaEvolution.java:
##
@@ -0,0 +1,346 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *  http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hudi.common.schema;
+
+import lombok.AccessLevel;
+import lombok.NoArgsConstructor;
+
+import java.util.List;
+import java.util.Set;
+import java.util.stream.Collectors;
+
+/**
+ * Defines equality comparison rules for HoodieSchema schemas for schema evolution purposes.
+ *
+ * This class provides schema comparison logic that focuses only on attributes that affect
+ * data readers/writers, ignoring metadata like documentation, namespace, and aliases which
+ * don't impact schema evolution compatibility.
+ *
+ * Common Rules Across All Types
+ * Included in equality check:
+ * 
+ *   Name/identifier
+ *   Type including primitive type, complex type (see below), and logical type
+ * 
+ * Excluded from equality check:
+ * 
+ *   Namespace
+ *   Documentation
+ *   Aliases
+ *   Custom properties
+ * 
+ *
+ * Type-Specific Rules
+ *
+ * Record
+ * Included:
+ * 
+ *   Field names
+ *   Field types
+ *   Field order attribute
+ *   Default values
+ * 
+ * Excluded:
+ * 
+ *   Field documentation
+ *   Field aliases
+ * 
+ *
+ * Enum
+ * Included:
+ * 
+ *   Name
+ *   Symbol order
+ *   Symbol value
+ * 
+ * Excluded:
+ * 
+ *   Custom properties
+ * 
+ *
+ * Array
+ * Included:
+ * 
+ *   Items schema
+ * 
+ * Excluded:
+ * 
+ *   Documentation
+ *   Custom properties
+ * 
+ *
+ * Map
+ * Included:
+ * 
+ *   Values schema
+ * 
+ * Excluded:
+ * 
+ *   Documentation
+ *   Custom properties
+ * 
+ *
+ * Fixed
+ * Included:
+ * 
+ *   Size
+ *   Name
+ * 
+ * Excluded:
+ * 
+ *   Namespace
+ *   Aliases
+ * 
+ *
+ * Union
+ * Included:
+ * 
+ *   Member types
+ * 
+ * Excluded:
+ * 
+ *   Member order
+ * 
+ *
+ * Logical Types
+ * Included:
+ * 
+ *   Logical type name (via schema subclass)
+ *   Underlying primitive type
+ *   Decimal precision/scale (if applicable)
+ *   Timestamp/Time precision (if applicable)
+ * 
+ * Excluded:
+ * 
+ *   Documentation
+ *   Custom properties
+ * 
+ */
+@NoArgsConstructor(access = AccessLevel.PRIVATE)
+public class HoodieSchemaComparatorForSchemaEvolution {
+
+  private static final HoodieSchemaComparatorForSchemaEvolution VALIDATOR = new HoodieSchemaComparatorForSchemaEvolution();
+
+  public static boolean schemaEquals(HoodieSchema s1, HoodieSchema s2) {
+return VALIDATOR.schemaEqualsInternal(s1, s2);
+  }
+
+  protected boolean schemaEqualsInternal(HoodieSchema s1, HoodieSchema s2) {
+if (s1 == s2) {
+  return true;
+}
+if (s1 == null || s2 == null) {
+  return false;
+}
+if (s1.getType() != s2.getType()) {
+  return false;
+}
+
+switch (s1.getType()) {
+  case RECORD:
+return recordSchemaEquals(s1, s2);
+  case ENUM:
+return enumSchemaEquals(s1, s2);
+  case ARRAY:
+return arraySchemaEquals(s1, s2);
+  case MAP:
+return mapSchemaEquals(s1, s2);
+  case FIXED:
+return fixedSchemaEquals(s1, s2);
+  case UNION:
+return unionSchemaEquals(s1, s2);
+  case STRING:
+  case BYTES:
+  case INT:
+  case LONG:
+  case FLOAT:
+  case DOUBLE:
+  case BOOLEAN:
+  case NULL:
+  case DATE:
+// DATE is INT with date logical type (no additional properties)
+  case UUID:
+// UUID is STRING with uuid logical type (no additional properties)
+return true;
+  case DECIMAL:
+return decimalSchemaEquals(s1, s2);
+  case TIME:
+return timeSchemaEquals(s1, s2);
+  case TIMESTAMP:
+return timestampSchemaEquals(s1, s2);
+  default:
+throw new IllegalArgumentException("Unknown schema type: " + s1.getType());
+}
+  }
+
+  protected boolean validateRecord(HoodieSchema s1, HoodieSchema s2) {
+return s1.isError() == s2.isError();
+  }
+
+  private boolean recordSchemaEquals(HoodieSchema s1, HoodieSchema s2) {
+if (!validateRecord(s1, s2)) {
+  return false;
+}
+
+List 

Re: [PR] feat(schema): Phase 18 - HoodieAvroUtils removal (Part 1) [hudi]

2025-12-29 Thread via GitHub


voonhous commented on code in PR #17599:
URL: https://github.com/apache/hudi/pull/17599#discussion_r2652106830


##
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/index/HoodieIndexUtils.java:
##
@@ -136,10 +136,13 @@ public static List getLatestBaseFilesForPartition(String partiti
* @param tableSchema  table schema
* @return true if each field's data type are supported for secondary index, false otherwise
*/
-  static boolean validateDataTypeForSecondaryIndex(List sourceFields, Schema tableSchema) {
+  static boolean validateDataTypeForSecondaryIndex(List sourceFields, HoodieSchema tableSchema) {
 return sourceFields.stream().allMatch(fieldToIndex -> {
-  Schema schema = getNestedFieldSchemaFromWriteSchema(tableSchema, fieldToIndex);
-  return isSecondaryIndexSupportedType(schema);
+  Option> schema = HoodieSchemaUtils.getNestedField(tableSchema, fieldToIndex);
+  if (schema.isEmpty()) {
+throw new HoodieException("Failed to get schema. Not a valid field name: " + fieldToIndex);
+  }

Review Comment:
   Good idea!
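
The fail-fast lookup in the diff quoted above can be sketched in isolation. This is a minimal illustration only, assuming a plain `Map` as a stand-in for the table schema: `FieldLookupSketch`, `findFieldType`, and `allFieldsIndexable` are hypothetical names, `java.util.Optional` replaces Hudi's `Option`, and `IllegalArgumentException` replaces `HoodieException`. The "boolean is unsupported" rule is a placeholder, not the real type check.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

final class FieldLookupSketch {

  // Hypothetical stand-in for a nested-field lookup: empty when the field does not exist.
  static Optional<String> findFieldType(Map<String, String> schema, String field) {
    return Optional.ofNullable(schema.get(field));
  }

  // Throws on an unknown field name instead of failing later on a null schema.
  static boolean allFieldsIndexable(List<String> fields, Map<String, String> schema) {
    return fields.stream().allMatch(field -> {
      Optional<String> type = findFieldType(schema, field);
      if (!type.isPresent()) {
        throw new IllegalArgumentException("Failed to get schema. Not a valid field name: " + field);
      }
      // Placeholder rule for this sketch: treat "boolean" as the only unsupported type.
      return !"boolean".equals(type.get());
    });
  }

  public static void main(String[] args) {
    Map<String, String> schema = new HashMap<>();
    schema.put("id", "long");
    schema.put("active", "boolean");
    System.out.println(allFieldsIndexable(Arrays.asList("id"), schema));
    System.out.println(allFieldsIndexable(Arrays.asList("id", "active"), schema));
  }
}
```

Checking for the empty result right at the lookup keeps the error message tied to the offending field name.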






Re: [PR] feat(schema): Phase 18 - HoodieAvroUtils removal (Part 1) [hudi]

2025-12-29 Thread via GitHub


voonhous commented on code in PR #17599:
URL: https://github.com/apache/hudi/pull/17599#discussion_r2652104980


##
hudi-common/src/main/java/org/apache/hudi/common/schema/HoodieSchemaComparatorForSchemaEvolution.java:
##
@@ -0,0 +1,346 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *  http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hudi.common.schema;
+
+import lombok.AccessLevel;
+import lombok.NoArgsConstructor;
+
+import java.util.List;
+import java.util.Set;
+import java.util.stream.Collectors;
+
+/**
+ * Defines equality comparison rules for HoodieSchema schemas for schema evolution purposes.
+ *
+ * This class provides schema comparison logic that focuses only on attributes that affect
+ * data readers/writers, ignoring metadata like documentation, namespace, and aliases which
+ * don't impact schema evolution compatibility.
+ *
+ * Common Rules Across All Types
+ * Included in equality check:
+ * 
+ *   Name/identifier
+ *   Type including primitive type, complex type (see below), and logical type
+ * 
+ * Excluded from equality check:
+ * 
+ *   Namespace
+ *   Documentation
+ *   Aliases
+ *   Custom properties
+ * 
+ *
+ * Type-Specific Rules
+ *
+ * Record
+ * Included:
+ * 
+ *   Field names
+ *   Field types
+ *   Field order attribute
+ *   Default values
+ * 
+ * Excluded:
+ * 
+ *   Field documentation
+ *   Field aliases
+ * 
+ *
+ * Enum
+ * Included:
+ * 
+ *   Name
+ *   Symbol order
+ *   Symbol value
+ * 
+ * Excluded:
+ * 
+ *   Custom properties
+ * 
+ *
+ * Array
+ * Included:
+ * 
+ *   Items schema
+ * 
+ * Excluded:
+ * 
+ *   Documentation
+ *   Custom properties
+ * 
+ *
+ * Map
+ * Included:
+ * 
+ *   Values schema
+ * 
+ * Excluded:
+ * 
+ *   Documentation
+ *   Custom properties
+ * 
+ *
+ * Fixed
+ * Included:
+ * 
+ *   Size
+ *   Name
+ * 
+ * Excluded:
+ * 
+ *   Namespace
+ *   Aliases
+ * 
+ *
+ * Union
+ * Included:
+ * 
+ *   Member types
+ * 
+ * Excluded:
+ * 
+ *   Member order
+ * 
+ *
+ * Logical Types
+ * Included:
+ * 
+ *   Logical type name (via schema subclass)
+ *   Underlying primitive type
+ *   Decimal precision/scale (if applicable)
+ *   Timestamp/Time precision (if applicable)
+ * 
+ * Excluded:
+ * 
+ *   Documentation
+ *   Custom properties
+ * 
+ */
+@NoArgsConstructor(access = AccessLevel.PRIVATE)
+public class HoodieSchemaComparatorForSchemaEvolution {
+
+  private static final HoodieSchemaComparatorForSchemaEvolution VALIDATOR = new HoodieSchemaComparatorForSchemaEvolution();
+
+  public static boolean schemaEquals(HoodieSchema s1, HoodieSchema s2) {
+return VALIDATOR.schemaEqualsInternal(s1, s2);
+  }
+
+  protected boolean schemaEqualsInternal(HoodieSchema s1, HoodieSchema s2) {
+if (s1 == s2) {
+  return true;
+}
+if (s1 == null || s2 == null) {
+  return false;
+}
+if (s1.getType() != s2.getType()) {
+  return false;
+}
+
+switch (s1.getType()) {
+  case RECORD:
+return recordSchemaEquals(s1, s2);
+  case ENUM:
+return enumSchemaEquals(s1, s2);
+  case ARRAY:
+return arraySchemaEquals(s1, s2);
+  case MAP:
+return mapSchemaEquals(s1, s2);
+  case FIXED:
+return fixedSchemaEquals(s1, s2);
+  case UNION:
+return unionSchemaEquals(s1, s2);
+  case STRING:
+  case BYTES:
+  case INT:
+  case LONG:
+  case FLOAT:
+  case DOUBLE:
+  case BOOLEAN:
+  case NULL:
+  case DATE:
+// DATE is INT with date logical type (no additional properties)
+  case UUID:
+// UUID is STRING with uuid logical type (no additional properties)
+return true;
+  case DECIMAL:
+return decimalSchemaEquals(s1, s2);
+  case TIME:
+return timeSchemaEquals(s1, s2);
+  case TIMESTAMP:
+return timestampSchemaEquals(s1, s2);
+  default:
+throw new IllegalArgumentException("Unknown schema type: " + s1.getType());
+}
+  }
+
+  protected boolean validateRecord(HoodieSchema s1, HoodieSchema s2) {
+return s1.isError() == s2.isError();
+  }
+
+  private boolean recordSchemaEquals(HoodieSchema s1, HoodieSchema s2) {
+if (!validateRecord(s1, s2)) {
+  return false;
+}
+
+List 

Re: [PR] feat(schema): Phase 18 - HoodieAvroUtils removal (Part 1) [hudi]

2025-12-29 Thread via GitHub


voonhous commented on code in PR #17599:
URL: https://github.com/apache/hudi/pull/17599#discussion_r2652104144


##
hudi-client/hudi-client-common/src/test/java/org/apache/hudi/index/TestHoodieIndexUtils.java:
##
@@ -544,18 +520,17 @@ public void testIsEligibleForExpressionIndex() {
*/
   @Test
   public void testIsEligibleForExpressionIndexWithNullableFields() {
+// An int with default 0 must have the int type defined first.
+// If null is defined first, which HoodieSchema#createNullable does, an error will be thrown
+HoodieSchema nullableIntWithDefault = HoodieSchema.createUnion(HoodieSchema.create(HoodieSchemaType.INT), HoodieSchema.create(HoodieSchemaType.NULL));

Review Comment:
   Nope, it's a restriction of Avro, here's the error if we use `HoodieSchema.createNullable`:
   
   (screenshot: https://github.com/user-attachments/assets/f2d81386-a308-4154-aa3d-dbb906769927)
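
The restriction described in this comment can be illustrated with a small, self-contained model. This is a sketch under stated assumptions, not Avro's or Hudi's API: `UnionDefaultSketch`, `Branch`, and `isValidDefault` are hypothetical names, and the rule modeled (a union field's default is checked against the first branch of the union) is the behavior described in the test comment quoted above.

```java
import java.util.Arrays;
import java.util.List;

final class UnionDefaultSketch {

  // Hypothetical, reduced set of union branches for this illustration.
  enum Branch { NULL, INT }

  // Returns true only when the default value matches the FIRST branch of the union.
  static boolean isValidDefault(List<Branch> union, Object defaultValue) {
    if (union.isEmpty()) {
      return false;
    }
    switch (union.get(0)) {
      case NULL:
        return defaultValue == null;
      case INT:
        return defaultValue instanceof Integer;
      default:
        return false;
    }
  }

  public static void main(String[] args) {
    // [int, null] with default 0: accepted, because int is the first branch.
    System.out.println(isValidDefault(Arrays.asList(Branch.INT, Branch.NULL), 0));
    // [null, int] with default 0: rejected, because null is the first branch.
    System.out.println(isValidDefault(Arrays.asList(Branch.NULL, Branch.INT), 0));
  }
}
```

Under this model, a null-first union (the ordering the test comment attributes to `HoodieSchema#createNullable`) cannot carry a default of `0`, which is why the test builds the union with `INT` first.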
   






Re: [PR] feat(schema): Phase 18 - HoodieAvroUtils removal (Part 1) [hudi]

2025-12-29 Thread via GitHub


voonhous commented on code in PR #17599:
URL: https://github.com/apache/hudi/pull/17599#discussion_r2652102940


##
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/bootstrap/HoodieBootstrapSchemaProvider.java:
##
@@ -45,11 +44,11 @@ public HoodieBootstrapSchemaProvider(HoodieWriteConfig writeConfig) {
* @param partitions  List of partitions with files within them
* @return Avro Schema
*/
-  public final Schema getBootstrapSchema(HoodieEngineContext context, List>> partitions) {
+  public final HoodieSchema getBootstrapSchema(HoodieEngineContext context, List>> partitions) {
 if (writeConfig.getSchema() != null) {
   // Use schema specified by user if set
-  Schema userSchema = new Schema.Parser().parse(writeConfig.getSchema());
-  if (!HoodieAvroUtils.getNullSchema().equals(userSchema)) {
+  HoodieSchema userSchema = HoodieSchema.parse(writeConfig.getSchema());
+  if (!HoodieSchema.create(HoodieSchemaType.NULL).equals(userSchema)) {

Review Comment:
   Done






Re: [PR] feat(schema): Phase 18 - HoodieAvroUtils removal (Part 1) [hudi]

2025-12-29 Thread via GitHub


voonhous commented on code in PR #17599:
URL: https://github.com/apache/hudi/pull/17599#discussion_r2652099583


##
hudi-client/hudi-client-common/src/test/java/org/apache/hudi/index/TestHoodieIndexUtils.java:
##
@@ -544,18 +520,17 @@ public void testIsEligibleForExpressionIndex() {
*/
   @Test
   public void testIsEligibleForExpressionIndexWithNullableFields() {
+// An int with default 0 must have the int type defined first.

Review Comment:
   Yes.






Re: [PR] feat(schema): Phase 18 - HoodieAvroUtils removal (Part 1) [hudi]

2025-12-29 Thread via GitHub


yihua commented on code in PR #17599:
URL: https://github.com/apache/hudi/pull/17599#discussion_r2652034388


##
hudi-common/src/main/java/org/apache/hudi/common/schema/HoodieSchemaComparatorForSchemaEvolution.java:
##
@@ -0,0 +1,346 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *  http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hudi.common.schema;
+
+import lombok.AccessLevel;
+import lombok.NoArgsConstructor;
+
+import java.util.List;
+import java.util.Set;
+import java.util.stream.Collectors;
+
+/**
+ * Defines equality comparison rules for HoodieSchema schemas for schema evolution purposes.
+ *
+ * This class provides schema comparison logic that focuses only on attributes that affect
+ * data readers/writers, ignoring metadata like documentation, namespace, and aliases which
+ * don't impact schema evolution compatibility.
+ *
+ * Common Rules Across All Types
+ * Included in equality check:
+ * 
+ *   Name/identifier
+ *   Type including primitive type, complex type (see below), and logical type
+ * 
+ * Excluded from equality check:
+ * 
+ *   Namespace
+ *   Documentation
+ *   Aliases
+ *   Custom properties
+ * 
+ *
+ * Type-Specific Rules
+ *
+ * Record
+ * Included:
+ * 
+ *   Field names
+ *   Field types
+ *   Field order attribute
+ *   Default values
+ * 
+ * Excluded:
+ * 
+ *   Field documentation
+ *   Field aliases
+ * 
+ *
+ * Enum
+ * Included:
+ * 
+ *   Name
+ *   Symbol order
+ *   Symbol value
+ * 
+ * Excluded:
+ * 
+ *   Custom properties
+ * 
+ *
+ * Array
+ * Included:
+ * 
+ *   Items schema
+ * 
+ * Excluded:
+ * 
+ *   Documentation
+ *   Custom properties
+ * 
+ *
+ * Map
+ * Included:
+ * 
+ *   Values schema
+ * 
+ * Excluded:
+ * 
+ *   Documentation
+ *   Custom properties
+ * 
+ *
+ * Fixed
+ * Included:
+ * 
+ *   Size
+ *   Name
+ * 
+ * Excluded:
+ * 
+ *   Namespace
+ *   Aliases
+ * 
+ *
+ * Union
+ * Included:
+ * 
+ *   Member types
+ * 
+ * Excluded:
+ * 
+ *   Member order
+ * 
+ *
+ * Logical Types
+ * Included:
+ * 
+ *   Logical type name (via schema subclass)
+ *   Underlying primitive type
+ *   Decimal precision/scale (if applicable)
+ *   Timestamp/Time precision (if applicable)
+ * 
+ * Excluded:
+ * 
+ *   Documentation
+ *   Custom properties
+ * 
+ */
+@NoArgsConstructor(access = AccessLevel.PRIVATE)
+public class HoodieSchemaComparatorForSchemaEvolution {
+
+  private static final HoodieSchemaComparatorForSchemaEvolution VALIDATOR = new HoodieSchemaComparatorForSchemaEvolution();
+
+  public static boolean schemaEquals(HoodieSchema s1, HoodieSchema s2) {
+return VALIDATOR.schemaEqualsInternal(s1, s2);
+  }
+
+  protected boolean schemaEqualsInternal(HoodieSchema s1, HoodieSchema s2) {
+if (s1 == s2) {
+  return true;
+}
+if (s1 == null || s2 == null) {
+  return false;
+}
+if (s1.getType() != s2.getType()) {
+  return false;
+}
+
+switch (s1.getType()) {
+  case RECORD:
+return recordSchemaEquals(s1, s2);
+  case ENUM:
+return enumSchemaEquals(s1, s2);
+  case ARRAY:
+return arraySchemaEquals(s1, s2);
+  case MAP:
+return mapSchemaEquals(s1, s2);
+  case FIXED:
+return fixedSchemaEquals(s1, s2);
+  case UNION:
+return unionSchemaEquals(s1, s2);
+  case STRING:
+  case BYTES:
+  case INT:
+  case LONG:
+  case FLOAT:
+  case DOUBLE:
+  case BOOLEAN:
+  case NULL:
+  case DATE:
+// DATE is INT with date logical type (no additional properties)
+  case UUID:
+// UUID is STRING with uuid logical type (no additional properties)
+return true;
+  case DECIMAL:
+return decimalSchemaEquals(s1, s2);
+  case TIME:
+return timeSchemaEquals(s1, s2);
+  case TIMESTAMP:
+return timestampSchemaEquals(s1, s2);
+  default:
+throw new IllegalArgumentException("Unknown schema type: " + s1.getType());
+}
+  }
+
+  protected boolean validateRecord(HoodieSchema s1, HoodieSchema s2) {
+return s1.isError() == s2.isError();
+  }
+
+  private boolean recordSchemaEquals(HoodieSchema s1, HoodieSchema s2) {
+if (!validateRecord(s1, s2)) {
+  return false;
+}
+
+List fie

Re: [PR] feat(schema): Phase 18 - HoodieAvroUtils removal (Part 1) [hudi]

2025-12-29 Thread via GitHub


yihua commented on code in PR #17599:
URL: https://github.com/apache/hudi/pull/17599#discussion_r2652018240


##
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/index/HoodieIndexUtils.java:
##
@@ -150,50 +153,41 @@ static boolean validateDataTypeForSecondaryIndex(List sourceFields, Sche
* @param tableSchema  table schema
* @return true if each field's data types are supported, false otherwise
*/
-  public static boolean validateDataTypeForSecondaryOrExpressionIndex(List sourceFields, Schema tableSchema) {
+  public static boolean validateDataTypeForSecondaryOrExpressionIndex(List sourceFields, HoodieSchema tableSchema) {
 return sourceFields.stream().anyMatch(fieldToIndex -> {
-  Schema schema = getNestedFieldSchemaFromWriteSchema(tableSchema, fieldToIndex);
-  return schema.getType() != Schema.Type.RECORD && schema.getType() != Schema.Type.ARRAY && schema.getType() != Schema.Type.MAP;
+  Option> nestedFieldOpt = HoodieSchemaUtils.getNestedField(tableSchema, fieldToIndex);
+  if (nestedFieldOpt.isEmpty()) {
+throw new HoodieException("Failed to get schema. Not a valid field name: " + fieldToIndex);
+  }
+  HoodieSchema fieldSchema = nestedFieldOpt.get().getRight().schema();
+  return fieldSchema.getType() != HoodieSchemaType.RECORD && fieldSchema.getType() != HoodieSchemaType.ARRAY && fieldSchema.getType() != HoodieSchemaType.MAP;
 });
   }
 
   /**
* Check if the given schema type is supported for secondary index.
* Supported types are: String (including CHAR), Integer types (Int, BigInt, Long, Short), and timestamp
*/
-  private static boolean isSecondaryIndexSupportedType(Schema schema) {
+  private static boolean isSecondaryIndexSupportedType(HoodieSchema schema) {
 // Handle union types (nullable fields)
-if (schema.getType() == Schema.Type.UNION) {
+if (schema.getType() == HoodieSchemaType.UNION) {
   // For union types, check if any of the types is supported
   return schema.getTypes().stream()
-  .anyMatch(s -> s.getType() != Schema.Type.NULL && isSecondaryIndexSupportedType(s));
+  .anyMatch(s -> s.getType() != HoodieSchemaType.NULL && isSecondaryIndexSupportedType(s));
 }
 
 // Check basic types
 switch (schema.getType()) {
   case STRING:
-// STRING type can have UUID logical type which we don't support
-return schema.getLogicalType() == null; // UUID and other string-based logical types are not supported
-  // Regular STRING (includes CHAR)
   case INT:
-// INT type can represent regular integers or dates/times with logical types
-if (schema.getLogicalType() != null) {
-  // Support date and time-millis logical types
-  return schema.getLogicalType() == LogicalTypes.date()
-  || schema.getLogicalType() == LogicalTypes.timeMillis();
-}
-return true; // Regular INT
   case LONG:
-// LONG type can represent regular longs or timestamps with logical types
-if (schema.getLogicalType() != null) {
-  // Support timestamp logical types
-  return schema.getLogicalType() == LogicalTypes.timestampMillis()
-  || schema.getLogicalType() == LogicalTypes.timestampMicros()
-  || schema.getLogicalType() == LogicalTypes.timeMicros();
-}
-return true; // Regular LONG
   case DOUBLE:
-return true; // Support DOUBLE type
+  case DATE:

Review Comment:
   The test says `FLOAT` is supported, but the `FLOAT` type check seems to be missing here?
   ```
 public void testIsEligibleForSecondaryIndexWithUnsupportedDataTypes() {
   // Given: A schema with unsupported data types for secondary index (Boolean, Decimal)
   // Note: Float and Double are now supported
   ```
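
A minimal sketch of the check this comment is asking for, assuming a flat enum in place of `HoodieSchemaType`. `SecondaryIndexTypeSketch`, `FieldType`, and `isSupported` are hypothetical names. Only the types visible in the quoted diff and test comment are listed; the cases cut off in the diff are left out, and treating every listed case as supported is an assumption.

```java
final class SecondaryIndexTypeSketch {

  // Hypothetical subset of schema types, limited to those named in the quoted diff and test.
  enum FieldType { STRING, INT, LONG, FLOAT, DOUBLE, DATE, BOOLEAN, DECIMAL }

  static boolean isSupported(FieldType type) {
    switch (type) {
      case STRING:
      case INT:
      case LONG:
      case FLOAT: // the case the review comment reports as missing
      case DOUBLE:
      case DATE:
        return true;
      default:
        // BOOLEAN and DECIMAL are the unsupported examples from the test comment.
        return false;
    }
  }

  public static void main(String[] args) {
    for (FieldType type : FieldType.values()) {
      System.out.println(type + " -> " + isSupported(type));
    }
  }
}
```

With fall-through case labels, a type left off the list silently lands in `default`, which is how an omission like the one flagged here can go unnoticed until a test covers it.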






Re: [PR] feat(schema): Phase 18 - HoodieAvroUtils removal (Part 1) [hudi]

2025-12-29 Thread via GitHub


yihua commented on code in PR #17599:
URL: https://github.com/apache/hudi/pull/17599#discussion_r2651937885


##
hudi-common/src/main/java/org/apache/hudi/common/schema/HoodieSchemaComparatorForSchemaEvolution.java:
##
@@ -0,0 +1,346 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *  http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hudi.common.schema;
+
+import lombok.AccessLevel;
+import lombok.NoArgsConstructor;
+
+import java.util.List;
+import java.util.Set;
+import java.util.stream.Collectors;
+
+/**
+ * Defines equality comparison rules for HoodieSchema schemas for schema evolution purposes.
+ *
+ * This class provides schema comparison logic that focuses only on attributes that affect
+ * data readers/writers, ignoring metadata like documentation, namespace, and aliases which
+ * don't impact schema evolution compatibility.
+ *
+ * Common Rules Across All Types
+ * Included in equality check:
+ * 
+ *   Name/identifier
+ *   Type including primitive type, complex type (see below), and logical type
+ * 
+ * Excluded from equality check:
+ * 
+ *   Namespace
+ *   Documentation
+ *   Aliases
+ *   Custom properties
+ * 
+ *
+ * Type-Specific Rules
+ *
+ * Record
+ * Included:
+ * 
+ *   Field names
+ *   Field types
+ *   Field order attribute
+ *   Default values
+ * 
+ * Excluded:
+ * 
+ *   Field documentation
+ *   Field aliases
+ * 
+ *
+ * Enum
+ * Included:
+ * 
+ *   Name
+ *   Symbol order
+ *   Symbol value
+ * 
+ * Excluded:
+ * 
+ *   Custom properties
+ * 
+ *
+ * Array
+ * Included:
+ * 
+ *   Items schema
+ * 
+ * Excluded:
+ * 
+ *   Documentation
+ *   Custom properties
+ * 
+ *
+ * Map
+ * Included:
+ * 
+ *   Values schema
+ * 
+ * Excluded:
+ * 
+ *   Documentation
+ *   Custom properties
+ * 
+ *
+ * Fixed
+ * Included:
+ * 
+ *   Size
+ *   Name
+ * 
+ * Excluded:
+ * 
+ *   Namespace
+ *   Aliases
+ * 
+ *
+ * Union
+ * Included:
+ * 
+ *   Member types
+ * 
+ * Excluded:
+ * 
+ *   Member order
+ * 
+ *
+ * Logical Types
+ * Included:
+ * 
+ *   Logical type name (via schema subclass)
+ *   Underlying primitive type
+ *   Decimal precision/scale (if applicable)
+ *   Timestamp/Time precision (if applicable)
+ * 
+ * Excluded:
+ * 
+ *   Documentation
+ *   Custom properties
+ * 
+ */
+@NoArgsConstructor(access = AccessLevel.PRIVATE)
+public class HoodieSchemaComparatorForSchemaEvolution {
+
+  private static final HoodieSchemaComparatorForSchemaEvolution VALIDATOR = new HoodieSchemaComparatorForSchemaEvolution();
+
+  public static boolean schemaEquals(HoodieSchema s1, HoodieSchema s2) {
+return VALIDATOR.schemaEqualsInternal(s1, s2);
+  }
+
+  protected boolean schemaEqualsInternal(HoodieSchema s1, HoodieSchema s2) {
+if (s1 == s2) {
+  return true;
+}
+if (s1 == null || s2 == null) {
+  return false;
+}
+if (s1.getType() != s2.getType()) {
+  return false;
+}
+
+switch (s1.getType()) {
+  case RECORD:
+return recordSchemaEquals(s1, s2);
+  case ENUM:
+return enumSchemaEquals(s1, s2);
+  case ARRAY:
+return arraySchemaEquals(s1, s2);
+  case MAP:
+return mapSchemaEquals(s1, s2);
+  case FIXED:
+return fixedSchemaEquals(s1, s2);
+  case UNION:
+return unionSchemaEquals(s1, s2);
+  case STRING:
+  case BYTES:
+  case INT:
+  case LONG:
+  case FLOAT:
+  case DOUBLE:
+  case BOOLEAN:
+  case NULL:
+  case DATE:
+// DATE is INT with date logical type (no additional properties)
+  case UUID:
+// UUID is STRING with uuid logical type (no additional properties)
+return true;
+  case DECIMAL:
+return decimalSchemaEquals(s1, s2);
+  case TIME:
+return timeSchemaEquals(s1, s2);
+  case TIMESTAMP:
+return timestampSchemaEquals(s1, s2);
+  default:
+throw new IllegalArgumentException("Unknown schema type: " + s1.getType());
+}
+  }
+
+  protected boolean validateRecord(HoodieSchema s1, HoodieSchema s2) {
+return s1.isError() == s2.isError();
+  }
+
+  private boolean recordSchemaEquals(HoodieSchema s1, HoodieSchema s2) {
+if (!validateRecord(s1, s2)) {
+  return false;
+}
+
+List fie

Re: [PR] feat(schema): Phase 18 - HoodieAvroUtils removal (Part 1) [hudi]

2025-12-29 Thread via GitHub


yihua commented on code in PR #17599:
URL: https://github.com/apache/hudi/pull/17599#discussion_r2651904639


##
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/bootstrap/HoodieBootstrapSchemaProvider.java:
##
@@ -45,11 +44,11 @@ public HoodieBootstrapSchemaProvider(HoodieWriteConfig writeConfig) {
* @param partitions  List of partitions with files within them
* @return Avro Schema
*/
-  public final Schema getBootstrapSchema(HoodieEngineContext context, List>> partitions) {
+  public final HoodieSchema getBootstrapSchema(HoodieEngineContext context, List>> partitions) {
 if (writeConfig.getSchema() != null) {
   // Use schema specified by user if set
-  Schema userSchema = new Schema.Parser().parse(writeConfig.getSchema());
-  if (!HoodieAvroUtils.getNullSchema().equals(userSchema)) {
+  HoodieSchema userSchema = HoodieSchema.parse(writeConfig.getSchema());
+  if (!HoodieSchema.create(HoodieSchemaType.NULL).equals(userSchema)) {

Review Comment:
   nit: reuse `HoodieSchema.NULL_SCHEMA`



##
hudi-client/hudi-client-common/src/test/java/org/apache/hudi/index/TestHoodieIndexUtils.java:
##
@@ -544,18 +520,17 @@ public void testIsEligibleForExpressionIndex() {
*/
   @Test
   public void testIsEligibleForExpressionIndexWithNullableFields() {
+// An int with default 0 must have the int type defined first.
+// If null is defined first, which HoodieSchema#createNullable does, an error will be thrown
+HoodieSchema nullableIntWithDefault = HoodieSchema.createUnion(HoodieSchema.create(HoodieSchemaType.INT), HoodieSchema.create(HoodieSchemaType.NULL));

Review Comment:
   Does `HoodieSchema.createNullable` work in this case, instead of calling `HoodieSchema.createUnion`?



##
hudi-client/hudi-client-common/src/test/java/org/apache/hudi/index/TestHoodieIndexUtils.java:
##
@@ -227,32 +218,31 @@ public void testValidateDataTypeForSecondaryIndex() {
   @Test
   public void testValidateDataTypeForSecondaryIndexWithLogicalTypes() {
 // Supported logical types
-Schema timestampMillis = LogicalTypes.timestampMillis().addToSchema(Schema.create(Schema.Type.LONG));
-Schema timestampMicros = LogicalTypes.timestampMicros().addToSchema(Schema.create(Schema.Type.LONG));
-Schema date = LogicalTypes.date().addToSchema(Schema.create(Schema.Type.INT));
-Schema timeMillis = LogicalTypes.timeMillis().addToSchema(Schema.create(Schema.Type.INT));
-Schema timeMicros = LogicalTypes.timeMicros().addToSchema(Schema.create(Schema.Type.LONG));
-
+HoodieSchema timestampMillis = HoodieSchema.createTimestampMillis();
+HoodieSchema timestampMicros = HoodieSchema.createTimestampMicros();
+HoodieSchema date = HoodieSchema.createDate();
+HoodieSchema timeMillis = HoodieSchema.createTimeMillis();
+HoodieSchema timeMicros = HoodieSchema.createTimeMicros();
+
 // Unsupported logical types
-Schema decimal = LogicalTypes.decimal(10, 2).addToSchema(Schema.create(Schema.Type.BYTES));
-Schema uuid = LogicalTypes.uuid().addToSchema(Schema.create(Schema.Type.STRING));
-Schema localTimestampMillis = LogicalTypes.localTimestampMillis().addToSchema(Schema.create(Schema.Type.LONG));
-Schema localTimestampMicros = LogicalTypes.localTimestampMicros().addToSchema(Schema.create(Schema.Type.LONG));
-
-Schema schemaWithLogicalTypes = SchemaBuilder.record("TestRecord")
-.fields()
+HoodieSchema decimal = HoodieSchema.createDecimal(10, 2);
+HoodieSchema uuid = HoodieSchema.createUUID();
+HoodieSchema localTimestampMillis = HoodieSchema.createLocalTimestampMillis();
+HoodieSchema localTimestampMicros = HoodieSchema.createLocalTimestampMicros();
+
+HoodieSchema schemaWithLogicalTypes = HoodieSchema.createRecord("TestRecord", null, null, Arrays.asList(
 // Supported logical types
-.name("timestampMillisField").type(timestampMillis).noDefault()
-.name("timestampMicrosField").type(timestampMicros).noDefault()
-.name("dateField").type(date).noDefault()
-.name("timeMillisField").type(timeMillis).noDefault()
-.name("timeMicrosField").type(timeMicros).noDefault()
+HoodieSchemaField.of("timestampMillisField", timestampMillis),
+HoodieSchemaField.of("timestampMicrosField", timestampMicros),
+HoodieSchemaField.of("dateField", date),
+HoodieSchemaField.of("timeMillisField", timeMillis),
+HoodieSchemaField.of("timeMicrosField", timeMicros),
 // Unsupported logical types
-.name("decimalField").type(decimal).noDefault()
-.name("uuidField").type(uuid).noDefault()
-.name("localTimestampMillisField").type(localTimestampMillis).noDefault()
-.name("localTimestampMicrosField").type(localTimestampMicros).noDefault()
-.endRecord();
+HoodieSchemaField.of("decimalField", decimal),

Review Comment:
   Should we test 

Re: [PR] feat(schema): Phase 18 - HoodieAvroUtils removal (Part 1) [hudi]

2025-12-29 Thread via GitHub


hudi-bot commented on PR #17599:
URL: https://github.com/apache/hudi/pull/17599#issuecomment-3697470428

   
   ## CI report:
   
   * 9df0e72b27b6c2aad3ca2976cfbf703dd0ddb7ea UNKNOWN
   * 7cac8212f647bce559956f4dd2924b5456520f07 UNKNOWN
   * 37cd2425506e26ab045e2882dc9b88978c09bec2 Azure: [FAILURE](https://dev.azure.com/apachehudi/a1a51da7-8592-47d4-88dc-fd67bed336bb/_build/results?buildId=10639)
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   





Re: [PR] feat(schema): Phase 18 - HoodieAvroUtils removal (Part 1) [hudi]

2025-12-29 Thread via GitHub


hudi-bot commented on PR #17599:
URL: https://github.com/apache/hudi/pull/17599#issuecomment-3697328974

   
   ## CI report:
   
   * 9df0e72b27b6c2aad3ca2976cfbf703dd0ddb7ea UNKNOWN
   * 7cac8212f647bce559956f4dd2924b5456520f07 UNKNOWN
   * 1235700739b332034cea98850fda187c4e599c41 Azure: [SUCCESS](https://dev.azure.com/apachehudi/a1a51da7-8592-47d4-88dc-fd67bed336bb/_build/results?buildId=10627)
   * 37cd2425506e26ab045e2882dc9b88978c09bec2 Azure: [PENDING](https://dev.azure.com/apachehudi/a1a51da7-8592-47d4-88dc-fd67bed336bb/_build/results?buildId=10639)
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   





Re: [PR] feat(schema): Phase 18 - HoodieAvroUtils removal (Part 1) [hudi]

2025-12-29 Thread via GitHub


hudi-bot commented on PR #17599:
URL: https://github.com/apache/hudi/pull/17599#issuecomment-3697322864

   
   ## CI report:
   
   * 9df0e72b27b6c2aad3ca2976cfbf703dd0ddb7ea UNKNOWN
   * 7cac8212f647bce559956f4dd2924b5456520f07 UNKNOWN
   * 1235700739b332034cea98850fda187c4e599c41 Azure: [SUCCESS](https://dev.azure.com/apachehudi/a1a51da7-8592-47d4-88dc-fd67bed336bb/_build/results?buildId=10627)
   * 37cd2425506e26ab045e2882dc9b88978c09bec2 UNKNOWN
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   





Re: [PR] feat(schema): Phase 18 - HoodieAvroUtils removal (Part 1) [hudi]

2025-12-29 Thread via GitHub


voonhous commented on code in PR #17599:
URL: https://github.com/apache/hudi/pull/17599#discussion_r2651572665


##
hudi-common/src/main/java/org/apache/hudi/common/schema/HoodieSchemaComparatorForSchemaEvolution.java:
##
@@ -0,0 +1,371 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *  http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hudi.common.schema;
+
+import java.util.List;
+import java.util.Set;
+import java.util.stream.Collectors;
+
+/**
+ * Defines equality comparison rules for HoodieSchema schemas for schema evolution purposes.
+ *
+ * This class provides schema comparison logic that focuses only on attributes that affect
+ * data readers/writers, ignoring metadata like documentation, namespace, and aliases which
+ * don't impact schema evolution compatibility.
+ *
+ * Common Rules Across All Types
+ * Included in equality check:
+ * 
+ *   Name/identifier
+ *   Type including primitive type, complex type (see below), and logical 
type
+ * 
+ * Excluded from equality check:
+ * 
+ *   Namespace
+ *   Documentation
+ *   Aliases
+ *   Custom properties
+ * 
+ *
+ * Type-Specific Rules
+ *
+ * Record
+ * Included:
+ * 
+ *   Field names
+ *   Field types
+ *   Field order attribute
+ *   Default values
+ * 
+ * Excluded:
+ * 
+ *   Field documentation
+ *   Field aliases
+ * 
+ *
+ * Enum
+ * Included:
+ * 
+ *   Name
+ *   Symbol order
+ *   Symbol value
+ * 
+ * Excluded:
+ * 
+ *   Custom properties
+ * 
+ *
+ * Array
+ * Included:
+ * 
+ *   Items schema
+ * 
+ * Excluded:
+ * 
+ *   Documentation
+ *   Custom properties
+ * 
+ *
+ * Map
+ * Included:
+ * 
+ *   Values schema
+ * 
+ * Excluded:
+ * 
+ *   Documentation
+ *   Custom properties
+ * 
+ *
+ * Fixed
+ * Included:
+ * 
+ *   Size
+ *   Name
+ * 
+ * Excluded:
+ * 
+ *   Namespace
+ *   Aliases
+ * 
+ *
+ * Union
+ * Included:
+ * 
+ *   Member types
+ * 
+ * Excluded:
+ * 
+ *   Member order
+ * 
+ *
+ * Logical Types
+ * Included:
+ * 
+ *   Logical type name (via schema subclass)
+ *   Underlying primitive type
+ *   Decimal precision/scale (if applicable)
+ *   Timestamp/Time precision (if applicable)
+ * 
+ * Excluded:
+ * 
+ *   Documentation
+ *   Custom properties
+ * 
+ */
+public class HoodieSchemaComparatorForSchemaEvolution {
+
+  protected HoodieSchemaComparatorForSchemaEvolution() {
+  }
+
+  private static final HoodieSchemaComparatorForSchemaEvolution VALIDATOR = new HoodieSchemaComparatorForSchemaEvolution();
+
+  public static boolean schemaEquals(HoodieSchema s1, HoodieSchema s2) {
+    return VALIDATOR.schemaEqualsInternal(s1, s2);
+  }
+
+  protected boolean schemaEqualsInternal(HoodieSchema s1, HoodieSchema s2) {
+    if (s1 == s2) {
+      return true;
+    }
+    if (s1 == null || s2 == null) {
+      return false;
+    }
+    if (s1.getType() != s2.getType()) {
+      return false;
+    }
+
+    switch (s1.getType()) {
+      case RECORD:
+        return recordSchemaEquals(s1, s2);
+      case ENUM:
+        return enumSchemaEquals(s1, s2);
+      case ARRAY:
+        return arraySchemaEquals(s1, s2);
+      case MAP:
+        return mapSchemaEquals(s1, s2);
+      case FIXED:
+        return fixedSchemaEquals(s1, s2);
+      case UNION:
+        return unionSchemaEquals(s1, s2);
+      case STRING:
+      case BYTES:
+      case INT:
+      case LONG:
+      case FLOAT:
+      case DOUBLE:
+      case BOOLEAN:
+      case NULL:
+        return primitiveSchemaEquals(s1, s2);
+      case DECIMAL:
+        return decimalSchemaEquals(s1, s2);
+      case TIME:
+        return timeSchemaEquals(s1, s2);
+      case TIMESTAMP:
+        return timestampSchemaEquals(s1, s2);
+      case DATE:
+      case UUID:
+        return logicalTypeSchemaEquals(s1, s2);

Review Comment:
   Done.
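
The Javadoc quoted above lists union member types as included in the equality check and member order as excluded. A minimal sketch of what an order-insensitive comparison can look like follows. It is not the PR's `unionSchemaEquals`, whose body is not shown in this thread; `UnionOrderSketch` and `sameMembers` are hypothetical names, and plain strings stand in for member schemas.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch: not Hudi code. Strings stand in for union member schemas.
final class UnionOrderSketch {

  // Returns true when both unions contain the same members, regardless of order.
  // Assumes neither list contains duplicate members.
  static boolean sameMembers(List<String> union1, List<String> union2) {
    if (union1.size() != union2.size()) {
      return false;
    }
    Set<String> members1 = new HashSet<>(union1);
    Set<String> members2 = new HashSet<>(union2);
    return members1.equals(members2);
  }
}
```

Under these assumptions, `["null", "string"]` and `["string", "null"]` compare as equal, while `["null", "string"]` and `["null", "int"]` do not.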






Re: [PR] feat(schema): Phase 18 - HoodieAvroUtils removal (Part 1) [hudi]

2025-12-29 Thread via GitHub


voonhous commented on code in PR #17599:
URL: https://github.com/apache/hudi/pull/17599#discussion_r2651570751


##
hudi-common/src/main/java/org/apache/hudi/common/schema/HoodieSchemaComparatorForSchemaEvolution.java:
##
@@ -0,0 +1,371 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *  http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hudi.common.schema;
+
+import java.util.List;
+import java.util.Set;
+import java.util.stream.Collectors;
+
+/**
+ * Defines equality comparison rules for HoodieSchema schemas for schema evolution purposes.
+ *
+ * This class provides schema comparison logic that focuses only on attributes that affect
+ * data readers/writers, ignoring metadata like documentation, namespace, and aliases which
+ * don't impact schema evolution compatibility.
+ *
+ * Common Rules Across All Types
+ * Included in equality check:
+ * 
+ *   Name/identifier
+ *   Type including primitive type, complex type (see below), and logical type
+ * 
+ * Excluded from equality check:
+ * 
+ *   Namespace
+ *   Documentation
+ *   Aliases
+ *   Custom properties
+ * 
+ *
+ * Type-Specific Rules
+ *
+ * Record
+ * Included:
+ * 
+ *   Field names
+ *   Field types
+ *   Field order attribute
+ *   Default values
+ * 
+ * Excluded:
+ * 
+ *   Field documentation
+ *   Field aliases
+ * 
+ *
+ * Enum
+ * Included:
+ * 
+ *   Name
+ *   Symbol order
+ *   Symbol value
+ * 
+ * Excluded:
+ * 
+ *   Custom properties
+ * 
+ *
+ * Array
+ * Included:
+ * 
+ *   Items schema
+ * 
+ * Excluded:
+ * 
+ *   Documentation
+ *   Custom properties
+ * 
+ *
+ * Map
+ * Included:
+ * 
+ *   Values schema
+ * 
+ * Excluded:
+ * 
+ *   Documentation
+ *   Custom properties
+ * 
+ *
+ * Fixed
+ * Included:
+ * 
+ *   Size
+ *   Name
+ * 
+ * Excluded:
+ * 
+ *   Namespace
+ *   Aliases
+ * 
+ *
+ * Union
+ * Included:
+ * 
+ *   Member types
+ * 
+ * Excluded:
+ * 
+ *   Member order
+ * 
+ *
+ * Logical Types
+ * Included:
+ * 
+ *   Logical type name (via schema subclass)
+ *   Underlying primitive type
+ *   Decimal precision/scale (if applicable)
+ *   Timestamp/Time precision (if applicable)
+ * 
+ * Excluded:
+ * 
+ *   Documentation
+ *   Custom properties
+ * 
+ */
+public class HoodieSchemaComparatorForSchemaEvolution {
+
+  protected HoodieSchemaComparatorForSchemaEvolution() {

Review Comment:
   Done






Re: [PR] feat(schema): Phase 18 - HoodieAvroUtils removal (Part 1) [hudi]

2025-12-29 Thread via GitHub


voonhous commented on code in PR #17599:
URL: https://github.com/apache/hudi/pull/17599#discussion_r2651569979


##
hudi-common/src/test/java/org/apache/hudi/common/schema/TestHoodieSchemaComparatorForSchemaEvolution.java:
##
@@ -0,0 +1,505 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *  http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hudi.common.schema;
+
+import org.apache.hudi.io.util.FileIOUtils;
+
+import org.junit.jupiter.api.Test;
+
+import java.io.IOException;
+import java.util.Arrays;
+import java.util.Collections;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+
+import static org.junit.jupiter.api.Assertions.assertEquals;
+import static org.junit.jupiter.api.Assertions.assertFalse;
+import static org.junit.jupiter.api.Assertions.assertNotEquals;
+import static org.junit.jupiter.api.Assertions.assertNotNull;
+import static org.junit.jupiter.api.Assertions.assertTrue;
+
+class TestHoodieSchemaComparatorForSchemaEvolution {
+  @Test
+  void testAttrsIrrelevantToEquality() throws IOException {
+    // Validates that schemas with different non-essential attributes (like doc strings or aliases)
+    // are still considered equivalent for schema evolution purposes
+    String schemaA = FileIOUtils.readAsUTFString(TestHoodieSchemaComparatorForSchemaEvolution.class.getResourceAsStream("/avro-schema-evo/schema-allshapes-A.txt"));
+    String schemaB = FileIOUtils.readAsUTFString(TestHoodieSchemaComparatorForSchemaEvolution.class.getResourceAsStream("/avro-schema-evo/schema-allshapes-B.txt"));
+
+    HoodieSchema schema1 = HoodieSchema.parse(schemaA);
+    HoodieSchema schema2 = HoodieSchema.parse(schemaB);
+    assertNotEquals(schema1, schema2);
+    assertTrue(HoodieSchemaComparatorForSchemaEvolution.schemaEquals(schema1, schema2));
+    assertEquals(new HoodieSchemaComparatorForSchemaEvolution.SchemaWrapper(schema1),
+        new HoodieSchemaComparatorForSchemaEvolution.SchemaWrapper(schema2));
+  }
+
+  @Test
+  void testComparingPrimitiveTypes() {
+    // Tests comparison of all primitive types against each other
+    // Validates that each primitive type is equal only to other schemas sharing the same
+    // primitive type.
+    HoodieSchemaType[] primitiveTypes = {
+        HoodieSchemaType.NULL, HoodieSchemaType.BOOLEAN, HoodieSchemaType.INT,
+        HoodieSchemaType.LONG, HoodieSchemaType.FLOAT, HoodieSchemaType.DOUBLE,
+        HoodieSchemaType.BYTES, HoodieSchemaType.STRING
+    };
+
+    for (HoodieSchemaType primitiveType : primitiveTypes) {
+      for (HoodieSchemaType type : primitiveTypes) {
+        if (primitiveType == type) {
+          assertTrue(HoodieSchemaComparatorForSchemaEvolution.schemaEquals(
+              HoodieSchema.create(primitiveType),
+              HoodieSchema.create(type)
+          ));
+        } else {
+          assertFalse(HoodieSchemaComparatorForSchemaEvolution.schemaEquals(
+              HoodieSchema.create(primitiveType),
+              HoodieSchema.create(type)
+          ), String.format("Types %s and %s should not be equal",
+              primitiveType, type));
+        }
+      }
+    }
+  }
+
+  @Test
+  void testEqualToSelf() {
+    // Validates that a schema is equal to itself
+    // Basic sanity check for schema comparison
+    String schema = "{\"type\":\"record\",\"name\":\"R\",\"fields\":["
+        + "{\"name\":\"field1\",\"type\":\"string\"}]}";
+    assertTrue(HoodieSchemaComparatorForSchemaEvolution.schemaEquals(
+        HoodieSchema.parse(schema),
+        HoodieSchema.parse(schema)
+    ));
+  }
+
+  @Test
+  void testIsErrorFieldInRecordSchema() {
+    // Tests that a record schema is not equal to an error schema
+    // even if they have the same structure
+    HoodieSchema record1 = HoodieSchema.createRecord("TestRecord", null, null, false,
+        Arrays.asList(
+            HoodieSchemaField.of("field1", HoodieSchema.create(HoodieSchemaType.STRING), null, null)
+        ));
+
+    HoodieSchema record2 = HoodieSchema.createRecord("TestRecord", null, null, true, // error record
+        Arrays.asList(
+            HoodieSchemaField.of("field1", HoodieSchema.create(HoodieSchemaType.STRING), null, null)
+        ));
+
+    assertFalse(HoodieSc

Re: [PR] feat(schema): Phase 18 - HoodieAvroUtils removal (Part 1) [hudi]

2025-12-29 Thread via GitHub


the-other-tim-brown commented on code in PR #17599:
URL: https://github.com/apache/hudi/pull/17599#discussion_r2651134779


##
hudi-common/src/test/java/org/apache/hudi/common/schema/TestHoodieSchemaComparatorForSchemaEvolution.java:
##
@@ -0,0 +1,505 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *  http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hudi.common.schema;
+
+import org.apache.hudi.io.util.FileIOUtils;
+
+import org.junit.jupiter.api.Test;
+
+import java.io.IOException;
+import java.util.Arrays;
+import java.util.Collections;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+
+import static org.junit.jupiter.api.Assertions.assertEquals;
+import static org.junit.jupiter.api.Assertions.assertFalse;
+import static org.junit.jupiter.api.Assertions.assertNotEquals;
+import static org.junit.jupiter.api.Assertions.assertNotNull;
+import static org.junit.jupiter.api.Assertions.assertTrue;
+
+class TestHoodieSchemaComparatorForSchemaEvolution {
+  @Test
+  void testAttrsIrrelevantToEquality() throws IOException {
+    // Validates that schemas with different non-essential attributes (like doc strings or aliases)
+    // are still considered equivalent for schema evolution purposes
+    String schemaA = FileIOUtils.readAsUTFString(TestHoodieSchemaComparatorForSchemaEvolution.class.getResourceAsStream("/avro-schema-evo/schema-allshapes-A.txt"));
+    String schemaB = FileIOUtils.readAsUTFString(TestHoodieSchemaComparatorForSchemaEvolution.class.getResourceAsStream("/avro-schema-evo/schema-allshapes-B.txt"));
+
+    HoodieSchema schema1 = HoodieSchema.parse(schemaA);
+    HoodieSchema schema2 = HoodieSchema.parse(schemaB);
+    assertNotEquals(schema1, schema2);
+    assertTrue(HoodieSchemaComparatorForSchemaEvolution.schemaEquals(schema1, schema2));
+    assertEquals(new HoodieSchemaComparatorForSchemaEvolution.SchemaWrapper(schema1),
+        new HoodieSchemaComparatorForSchemaEvolution.SchemaWrapper(schema2));
+  }
+
+  @Test
+  void testComparingPrimitiveTypes() {
+    // Tests comparison of all primitive types against each other
+    // Validates that each primitive type is equal only to other schemas sharing the same
+    // primitive type.
+    HoodieSchemaType[] primitiveTypes = {
+        HoodieSchemaType.NULL, HoodieSchemaType.BOOLEAN, HoodieSchemaType.INT,
+        HoodieSchemaType.LONG, HoodieSchemaType.FLOAT, HoodieSchemaType.DOUBLE,
+        HoodieSchemaType.BYTES, HoodieSchemaType.STRING
+    };
+
+    for (HoodieSchemaType primitiveType : primitiveTypes) {
+      for (HoodieSchemaType type : primitiveTypes) {
+        if (primitiveType == type) {
+          assertTrue(HoodieSchemaComparatorForSchemaEvolution.schemaEquals(
+              HoodieSchema.create(primitiveType),
+              HoodieSchema.create(type)
+          ));
+        } else {
+          assertFalse(HoodieSchemaComparatorForSchemaEvolution.schemaEquals(
+              HoodieSchema.create(primitiveType),
+              HoodieSchema.create(type)
+          ), String.format("Types %s and %s should not be equal",
+              primitiveType, type));
+        }
+      }
+    }
+  }
+
+  @Test
+  void testEqualToSelf() {
+    // Validates that a schema is equal to itself
+    // Basic sanity check for schema comparison
+    String schema = "{\"type\":\"record\",\"name\":\"R\",\"fields\":["
+        + "{\"name\":\"field1\",\"type\":\"string\"}]}";
+    assertTrue(HoodieSchemaComparatorForSchemaEvolution.schemaEquals(
+        HoodieSchema.parse(schema),
+        HoodieSchema.parse(schema)
+    ));
+  }
+
+  @Test
+  void testIsErrorFieldInRecordSchema() {
+    // Tests that a record schema is not equal to an error schema
+    // even if they have the same structure
+    HoodieSchema record1 = HoodieSchema.createRecord("TestRecord", null, null, false,
+        Arrays.asList(
+            HoodieSchemaField.of("field1", HoodieSchema.create(HoodieSchemaType.STRING), null, null)
+        ));
+
+    HoodieSchema record2 = HoodieSchema.createRecord("TestRecord", null, null, true, // error record
+        Arrays.asList(
+            HoodieSchemaField.of("field1", HoodieSchema.create(HoodieSchemaType.STRING), null, null)
+        ));
+
+    assertFal

Re: [PR] feat(schema): Phase 18 - HoodieAvroUtils removal (Part 1) [hudi]

2025-12-29 Thread via GitHub


the-other-tim-brown commented on code in PR #17599:
URL: https://github.com/apache/hudi/pull/17599#discussion_r2651115299


##
hudi-common/src/main/java/org/apache/hudi/common/schema/HoodieSchemaComparatorForSchemaEvolution.java:
##
@@ -0,0 +1,371 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *  http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hudi.common.schema;
+
+import java.util.List;
+import java.util.Set;
+import java.util.stream.Collectors;
+
+/**
+ * Defines equality comparison rules for HoodieSchema schemas for schema evolution purposes.
+ *
+ * This class provides schema comparison logic that focuses only on attributes that affect
+ * data readers/writers, ignoring metadata like documentation, namespace, and aliases which
+ * don't impact schema evolution compatibility.
+ *
+ * Common Rules Across All Types
+ * Included in equality check:
+ * 
+ *   Name/identifier
+ *   Type including primitive type, complex type (see below), and logical type
+ * 
+ * Excluded from equality check:
+ * 
+ *   Namespace
+ *   Documentation
+ *   Aliases
+ *   Custom properties
+ * 
+ *
+ * Type-Specific Rules
+ *
+ * Record
+ * Included:
+ * 
+ *   Field names
+ *   Field types
+ *   Field order attribute
+ *   Default values
+ * 
+ * Excluded:
+ * 
+ *   Field documentation
+ *   Field aliases
+ * 
+ *
+ * Enum
+ * Included:
+ * 
+ *   Name
+ *   Symbol order
+ *   Symbol value
+ * 
+ * Excluded:
+ * 
+ *   Custom properties
+ * 
+ *
+ * Array
+ * Included:
+ * 
+ *   Items schema
+ * 
+ * Excluded:
+ * 
+ *   Documentation
+ *   Custom properties
+ * 
+ *
+ * Map
+ * Included:
+ * 
+ *   Values schema
+ * 
+ * Excluded:
+ * 
+ *   Documentation
+ *   Custom properties
+ * 
+ *
+ * Fixed
+ * Included:
+ * 
+ *   Size
+ *   Name
+ * 
+ * Excluded:
+ * 
+ *   Namespace
+ *   Aliases
+ * 
+ *
+ * Union
+ * Included:
+ * 
+ *   Member types
+ * 
+ * Excluded:
+ * 
+ *   Member order
+ * 
+ *
+ * Logical Types
+ * Included:
+ * 
+ *   Logical type name (via schema subclass)
+ *   Underlying primitive type
+ *   Decimal precision/scale (if applicable)
+ *   Timestamp/Time precision (if applicable)
+ * 
+ * Excluded:
+ * 
+ *   Documentation
+ *   Custom properties
+ * 
+ */
+public class HoodieSchemaComparatorForSchemaEvolution {
+
+  protected HoodieSchemaComparatorForSchemaEvolution() {

Review Comment:
   Let's make this private?
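
For context, a minimal sketch of the pattern this suggestion points at: a private constructor combined with a shared static instance and a static entry point. `ComparatorSketch` and `namesMatch` are hypothetical names for illustration only, not the PR's code, and the string comparison is a stand-in for the real schema logic.

```java
// Hypothetical sketch: class and method names are illustrative, not the PR's code.
class ComparatorSketch {

  // Single shared instance, created once when the class is initialized.
  private static final ComparatorSketch INSTANCE = new ComparatorSketch();

  // Private: code outside this class cannot call `new ComparatorSketch()`.
  private ComparatorSketch() {
  }

  // Static entry point that delegates to the shared instance.
  static boolean namesMatch(String a, String b) {
    return INSTANCE.namesMatchInternal(a, b);
  }

  // Same null handling as the quoted schemaEqualsInternal: identical references
  // are equal, and a single null is never equal to a non-null value.
  private boolean namesMatchInternal(String a, String b) {
    if (a == b) {
      return true;
    }
    if (a == null || b == null) {
      return false;
    }
    return a.equalsIgnoreCase(b);
  }
}
```

One trade-off worth noting: a private constructor also prevents subclasses declared in other files, so it only fits if the `protected` methods in the quoted class are not meant to be overridden elsewhere.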






Re: [PR] feat(schema): Phase 18 - HoodieAvroUtils removal (Part 1) [hudi]

2025-12-29 Thread via GitHub


the-other-tim-brown commented on code in PR #17599:
URL: https://github.com/apache/hudi/pull/17599#discussion_r2651112269


##
hudi-common/src/main/java/org/apache/hudi/common/schema/HoodieSchemaComparatorForSchemaEvolution.java:
##
@@ -0,0 +1,371 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *  http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hudi.common.schema;
+
+import java.util.List;
+import java.util.Set;
+import java.util.stream.Collectors;
+
+/**
+ * Defines equality comparison rules for HoodieSchema schemas for schema evolution purposes.
+ *
+ * This class provides schema comparison logic that focuses only on attributes that affect
+ * data readers/writers, ignoring metadata like documentation, namespace, and aliases which
+ * don't impact schema evolution compatibility.
+ *
+ * Common Rules Across All Types
+ * Included in equality check:
+ * 
+ *   Name/identifier
+ *   Type including primitive type, complex type (see below), and logical type
+ * 
+ * Excluded from equality check:
+ * 
+ *   Namespace
+ *   Documentation
+ *   Aliases
+ *   Custom properties
+ * 
+ *
+ * Type-Specific Rules
+ *
+ * Record
+ * Included:
+ * 
+ *   Field names
+ *   Field types
+ *   Field order attribute
+ *   Default values
+ * 
+ * Excluded:
+ * 
+ *   Field documentation
+ *   Field aliases
+ * 
+ *
+ * Enum
+ * Included:
+ * 
+ *   Name
+ *   Symbol order
+ *   Symbol value
+ * 
+ * Excluded:
+ * 
+ *   Custom properties
+ * 
+ *
+ * Array
+ * Included:
+ * 
+ *   Items schema
+ * 
+ * Excluded:
+ * 
+ *   Documentation
+ *   Custom properties
+ * 
+ *
+ * Map
+ * Included:
+ * 
+ *   Values schema
+ * 
+ * Excluded:
+ * 
+ *   Documentation
+ *   Custom properties
+ * 
+ *
+ * Fixed
+ * Included:
+ * 
+ *   Size
+ *   Name
+ * 
+ * Excluded:
+ * 
+ *   Namespace
+ *   Aliases
+ * 
+ *
+ * Union
+ * Included:
+ * 
+ *   Member types
+ * 
+ * Excluded:
+ * 
+ *   Member order
+ * 
+ *
+ * Logical Types
+ * Included:
+ * 
+ *   Logical type name (via schema subclass)
+ *   Underlying primitive type
+ *   Decimal precision/scale (if applicable)
+ *   Timestamp/Time precision (if applicable)
+ * 
+ * Excluded:
+ * 
+ *   Documentation
+ *   Custom properties
+ * 
+ */
+public class HoodieSchemaComparatorForSchemaEvolution {
+
+  protected HoodieSchemaComparatorForSchemaEvolution() {
+  }
+
+  private static final HoodieSchemaComparatorForSchemaEvolution VALIDATOR = new HoodieSchemaComparatorForSchemaEvolution();
+
+  public static boolean schemaEquals(HoodieSchema s1, HoodieSchema s2) {
+    return VALIDATOR.schemaEqualsInternal(s1, s2);
+  }
+
+  protected boolean schemaEqualsInternal(HoodieSchema s1, HoodieSchema s2) {
+    if (s1 == s2) {
+      return true;
+    }
+    if (s1 == null || s2 == null) {
+      return false;
+    }
+    if (s1.getType() != s2.getType()) {
+      return false;
+    }
+
+    switch (s1.getType()) {
+      case RECORD:
+        return recordSchemaEquals(s1, s2);
+      case ENUM:
+        return enumSchemaEquals(s1, s2);
+      case ARRAY:
+        return arraySchemaEquals(s1, s2);
+      case MAP:
+        return mapSchemaEquals(s1, s2);
+      case FIXED:
+        return fixedSchemaEquals(s1, s2);
+      case UNION:
+        return unionSchemaEquals(s1, s2);
+      case STRING:
+      case BYTES:
+      case INT:
+      case LONG:
+      case FLOAT:
+      case DOUBLE:
+      case BOOLEAN:
+      case NULL:
+        return primitiveSchemaEquals(s1, s2);
+      case DECIMAL:
+        return decimalSchemaEquals(s1, s2);
+      case TIME:
+        return timeSchemaEquals(s1, s2);
+      case TIMESTAMP:
+        return timestampSchemaEquals(s1, s2);
+      case DATE:
+      case UUID:
+        return logicalTypeSchemaEquals(s1, s2);

Review Comment:
   Can these two be moved up to use `primitiveSchemaEquals`?
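
As an illustration of the suggestion, here is a minimal, self-contained sketch of grouping case labels that share a handler by using fall-through. The `Kind` enum and `handlerFor` method are hypothetical stand-ins, not Hudi's `HoodieSchemaType` or the comparator's real methods. Whether DATE and UUID can safely share the primitive path depends on what `primitiveSchemaEquals` and `logicalTypeSchemaEquals` actually check, which the quoted hunk does not show.

```java
// Hypothetical sketch: names below are illustrative, not Hudi APIs.
final class SwitchGroupingSketch {

  // Stand-in for a schema type enum; only a few constants are modelled.
  enum Kind { RECORD, STRING, INT, DATE, UUID, DECIMAL }

  // Returns the name of the comparison routine that would handle the kind.
  static String handlerFor(Kind kind) {
    switch (kind) {
      case RECORD:
        return "recordSchemaEquals";
      case STRING:
      case INT:
      case DATE: // grouped with the primitive labels via fall-through
      case UUID:
        return "primitiveSchemaEquals";
      case DECIMAL:
        return "decimalSchemaEquals";
      default:
        throw new IllegalArgumentException("Unknown schema type: " + kind);
    }
  }
}
```

Stacking the labels keeps one `return` per handler, so adding a type to an existing group is a one-line change.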






Re: [PR] feat(schema): Phase 18 - HoodieAvroUtils removal (Part 1) [hudi]

2025-12-28 Thread via GitHub


hudi-bot commented on PR #17599:
URL: https://github.com/apache/hudi/pull/17599#issuecomment-3695491657

   
   ## CI report:
   
   * 9df0e72b27b6c2aad3ca2976cfbf703dd0ddb7ea UNKNOWN
   * 7cac8212f647bce559956f4dd2924b5456520f07 UNKNOWN
   * 1235700739b332034cea98850fda187c4e599c41 Azure: 
[SUCCESS](https://dev.azure.com/apachehudi/a1a51da7-8592-47d4-88dc-fd67bed336bb/_build/results?buildId=10627)
 
   
   



Re: [PR] feat(schema): Phase 18 - HoodieAvroUtils removal (Part 1) [hudi]

2025-12-28 Thread via GitHub


hudi-bot commented on PR #17599:
URL: https://github.com/apache/hudi/pull/17599#issuecomment-3695389844

   
   ## CI report:
   
   * 9df0e72b27b6c2aad3ca2976cfbf703dd0ddb7ea UNKNOWN
   * a7fd9fa3f96d88237f136a159344986053f8bab9 Azure: 
[SUCCESS](https://dev.azure.com/apachehudi/a1a51da7-8592-47d4-88dc-fd67bed336bb/_build/results?buildId=10599)
 
   * 7cac8212f647bce559956f4dd2924b5456520f07 UNKNOWN
   * 1235700739b332034cea98850fda187c4e599c41 Azure: 
[PENDING](https://dev.azure.com/apachehudi/a1a51da7-8592-47d4-88dc-fd67bed336bb/_build/results?buildId=10627)
 
   
   



Re: [PR] feat(schema): Phase 18 - HoodieAvroUtils removal (Part 1) [hudi]

2025-12-28 Thread via GitHub


hudi-bot commented on PR #17599:
URL: https://github.com/apache/hudi/pull/17599#issuecomment-3695371672

   
   ## CI report:
   
   * 9df0e72b27b6c2aad3ca2976cfbf703dd0ddb7ea UNKNOWN
   * a7fd9fa3f96d88237f136a159344986053f8bab9 Azure: 
[SUCCESS](https://dev.azure.com/apachehudi/a1a51da7-8592-47d4-88dc-fd67bed336bb/_build/results?buildId=10599)
 
   * 7cac8212f647bce559956f4dd2924b5456520f07 UNKNOWN
   * 1235700739b332034cea98850fda187c4e599c41 UNKNOWN
   
   



Re: [PR] feat(schema): Phase 18 - HoodieAvroUtils removal (Part 1) [hudi]

2025-12-28 Thread via GitHub


voonhous commented on code in PR #17599:
URL: https://github.com/apache/hudi/pull/17599#discussion_r2650141517


##
hudi-common/src/main/java/org/apache/hudi/common/schema/HoodieSchemaComparatorForSchemaEvolution.java:
##
@@ -0,0 +1,375 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *  http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hudi.common.schema;
+
+import java.util.List;
+import java.util.Set;
+import java.util.stream.Collectors;
+
+/**
+ * Defines equality comparison rules for HoodieSchema schemas for schema evolution purposes.
+ *
+ * This class provides schema comparison logic that focuses only on attributes that affect
+ * data readers/writers, ignoring metadata like documentation, namespace, and aliases which
+ * don't impact schema evolution compatibility.
+ *
+ * Common Rules Across All Types
+ * Included in equality check:
+ * 
+ *   Name/identifier
+ *   Type including primitive type, complex type (see below), and logical type
+ * 
+ * Excluded from equality check:
+ * 
+ *   Namespace
+ *   Documentation
+ *   Aliases
+ *   Custom properties
+ * 
+ *
+ * Type-Specific Rules
+ *
+ * Record
+ * Included:
+ * 
+ *   Field names
+ *   Field types
+ *   Field order attribute
+ *   Default values
+ * 
+ * Excluded:
+ * 
+ *   Field documentation
+ *   Field aliases
+ * 
+ *
+ * Enum
+ * Included:
+ * 
+ *   Name
+ *   Symbol order
+ *   Symbol value
+ * 
+ * Excluded:
+ * 
+ *   Custom properties
+ * 
+ *
+ * Array
+ * Included:
+ * 
+ *   Items schema
+ * 
+ * Excluded:
+ * 
+ *   Documentation
+ *   Custom properties
+ * 
+ *
+ * Map
+ * Included:
+ * 
+ *   Values schema
+ * 
+ * Excluded:
+ * 
+ *   Documentation
+ *   Custom properties
+ * 
+ *
+ * Fixed
+ * Included:
+ * 
+ *   Size
+ *   Name
+ * 
+ * Excluded:
+ * 
+ *   Namespace
+ *   Aliases
+ * 
+ *
+ * Union
+ * Included:
+ * 
+ *   Member types
+ * 
+ * Excluded:
+ * 
+ *   Member order
+ * 
+ *
+ * Logical Types
+ * Included:
+ * 
+ *   Logical type name (via schema subclass)
+ *   Underlying primitive type
+ *   Decimal precision/scale (if applicable)
+ *   Timestamp/Time precision (if applicable)
+ * 
+ * Excluded:
+ * 
+ *   Documentation
+ *   Custom properties
+ * 
+ */
+public class HoodieSchemaComparatorForSchemaEvolution {
+
+  protected HoodieSchemaComparatorForSchemaEvolution() {
+  }
+
+  private static final HoodieSchemaComparatorForSchemaEvolution VALIDATOR = new HoodieSchemaComparatorForSchemaEvolution();
+
+  public static boolean schemaEquals(HoodieSchema s1, HoodieSchema s2) {
+    return VALIDATOR.schemaEqualsInternal(s1, s2);
+  }
+
+  protected boolean schemaEqualsInternal(HoodieSchema s1, HoodieSchema s2) {
+    if (s1 == s2) {
+      return true;
+    }
+    if (s1 == null || s2 == null) {
+      return false;
+    }
+    if (s1.getType() != s2.getType()) {
+      return false;
+    }
+
+    switch (s1.getType()) {
+      case RECORD:
+        return recordSchemaEquals(s1, s2);
+      case ENUM:
+        return enumSchemaEquals(s1, s2);
+      case ARRAY:
+        return arraySchemaEquals(s1, s2);
+      case MAP:
+        return mapSchemaEquals(s1, s2);
+      case FIXED:
+        return fixedSchemaEquals(s1, s2);
+      case UNION:
+        return unionSchemaEquals(s1, s2);
+      case STRING:
+      case BYTES:
+      case INT:
+      case LONG:
+      case FLOAT:
+      case DOUBLE:
+      case BOOLEAN:
+      case NULL:
+      case DECIMAL:
+      case TIME:
+      case TIMESTAMP:
+      case DATE:
+      case UUID:
+        return primitiveSchemaEquals(s1, s2);
+      default:
+        throw new IllegalArgumentException("Unknown schema type: " + s1.getType());
+    }
+  }
+
+  protected boolean validateRecord(HoodieSchema s1, HoodieSchema s2) {
+    if (s1.isError() != s2.isError()) {
+      return false;
+    }
+
+    return logicalTypeSchemaEquals(s1, s2);
+  }
+
+  private boolean recordSchemaEquals(HoodieSchema s1, HoodieSchema s2) {
+    if (!validateRecord(s1, s2)) {
+      return false;
+    }
+
+    List fields1 = s1.getFields();
+    List fields2 = s2.getFields();
+
+    if (fields1.size() != fields2.size()) {
+      return false;
+    }
+
+    for (int i = 0; i < fields1.size(); i++) {
+      if (!fieldEquals(fields1.get(i), fields2.

Re: [PR] feat(schema): Phase 18 - HoodieAvroUtils removal (Part 1) [hudi]

2025-12-28 Thread via GitHub


hudi-bot commented on PR #17599:
URL: https://github.com/apache/hudi/pull/17599#issuecomment-3695369000

   
   ## CI report:
   
   * 9df0e72b27b6c2aad3ca2976cfbf703dd0ddb7ea UNKNOWN
   * a7fd9fa3f96d88237f136a159344986053f8bab9 Azure: [SUCCESS](https://dev.azure.com/apachehudi/a1a51da7-8592-47d4-88dc-fd67bed336bb/_build/results?buildId=10599)
   * 7cac8212f647bce559956f4dd2924b5456520f07 UNKNOWN
   
   
   Bot commands
 @hudi-bot supports the following commands:
   
- `@hudi-bot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



Re: [PR] feat(schema): Phase 18 - HoodieAvroUtils removal (Part 1) [hudi]

2025-12-28 Thread via GitHub


voonhous commented on code in PR #17599:
URL: https://github.com/apache/hudi/pull/17599#discussion_r2650139460


##
hudi-common/src/main/java/org/apache/hudi/common/schema/HoodieSchemaComparatorForSchemaEvolution.java:
##
@@ -0,0 +1,375 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *  http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hudi.common.schema;
+
+import java.util.List;
+import java.util.Set;
+import java.util.stream.Collectors;
+
+/**
+ * Defines equality comparison rules for HoodieSchema schemas for schema evolution purposes.
+ *
+ * This class provides schema comparison logic that focuses only on attributes that affect
+ * data readers/writers, ignoring metadata like documentation, namespace, and aliases which
+ * don't impact schema evolution compatibility.
+ *
+ * Common Rules Across All Types
+ * Included in equality check:
+ * 
+ *   Name/identifier
+ *   Type including primitive type, complex type (see below), and logical type
+ * 
+ * Excluded from equality check:
+ * 
+ *   Namespace
+ *   Documentation
+ *   Aliases
+ *   Custom properties
+ * 
+ *
+ * Type-Specific Rules
+ *
+ * Record
+ * Included:
+ * 
+ *   Field names
+ *   Field types
+ *   Field order attribute
+ *   Default values
+ * 
+ * Excluded:
+ * 
+ *   Field documentation
+ *   Field aliases
+ * 
+ *
+ * Enum
+ * Included:
+ * 
+ *   Name
+ *   Symbol order
+ *   Symbol value
+ * 
+ * Excluded:
+ * 
+ *   Custom properties
+ * 
+ *
+ * Array
+ * Included:
+ * 
+ *   Items schema
+ * 
+ * Excluded:
+ * 
+ *   Documentation
+ *   Custom properties
+ * 
+ *
+ * Map
+ * Included:
+ * 
+ *   Values schema
+ * 
+ * Excluded:
+ * 
+ *   Documentation
+ *   Custom properties
+ * 
+ *
+ * Fixed
+ * Included:
+ * 
+ *   Size
+ *   Name
+ * 
+ * Excluded:
+ * 
+ *   Namespace
+ *   Aliases
+ * 
+ *
+ * Union
+ * Included:
+ * 
+ *   Member types
+ * 
+ * Excluded:
+ * 
+ *   Member order
+ * 
+ *
+ * Logical Types
+ * Included:
+ * 
+ *   Logical type name (via schema subclass)
+ *   Underlying primitive type
+ *   Decimal precision/scale (if applicable)
+ *   Timestamp/Time precision (if applicable)
+ * 
+ * Excluded:
+ * 
+ *   Documentation
+ *   Custom properties
+ * 
+ */
+public class HoodieSchemaComparatorForSchemaEvolution {
+
+  protected HoodieSchemaComparatorForSchemaEvolution() {
+  }
+
+  private static final HoodieSchemaComparatorForSchemaEvolution VALIDATOR = new HoodieSchemaComparatorForSchemaEvolution();
+
+  public static boolean schemaEquals(HoodieSchema s1, HoodieSchema s2) {
+    return VALIDATOR.schemaEqualsInternal(s1, s2);
+  }
+
+  protected boolean schemaEqualsInternal(HoodieSchema s1, HoodieSchema s2) {
+    if (s1 == s2) {
+      return true;
+    }
+    if (s1 == null || s2 == null) {
+      return false;
+    }
+    if (s1.getType() != s2.getType()) {
+      return false;
+    }
+
+    switch (s1.getType()) {
+      case RECORD:
+        return recordSchemaEquals(s1, s2);
+      case ENUM:
+        return enumSchemaEquals(s1, s2);
+      case ARRAY:
+        return arraySchemaEquals(s1, s2);
+      case MAP:
+        return mapSchemaEquals(s1, s2);
+      case FIXED:
+        return fixedSchemaEquals(s1, s2);
+      case UNION:
+        return unionSchemaEquals(s1, s2);
+      case STRING:
+      case BYTES:
+      case INT:
+      case LONG:
+      case FLOAT:
+      case DOUBLE:
+      case BOOLEAN:
+      case NULL:
+      case DECIMAL:
+      case TIME:
+      case TIMESTAMP:
+      case DATE:
+      case UUID:
+        return primitiveSchemaEquals(s1, s2);
+      default:
+        throw new IllegalArgumentException("Unknown schema type: " + s1.getType());
+    }
+  }
+
+  protected boolean validateRecord(HoodieSchema s1, HoodieSchema s2) {
+    if (s1.isError() != s2.isError()) {
+      return false;
+    }
+
+    return logicalTypeSchemaEquals(s1, s2);
+  }
+
+  private boolean recordSchemaEquals(HoodieSchema s1, HoodieSchema s2) {
+    if (!validateRecord(s1, s2)) {
+      return false;
+    }
+
+    List fields1 = s1.getFields();
+    List fields2 = s2.getFields();
+
+    if (fields1.size() != fields2.size()) {
+      return false;
+    }
+
+    for (int i = 0; i < fields1.size(); i++) {
+      if (!fieldEquals(fields1.get(i), fields2.

Re: [PR] feat(schema): Phase 18 - HoodieAvroUtils removal (Part 1) [hudi]

2025-12-28 Thread via GitHub


the-other-tim-brown commented on code in PR #17599:
URL: https://github.com/apache/hudi/pull/17599#discussion_r2650135153


##
hudi-common/src/main/java/org/apache/hudi/common/schema/HoodieSchemaComparatorForSchemaEvolution.java:
##
@@ -0,0 +1,375 @@
+public class HoodieSchemaComparatorForSchemaEvolution {
+
+  protected HoodieSchemaComparatorForSchemaEvolution() {
+  }
+
+  private static final HoodieSchemaComparatorForSchemaEvolution VALIDATOR = new HoodieSchemaComparatorForSchemaEvolution();
+
+  public static boolean schemaEquals(HoodieSchema s1, HoodieSchema s2) {
+    return VALIDATOR.schemaEqualsInternal(s1, s2);
+  }
+
+  protected boolean schemaEqualsInternal(HoodieSchema s1, HoodieSchema s2) {
+    if (s1 == s2) {
+      return true;
+    }
+    if (s1 == null || s2 == null) {
+      return false;
+    }
+    if (s1.getType() != s2.getType()) {
+      return false;
+    }
+
+    switch (s1.getType()) {
+      case RECORD:
+        return recordSchemaEquals(s1, s2);
+      case ENUM:
+        return enumSchemaEquals(s1, s2);
+      case ARRAY:
+        return arraySchemaEquals(s1, s2);
+      case MAP:
+        return mapSchemaEquals(s1, s2);
+      case FIXED:
+        return fixedSchemaEquals(s1, s2);
+      case UNION:
+        return unionSchemaEquals(s1, s2);
+      case STRING:
+      case BYTES:
+      case INT:
+      case LONG:
+      case FLOAT:
+      case DOUBLE:
+      case BOOLEAN:
+      case NULL:
+      case DECIMAL:
+      case TIME:
+      case TIMESTAMP:
+      case DATE:
+      case UUID:
+        return primitiveSchemaEquals(s1, s2);
+      default:
+        throw new IllegalArgumentException("Unknown schema type: " + s1.getType());
+    }
+  }
+
+  protected boolean validateRecord(HoodieSchema s1, HoodieSchema s2) {
+    if (s1.isError() != s2.isError()) {
+      return false;
+    }
+
+    return logicalTypeSchemaEquals(s1, s2);
+  }
+
+  private boolean recordSchemaEquals(HoodieSchema s1, HoodieSchema s2) {
+    if (!validateRecord(s1, s2)) {
+      return false;
+    }
+
+    List fields1 = s1.getFields();
+    List fields2 = s2.getFields();
+
+    if (fields1.size() != fields2.size()) {
+      return false;
+    }
+
+    for (int i = 0; i < fields1.size(); i++) {
+      if (!fieldEquals(fields1.get(i

Re: [PR] feat(schema): Phase 18 - HoodieAvroUtils removal (Part 1) [hudi]

2025-12-28 Thread via GitHub


voonhous commented on code in PR #17599:
URL: https://github.com/apache/hudi/pull/17599#discussion_r2650116432


##
hudi-common/src/main/java/org/apache/hudi/common/schema/HoodieSchemaComparatorForSchemaEvolution.java:
##
@@ -0,0 +1,375 @@
+public class HoodieSchemaComparatorForSchemaEvolution {
+
+  protected HoodieSchemaComparatorForSchemaEvolution() {
+  }
+
+  private static final HoodieSchemaComparatorForSchemaEvolution VALIDATOR = new HoodieSchemaComparatorForSchemaEvolution();
+
+  public static boolean schemaEquals(HoodieSchema s1, HoodieSchema s2) {
+    return VALIDATOR.schemaEqualsInternal(s1, s2);
+  }
+
+  protected boolean schemaEqualsInternal(HoodieSchema s1, HoodieSchema s2) {
+    if (s1 == s2) {
+      return true;
+    }
+    if (s1 == null || s2 == null) {
+      return false;
+    }
+    if (s1.getType() != s2.getType()) {
+      return false;
+    }
+
+    switch (s1.getType()) {
+      case RECORD:
+        return recordSchemaEquals(s1, s2);
+      case ENUM:
+        return enumSchemaEquals(s1, s2);
+      case ARRAY:
+        return arraySchemaEquals(s1, s2);
+      case MAP:
+        return mapSchemaEquals(s1, s2);
+      case FIXED:
+        return fixedSchemaEquals(s1, s2);
+      case UNION:
+        return unionSchemaEquals(s1, s2);
+      case STRING:
+      case BYTES:
+      case INT:
+      case LONG:
+      case FLOAT:
+      case DOUBLE:
+      case BOOLEAN:
+      case NULL:
+      case DECIMAL:
+      case TIME:
+      case TIMESTAMP:
Review Comment:
   Done




Re: [PR] feat(schema): Phase 18 - HoodieAvroUtils removal (Part 1) [hudi]

2025-12-28 Thread via GitHub


voonhous commented on code in PR #17599:
URL: https://github.com/apache/hudi/pull/17599#discussion_r2650102815


##
hudi-common/src/main/java/org/apache/hudi/common/schema/HoodieSchemaComparatorForSchemaEvolution.java:
##
@@ -0,0 +1,375 @@
+public class HoodieSchemaComparatorForSchemaEvolution {
+
+  protected HoodieSchemaComparatorForSchemaEvolution() {
+  }
+
+  private static final HoodieSchemaComparatorForSchemaEvolution VALIDATOR = new HoodieSchemaComparatorForSchemaEvolution();
+
+  public static boolean schemaEquals(HoodieSchema s1, HoodieSchema s2) {
+    return VALIDATOR.schemaEqualsInternal(s1, s2);
+  }
+
+  protected boolean schemaEqualsInternal(HoodieSchema s1, HoodieSchema s2) {
+    if (s1 == s2) {
+      return true;
+    }
+    if (s1 == null || s2 == null) {
+      return false;
+    }
+    if (s1.getType() != s2.getType()) {
+      return false;
+    }
+
+    switch (s1.getType()) {
+      case RECORD:
+        return recordSchemaEquals(s1, s2);
+      case ENUM:
+        return enumSchemaEquals(s1, s2);
+      case ARRAY:
+        return arraySchemaEquals(s1, s2);
+      case MAP:
+        return mapSchemaEquals(s1, s2);
+      case FIXED:
+        return fixedSchemaEquals(s1, s2);
+      case UNION:
+        return unionSchemaEquals(s1, s2);
+      case STRING:
+      case BYTES:
+      case INT:
+      case LONG:
+      case FLOAT:
+      case DOUBLE:
+      case BOOLEAN:
+      case NULL:
+      case DECIMAL:
+      case TIME:
+      case TIMESTAMP:
Review Comment:
   Good catch, let me add that in and replace `logicalTypeEquals` accordingly.






Re: [PR] feat(schema): Phase 18 - HoodieAvroUtils removal (Part 1) [hudi]

2025-12-28 Thread via GitHub


the-other-tim-brown commented on code in PR #17599:
URL: https://github.com/apache/hudi/pull/17599#discussion_r2649832809


##
hudi-common/src/main/java/org/apache/hudi/common/schema/HoodieSchemaComparatorForSchemaEvolution.java:
##
@@ -0,0 +1,375 @@
+public class HoodieSchemaComparatorForSchemaEvolution {
+
+  protected HoodieSchemaComparatorForSchemaEvolution() {
+  }
+
+  private static final HoodieSchemaComparatorForSchemaEvolution VALIDATOR = new HoodieSchemaComparatorForSchemaEvolution();
+
+  public static boolean schemaEquals(HoodieSchema s1, HoodieSchema s2) {
+    return VALIDATOR.schemaEqualsInternal(s1, s2);
+  }
+
+  protected boolean schemaEqualsInternal(HoodieSchema s1, HoodieSchema s2) {
+    if (s1 == s2) {
+      return true;
+    }
+    if (s1 == null || s2 == null) {
+      return false;
+    }
+    if (s1.getType() != s2.getType()) {
+      return false;
+    }
+
+    switch (s1.getType()) {
+      case RECORD:
+        return recordSchemaEquals(s1, s2);
+      case ENUM:
+        return enumSchemaEquals(s1, s2);
+      case ARRAY:
+        return arraySchemaEquals(s1, s2);
+      case MAP:
+        return mapSchemaEquals(s1, s2);
+      case FIXED:
+        return fixedSchemaEquals(s1, s2);
+      case UNION:
+        return unionSchemaEquals(s1, s2);
+      case STRING:
+      case BYTES:
+      case INT:
+      case LONG:
+      case FLOAT:
+      case DOUBLE:
+      case BOOLEAN:
+      case NULL:
+      case DECIMAL:
+      case TIME:
+      case TIMESTAMP:
Review Comment:
   I think we should break up the logical type method. Now that the logical types are their own cases in the switch statement, we can perform the type-specific checks here.
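
One possible reading of this suggestion, as a minimal sketch only. `Type`, `TimePrecision`, `Schema` and their fields are stand-ins invented for illustration; they are not the HoodieSchema API, and the PR's actual change may look different. The point is that each logical type gets its own `case` with its own attribute check, instead of routing everything through a single logical-type helper.

```java
/** Stand-in types for illustration only; none of this is the real HoodieSchema API. */
final class LogicalTypeSwitchSketch {

  /** Hypothetical subset of schema types, with logical types as first-class cases. */
  enum Type { INT, LONG, STRING, DATE, UUID, DECIMAL, TIME, TIMESTAMP }

  /** Hypothetical precision marker for TIME and TIMESTAMP schemas. */
  enum TimePrecision { MILLIS, MICROS }

  /** Hypothetical schema holder carrying only the attributes the checks need. */
  static final class Schema {
    final Type type;
    final int precision;               // meaningful for DECIMAL only
    final int scale;                   // meaningful for DECIMAL only
    final TimePrecision timePrecision; // meaningful for TIME/TIMESTAMP only

    private Schema(Type type, int precision, int scale, TimePrecision timePrecision) {
      this.type = type;
      this.precision = precision;
      this.scale = scale;
      this.timePrecision = timePrecision;
    }

    static Schema primitive(Type type) {
      return new Schema(type, 0, 0, null);
    }

    static Schema decimal(int precision, int scale) {
      return new Schema(Type.DECIMAL, precision, scale, null);
    }

    static Schema timestamp(TimePrecision timePrecision) {
      return new Schema(Type.TIMESTAMP, 0, 0, timePrecision);
    }
  }

  static boolean schemaEquals(Schema s1, Schema s2) {
    if (s1 == s2) {
      return true;
    }
    if (s1 == null || s2 == null || s1.type != s2.type) {
      return false;
    }
    switch (s1.type) {
      case DECIMAL:
        // Decimal-specific attributes are compared in the decimal case itself.
        return s1.precision == s2.precision && s1.scale == s2.scale;
      case TIME:
      case TIMESTAMP:
        // Time-like types only differ by their precision.
        return s1.timePrecision == s2.timePrecision;
      case INT:
      case LONG:
      case STRING:
      case DATE:
      case UUID:
        // No extra attributes: matching type is enough.
        return true;
      default:
        throw new IllegalArgumentException("Unknown schema type: " + s1.type);
    }
  }

  public static void main(String[] args) {
    System.out.println(schemaEquals(Schema.decimal(10, 2), Schema.decimal(10, 2))); // true
    System.out.println(schemaEquals(Schema.decimal(10, 2), Schema.decimal(12, 2))); // false
    System.out.println(schemaEquals(
        Schema.timestamp(TimePrecision.MILLIS), Schema.timestamp(TimePrecision.MICROS))); // false
  }
}
```

With this shape, adding a new logical type means adding one `case` next to the check it needs, and the `default` branch still catches anything unhandled.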






Re: [PR] feat(schema): Phase 18 - HoodieAvroUtils removal (Part 1) [hudi]

2025-12-27 Thread via GitHub


hudi-bot commented on PR #17599:
URL: https://github.com/apache/hudi/pull/17599#issuecomment-3693892488

   
   ## CI report:
   
   * 9df0e72b27b6c2aad3ca2976cfbf703dd0ddb7ea UNKNOWN
   * a7fd9fa3f96d88237f136a159344986053f8bab9 Azure: [SUCCESS](https://dev.azure.com/apachehudi/a1a51da7-8592-47d4-88dc-fd67bed336bb/_build/results?buildId=10599)
   
   



Re: [PR] feat(schema): Phase 18 - HoodieAvroUtils removal (Part 1) [hudi]

2025-12-27 Thread via GitHub


hudi-bot commented on PR #17599:
URL: https://github.com/apache/hudi/pull/17599#issuecomment-3693842853

   
   ## CI report:
   
   * 9df0e72b27b6c2aad3ca2976cfbf703dd0ddb7ea UNKNOWN
   * 89da7651679c8eb9a539b60af960b485f7671bbf Azure: [FAILURE](https://dev.azure.com/apachehudi/a1a51da7-8592-47d4-88dc-fd67bed336bb/_build/results?buildId=10597)
   * a7fd9fa3f96d88237f136a159344986053f8bab9 Azure: [PENDING](https://dev.azure.com/apachehudi/a1a51da7-8592-47d4-88dc-fd67bed336bb/_build/results?buildId=10599)
   
   



Re: [PR] feat(schema): Phase 18 - HoodieAvroUtils removal (Part 1) [hudi]

2025-12-27 Thread via GitHub


hudi-bot commented on PR #17599:
URL: https://github.com/apache/hudi/pull/17599#issuecomment-3693828261

   
   ## CI report:
   
   * 9df0e72b27b6c2aad3ca2976cfbf703dd0ddb7ea UNKNOWN
   * 89da7651679c8eb9a539b60af960b485f7671bbf Azure: [FAILURE](https://dev.azure.com/apachehudi/a1a51da7-8592-47d4-88dc-fd67bed336bb/_build/results?buildId=10597)
   * a7fd9fa3f96d88237f136a159344986053f8bab9 UNKNOWN
   
   



Re: [PR] feat(schema): Phase 18 - HoodieAvroUtils removal (Part 1) [hudi]

2025-12-26 Thread via GitHub


hudi-bot commented on PR #17599:
URL: https://github.com/apache/hudi/pull/17599#issuecomment-3693699053

   
   ## CI report:
   
   * 9df0e72b27b6c2aad3ca2976cfbf703dd0ddb7ea UNKNOWN
   * 89da7651679c8eb9a539b60af960b485f7671bbf Azure: [FAILURE](https://dev.azure.com/apachehudi/a1a51da7-8592-47d4-88dc-fd67bed336bb/_build/results?buildId=10597)
   
   



Re: [PR] feat(schema): Phase 18 - HoodieAvroUtils removal (Part 1) [hudi]

2025-12-26 Thread via GitHub


hudi-bot commented on PR #17599:
URL: https://github.com/apache/hudi/pull/17599#issuecomment-3693674283

   
   ## CI report:
   
   * 9df0e72b27b6c2aad3ca2976cfbf703dd0ddb7ea UNKNOWN
   * 24b1721f45482b0ea22a14591b3de643dac9d17b Azure: [SUCCESS](https://dev.azure.com/apachehudi/a1a51da7-8592-47d4-88dc-fd67bed336bb/_build/results?buildId=10527)
   * 89da7651679c8eb9a539b60af960b485f7671bbf Azure: [PENDING](https://dev.azure.com/apachehudi/a1a51da7-8592-47d4-88dc-fd67bed336bb/_build/results?buildId=10597)
   
   



Re: [PR] feat(schema): Phase 18 - HoodieAvroUtils removal (Part 1) [hudi]

2025-12-26 Thread via GitHub


hudi-bot commented on PR #17599:
URL: https://github.com/apache/hudi/pull/17599#issuecomment-3693672824

   
   ## CI report:
   
   * 9df0e72b27b6c2aad3ca2976cfbf703dd0ddb7ea UNKNOWN
   * 24b1721f45482b0ea22a14591b3de643dac9d17b Azure: [SUCCESS](https://dev.azure.com/apachehudi/a1a51da7-8592-47d4-88dc-fd67bed336bb/_build/results?buildId=10527)
   * 89da7651679c8eb9a539b60af960b485f7671bbf UNKNOWN
   
   



Re: [PR] feat(schema): Phase 18 - HoodieAvroUtils removal (Part 1) [hudi]

2025-12-24 Thread via GitHub


hudi-bot commented on PR #17599:
URL: https://github.com/apache/hudi/pull/17599#issuecomment-3690493571

   
   ## CI report:
   
   * 9df0e72b27b6c2aad3ca2976cfbf703dd0ddb7ea UNKNOWN
   * 24b1721f45482b0ea22a14591b3de643dac9d17b Azure: [SUCCESS](https://dev.azure.com/apachehudi/a1a51da7-8592-47d4-88dc-fd67bed336bb/_build/results?buildId=10527)
   
   



Re: [PR] feat(schema): Phase 18 - HoodieAvroUtils removal (Part 1) [hudi]

2025-12-24 Thread via GitHub


hudi-bot commented on PR #17599:
URL: https://github.com/apache/hudi/pull/17599#issuecomment-3690374380

   
   ## CI report:
   
   * 9df0e72b27b6c2aad3ca2976cfbf703dd0ddb7ea UNKNOWN
   * 762b1e3f52b9a445f735fe65f035e22540ad0ac9 Azure: [FAILURE](https://dev.azure.com/apachehudi/a1a51da7-8592-47d4-88dc-fd67bed336bb/_build/results?buildId=10486)
   * 24b1721f45482b0ea22a14591b3de643dac9d17b Azure: [PENDING](https://dev.azure.com/apachehudi/a1a51da7-8592-47d4-88dc-fd67bed336bb/_build/results?buildId=10527)
   
   



Re: [PR] feat(schema): Phase 18 - HoodieAvroUtils removal (Part 1) [hudi]

2025-12-24 Thread via GitHub


hudi-bot commented on PR #17599:
URL: https://github.com/apache/hudi/pull/17599#issuecomment-3690368669

   
   ## CI report:
   
   * 9df0e72b27b6c2aad3ca2976cfbf703dd0ddb7ea UNKNOWN
   * 762b1e3f52b9a445f735fe65f035e22540ad0ac9 Azure: 
[FAILURE](https://dev.azure.com/apachehudi/a1a51da7-8592-47d4-88dc-fd67bed336bb/_build/results?buildId=10486)
 
   * 24b1721f45482b0ea22a14591b3de643dac9d17b UNKNOWN
   
   



Re: [PR] feat(schema): Phase 18 - HoodieAvroUtils removal (Part 1) [hudi]

2025-12-23 Thread via GitHub


hudi-bot commented on PR #17599:
URL: https://github.com/apache/hudi/pull/17599#issuecomment-3687889551

   
   ## CI report:
   
   * 9df0e72b27b6c2aad3ca2976cfbf703dd0ddb7ea UNKNOWN
   * 762b1e3f52b9a445f735fe65f035e22540ad0ac9 Azure: 
[FAILURE](https://dev.azure.com/apachehudi/a1a51da7-8592-47d4-88dc-fd67bed336bb/_build/results?buildId=10486)
 
   
   



Re: [PR] feat(schema): Phase 18 - HoodieAvroUtils removal (Part 1) [hudi]

2025-12-23 Thread via GitHub


hudi-bot commented on PR #17599:
URL: https://github.com/apache/hudi/pull/17599#issuecomment-3687711762

   
   ## CI report:
   
   * 9df0e72b27b6c2aad3ca2976cfbf703dd0ddb7ea UNKNOWN
   * 5a8eb8bf2cd40d4bcdf68810e908876bcc9ad31c Azure: 
[FAILURE](https://dev.azure.com/apachehudi/a1a51da7-8592-47d4-88dc-fd67bed336bb/_build/results?buildId=10479)
 
   * 762b1e3f52b9a445f735fe65f035e22540ad0ac9 Azure: 
[PENDING](https://dev.azure.com/apachehudi/a1a51da7-8592-47d4-88dc-fd67bed336bb/_build/results?buildId=10486)
 
   
   



Re: [PR] feat(schema): Phase 18 - HoodieAvroUtils removal (Part 1) [hudi]

2025-12-23 Thread via GitHub


hudi-bot commented on PR #17599:
URL: https://github.com/apache/hudi/pull/17599#issuecomment-3687676878

   
   ## CI report:
   
   * 9df0e72b27b6c2aad3ca2976cfbf703dd0ddb7ea UNKNOWN
   * 5a8eb8bf2cd40d4bcdf68810e908876bcc9ad31c Azure: 
[FAILURE](https://dev.azure.com/apachehudi/a1a51da7-8592-47d4-88dc-fd67bed336bb/_build/results?buildId=10479)
 
   * 762b1e3f52b9a445f735fe65f035e22540ad0ac9 UNKNOWN
   
   



Re: [PR] feat(schema): Phase 18 - HoodieAvroUtils removal (Part 1) [hudi]

2025-12-23 Thread via GitHub


hudi-bot commented on PR #17599:
URL: https://github.com/apache/hudi/pull/17599#issuecomment-3687432214

   
   ## CI report:
   
   * 9df0e72b27b6c2aad3ca2976cfbf703dd0ddb7ea UNKNOWN
   * 5a8eb8bf2cd40d4bcdf68810e908876bcc9ad31c Azure: 
[FAILURE](https://dev.azure.com/apachehudi/a1a51da7-8592-47d4-88dc-fd67bed336bb/_build/results?buildId=10479)
 
   
   



Re: [PR] feat(schema): Phase 18 - HoodieAvroUtils removal (Part 1) [hudi]

2025-12-23 Thread via GitHub


hudi-bot commented on PR #17599:
URL: https://github.com/apache/hudi/pull/17599#issuecomment-3687210459

   
   ## CI report:
   
   * 9df0e72b27b6c2aad3ca2976cfbf703dd0ddb7ea UNKNOWN
   * b8b088ecf9841a9e6d10a5603ab88a962b241244 Azure: 
[PENDING](https://dev.azure.com/apachehudi/a1a51da7-8592-47d4-88dc-fd67bed336bb/_build/results?buildId=10477)
 
   * 5a8eb8bf2cd40d4bcdf68810e908876bcc9ad31c Azure: 
[FAILURE](https://dev.azure.com/apachehudi/a1a51da7-8592-47d4-88dc-fd67bed336bb/_build/results?buildId=10479)
 
   
   



Re: [PR] feat(schema): Phase 18 - HoodieAvroUtils removal (Part 1) [hudi]

2025-12-23 Thread via GitHub


hudi-bot commented on PR #17599:
URL: https://github.com/apache/hudi/pull/17599#issuecomment-3687204754

   
   ## CI report:
   
   * 9df0e72b27b6c2aad3ca2976cfbf703dd0ddb7ea UNKNOWN
   * 5f6daa515a0064d20d5f186fd36e7afd402322f8 Azure: 
[FAILURE](https://dev.azure.com/apachehudi/a1a51da7-8592-47d4-88dc-fd67bed336bb/_build/results?buildId=10398)
 
   * b8b088ecf9841a9e6d10a5603ab88a962b241244 Azure: 
[PENDING](https://dev.azure.com/apachehudi/a1a51da7-8592-47d4-88dc-fd67bed336bb/_build/results?buildId=10477)
 
   * 5a8eb8bf2cd40d4bcdf68810e908876bcc9ad31c Azure: 
[PENDING](https://dev.azure.com/apachehudi/a1a51da7-8592-47d4-88dc-fd67bed336bb/_build/results?buildId=10479)
 
   
   



Re: [PR] feat(schema): Phase 18 - HoodieAvroUtils removal (Part 1) [hudi]

2025-12-23 Thread via GitHub


hudi-bot commented on PR #17599:
URL: https://github.com/apache/hudi/pull/17599#issuecomment-3687088391

   
   ## CI report:
   
   * 9df0e72b27b6c2aad3ca2976cfbf703dd0ddb7ea UNKNOWN
   * 5f6daa515a0064d20d5f186fd36e7afd402322f8 Azure: 
[FAILURE](https://dev.azure.com/apachehudi/a1a51da7-8592-47d4-88dc-fd67bed336bb/_build/results?buildId=10398)
 
   * b8b088ecf9841a9e6d10a5603ab88a962b241244 Azure: 
[PENDING](https://dev.azure.com/apachehudi/a1a51da7-8592-47d4-88dc-fd67bed336bb/_build/results?buildId=10477)
 
   * 5a8eb8bf2cd40d4bcdf68810e908876bcc9ad31c UNKNOWN
   
   



Re: [PR] feat(schema): Phase 18 - HoodieAvroUtils removal (Part 1) [hudi]

2025-12-23 Thread via GitHub


voonhous commented on code in PR #17599:
URL: https://github.com/apache/hudi/pull/17599#discussion_r2643583351


##
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/index/HoodieIndexUtils.java:
##
@@ -150,50 +149,38 @@ static boolean 
validateDataTypeForSecondaryIndex(List sourceFields, Sche
* @param tableSchema  table schema
* @return true if each field's data types are supported, false otherwise
*/
-  public static boolean 
validateDataTypeForSecondaryOrExpressionIndex(List sourceFields, Schema 
tableSchema) {
+  public static boolean 
validateDataTypeForSecondaryOrExpressionIndex(List sourceFields, 
HoodieSchema tableSchema) {
 return sourceFields.stream().anyMatch(fieldToIndex -> {
-  Schema schema = getNestedFieldSchemaFromWriteSchema(tableSchema, 
fieldToIndex);
-  return schema.getType() != Schema.Type.RECORD && schema.getType() != 
Schema.Type.ARRAY && schema.getType() != Schema.Type.MAP;
+  Option> nestedFieldOpt = 
HoodieSchemaUtils.getNestedField(tableSchema, fieldToIndex);
+  HoodieSchema fieldSchema = nestedFieldOpt.get().getRight().schema();
+  return fieldSchema.getType() != HoodieSchemaType.RECORD && 
fieldSchema.getType() != HoodieSchemaType.ARRAY && fieldSchema.getType() != 
HoodieSchemaType.MAP;
 });
   }
 
   /**
* Check if the given schema type is supported for secondary index.
* Supported types are: String (including CHAR), Integer types (Int, BigInt, 
Long, Short), and timestamp
*/
-  private static boolean isSecondaryIndexSupportedType(Schema schema) {
+  private static boolean isSecondaryIndexSupportedType(HoodieSchema schema) {
 // Handle union types (nullable fields)
-if (schema.getType() == Schema.Type.UNION) {
+if (schema.getType() == HoodieSchemaType.UNION) {
   // For union types, check if any of the types is supported
   return schema.getTypes().stream()
-  .anyMatch(s -> s.getType() != Schema.Type.NULL && 
isSecondaryIndexSupportedType(s));
+  .anyMatch(s -> s.getType() != HoodieSchemaType.NULL && 
isSecondaryIndexSupportedType(s));
 }
 
 // Check basic types
 switch (schema.getType()) {
   case STRING:
-// STRING type can have UUID logical type which we don't support
-return schema.getLogicalType() == null; // UUID and other string-based 
logical types are not supported
-  // Regular STRING (includes CHAR)
   case INT:
-// INT type can represent regular integers or dates/times with logical 
types
-if (schema.getLogicalType() != null) {
-  // Support date and time-millis logical types
-  return schema.getLogicalType() == LogicalTypes.date()
-  || schema.getLogicalType() == LogicalTypes.timeMillis();
-}
-return true; // Regular INT
   case LONG:
-// LONG type can represent regular longs or timestamps with logical 
types
-if (schema.getLogicalType() != null) {
-  // Support timestamp logical types
-  return schema.getLogicalType() == LogicalTypes.timestampMillis()
-  || schema.getLogicalType() == LogicalTypes.timestampMicros()
-  || schema.getLogicalType() == LogicalTypes.timeMicros();
-}
-return true; // Regular LONG
   case DOUBLE:
-return true; // Support DOUBLE type
+  case DATE:
+  case TIME:
+return true;
+  case TIMESTAMP:
+// LOCAL timestamps are not supported

Review Comment:
   Yep, we can. I'm just transferring the test expectations to actual code, 
since I don't recall seeing this documented anywhere other than in the tests 
here. (I have tagged you separately for this.)
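
   As a rough illustration of the rule being transferred from the tests, the 
sketch below models the type check with a hypothetical `FieldType` enum and a 
`localTimestamp` flag. Neither is Hudi's actual `HoodieSchemaType` API; the 
class and method names are assumptions made only for this example. The 
supported set mirrors what the diff and tests in this thread show: STRING, INT, 
LONG, DOUBLE, DATE and TIME are accepted, TIMESTAMP is accepted unless it is a 
local timestamp, and DECIMAL and UUID are rejected.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical stand-in for the secondary-index type check discussed above.
// This is not Hudi code; names are illustrative only.
final class SecondaryIndexTypeSketch {

  // Simplified type enum (assumption; Hudi's real enum is HoodieSchemaType).
  enum FieldType { STRING, INT, LONG, DOUBLE, DATE, TIME, TIMESTAMP, DECIMAL, UUID, RECORD, ARRAY, MAP }

  // Types that are accepted regardless of any further qualifier.
  private static final Set<FieldType> ALWAYS_SUPPORTED = EnumSet.of(
      FieldType.STRING, FieldType.INT, FieldType.LONG,
      FieldType.DOUBLE, FieldType.DATE, FieldType.TIME);

  // Returns true if the type may be used for a secondary index in this sketch.
  // The localTimestamp flag only matters for TIMESTAMP.
  static boolean isSupported(FieldType type, boolean localTimestamp) {
    if (type == FieldType.TIMESTAMP) {
      return !localTimestamp; // local timestamps are not supported
    }
    return ALWAYS_SUPPORTED.contains(type);
  }

  public static void main(String[] args) {
    System.out.println(isSupported(FieldType.TIMESTAMP, false));
    System.out.println(isSupported(FieldType.TIMESTAMP, true));
    System.out.println(isSupported(FieldType.UUID, false));
  }
}
```

   Keeping the local-timestamp case as an explicit branch makes the 
otherwise test-only expectation visible in the code path itself.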






Re: [PR] feat(schema): Phase 18 - HoodieAvroUtils removal (Part 1) [hudi]

2025-12-23 Thread via GitHub


voonhous commented on code in PR #17599:
URL: https://github.com/apache/hudi/pull/17599#discussion_r2643582770


##
hudi-client/hudi-client-common/src/test/java/org/apache/hudi/index/TestHoodieIndexUtils.java:
##
@@ -227,32 +218,31 @@ public void testValidateDataTypeForSecondaryIndex() {
   @Test
   public void testValidateDataTypeForSecondaryIndexWithLogicalTypes() {
 // Supported logical types
-Schema timestampMillis = 
LogicalTypes.timestampMillis().addToSchema(Schema.create(Schema.Type.LONG));
-Schema timestampMicros = 
LogicalTypes.timestampMicros().addToSchema(Schema.create(Schema.Type.LONG));
-Schema date = 
LogicalTypes.date().addToSchema(Schema.create(Schema.Type.INT));
-Schema timeMillis = 
LogicalTypes.timeMillis().addToSchema(Schema.create(Schema.Type.INT));
-Schema timeMicros = 
LogicalTypes.timeMicros().addToSchema(Schema.create(Schema.Type.LONG));
-
+HoodieSchema timestampMillis = HoodieSchema.createTimestampMillis();
+HoodieSchema timestampMicros = HoodieSchema.createTimestampMicros();
+HoodieSchema date = HoodieSchema.createDate();
+HoodieSchema timeMillis = HoodieSchema.createTimeMillis();
+HoodieSchema timeMicros = HoodieSchema.createTimeMicros();
+
 // Unsupported logical types
-Schema decimal = LogicalTypes.decimal(10, 
2).addToSchema(Schema.create(Schema.Type.BYTES));
-Schema uuid = 
LogicalTypes.uuid().addToSchema(Schema.create(Schema.Type.STRING));
-Schema localTimestampMillis = 
LogicalTypes.localTimestampMillis().addToSchema(Schema.create(Schema.Type.LONG));
-Schema localTimestampMicros = 
LogicalTypes.localTimestampMicros().addToSchema(Schema.create(Schema.Type.LONG));

Review Comment:
   @the-other-tim-brown The original tests expect local-timestamp to be 
unsupported.






Re: [PR] feat(schema): Phase 18 - HoodieAvroUtils removal (Part 1) [hudi]

2025-12-23 Thread via GitHub


voonhous commented on code in PR #17599:
URL: https://github.com/apache/hudi/pull/17599#discussion_r2643567911


##
hudi-common/src/main/java/org/apache/hudi/avro/AvroRecordContext.java:
##
@@ -70,22 +70,20 @@ public AvroRecordContext() {
   public static Object getFieldValueFromIndexedRecord(
   IndexedRecord record,
   String fieldName) {
-Schema currentSchema = record.getSchema();
+HoodieSchema currentSchema = 
HoodieSchema.fromAvroSchema(record.getSchema());
 IndexedRecord currentRecord = record;
 String[] path = fieldName.split("\\.");
 for (int i = 0; i < path.length; i++) {
-  if (currentSchema.isUnion()) {
-currentSchema = AvroSchemaUtils.getNonNullTypeFromUnion(currentSchema);
-  }
-  Schema.Field field = currentSchema.getField(path[i]);
-  if (field == null) {
+  currentSchema = currentSchema.getNonNullType();
+  Option fieldOpt = currentSchema.getField(path[i]);
+  if (fieldOpt.isEmpty()) {
 return null;
   }
-  Object value = currentRecord.get(field.pos());
+  Object value = currentRecord.get(fieldOpt.get().pos());

Review Comment:
   Done






Re: [PR] feat(schema): Phase 18 - HoodieAvroUtils removal (Part 1) [hudi]

2025-12-23 Thread via GitHub


voonhous commented on code in PR #17599:
URL: https://github.com/apache/hudi/pull/17599#discussion_r2643568881


##
hudi-hadoop-common/src/main/java/org/apache/hudi/common/util/AvroOrcUtils.java:
##
@@ -810,74 +812,65 @@ private static Schema getActualSchemaType(Schema 
unionSchema) {
 }
   }
 
-  public static Schema createAvroSchemaWithDefaultValue(TypeDescription 
orcSchema, String recordName, String namespace, boolean nullable) {
-Schema avroSchema = 
createAvroSchemaWithNamespace(orcSchema,recordName,namespace);
-List fields = new ArrayList();
-List fieldList = avroSchema.getFields();
-for (Field field : fieldList) {
-  Schema fieldSchema = field.schema();
-  Schema nullableSchema = 
Schema.createUnion(Schema.create(Schema.Type.NULL),fieldSchema);
+  public static HoodieSchema createSchemaWithDefaultValue(TypeDescription 
orcSchema, String recordName, String namespace, boolean nullable) {
+HoodieSchema hoodieSchema = 
createSchemaWithNamespace(orcSchema,recordName,namespace);
+List fields = new ArrayList<>();

Review Comment:
   Done!






Re: [PR] feat(schema): Phase 18 - HoodieAvroUtils removal (Part 1) [hudi]

2025-12-23 Thread via GitHub


voonhous commented on code in PR #17599:
URL: https://github.com/apache/hudi/pull/17599#discussion_r2643565949


##
hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/client/bootstrap/HoodieSparkBootstrapSchemaProvider.java:
##
@@ -85,10 +85,10 @@ private static Schema 
getBootstrapSourceSchemaParquet(HoodieWriteConfig writeCon
 String structName = tableName + "_record";
 String recordNamespace = "hoodie." + tableName;
 
-return AvroConversionUtils.convertStructTypeToAvroSchema(parquetSchema, 
structName, recordNamespace);
+return 
HoodieSchema.fromAvroSchema(AvroConversionUtils.convertStructTypeToAvroSchema(parquetSchema,
 structName, recordNamespace));

Review Comment:
   Done!






Re: [PR] feat(schema): Phase 18 - HoodieAvroUtils removal (Part 1) [hudi]

2025-12-23 Thread via GitHub


voonhous commented on code in PR #17599:
URL: https://github.com/apache/hudi/pull/17599#discussion_r2643560184


##
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/index/HoodieIndexUtils.java:
##
@@ -136,10 +135,10 @@ public static List 
getLatestBaseFilesForPartition(String partiti
* @param tableSchema  table schema
* @return true if each field's data type are supported for secondary index, 
false otherwise
*/
-  static boolean validateDataTypeForSecondaryIndex(List sourceFields, 
Schema tableSchema) {
+  static boolean validateDataTypeForSecondaryIndex(List sourceFields, 
HoodieSchema tableSchema) {
 return sourceFields.stream().allMatch(fieldToIndex -> {
-  Schema schema = getNestedFieldSchemaFromWriteSchema(tableSchema, 
fieldToIndex);
-  return isSecondaryIndexSupportedType(schema);
+  Option> schema = 
HoodieSchemaUtils.getNestedField(tableSchema, fieldToIndex);

Review Comment:
   I'd opt for throwing an error, as the original 
`HoodieAvroUtils#createHoodieWriteSchema` does, similar to the comment for 
line 154.
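
   A minimal sketch of the "throw instead of silently continuing" approach, 
assuming a schema modelled as a nested `Map` and `java.util.Optional` in place 
of Hudi's `Option`. The class `NestedFieldLookup` and its methods are 
hypothetical and exist only for this illustration; they are not the 
`HoodieSchemaUtils.getNestedField` implementation.

```java
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical stand-in for a schema: field name -> type name (String) or nested Map.
final class NestedFieldLookup {

  // Resolves a dot-separated path; returns empty if any segment is missing.
  static Optional<Object> findNestedField(Map<String, Object> schema, String path) {
    Object current = schema;
    for (String part : path.split("\\.")) {
      if (!(current instanceof Map)) {
        return Optional.empty(); // tried to descend into a non-record field
      }
      current = ((Map<?, ?>) current).get(part);
      if (current == null) {
        return Optional.empty(); // segment not present in the schema
      }
    }
    return Optional.of(current);
  }

  // Fails fast with a descriptive error when a source field cannot be resolved,
  // instead of calling get() on an empty Optional.
  static boolean allFieldsArePrimitive(List<String> sourceFields, Map<String, Object> schema) {
    return sourceFields.stream().allMatch(field -> {
      Object type = findNestedField(schema, field)
          .orElseThrow(() -> new IllegalArgumentException("Field not found in schema: " + field));
      return !(type instanceof Map); // nested records are not primitive
    });
  }

  public static void main(String[] args) {
    Map<String, Object> schema = Map.of("id", "long", "address", Map.of("city", "string"));
    System.out.println(allFieldsArePrimitive(List.of("id", "address.city"), schema));
  }
}
```

   With this shape, a misconfigured field name surfaces as an 
`IllegalArgumentException` that names the field, rather than as a bare 
failure from unwrapping an empty value.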






Re: [PR] feat(schema): Phase 18 - HoodieAvroUtils removal (Part 1) [hudi]

2025-12-23 Thread via GitHub


voonhous commented on code in PR #17599:
URL: https://github.com/apache/hudi/pull/17599#discussion_r2643559024


##
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/index/HoodieIndexUtils.java:
##
@@ -150,50 +149,38 @@ static boolean 
validateDataTypeForSecondaryIndex(List sourceFields, Sche
* @param tableSchema  table schema
* @return true if each field's data types are supported, false otherwise
*/
-  public static boolean 
validateDataTypeForSecondaryOrExpressionIndex(List sourceFields, Schema 
tableSchema) {
+  public static boolean 
validateDataTypeForSecondaryOrExpressionIndex(List sourceFields, 
HoodieSchema tableSchema) {
 return sourceFields.stream().anyMatch(fieldToIndex -> {
-  Schema schema = getNestedFieldSchemaFromWriteSchema(tableSchema, 
fieldToIndex);
-  return schema.getType() != Schema.Type.RECORD && schema.getType() != 
Schema.Type.ARRAY && schema.getType() != Schema.Type.MAP;
+  Option> nestedFieldOpt = 
HoodieSchemaUtils.getNestedField(tableSchema, fieldToIndex);

Review Comment:
   Done!



##
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/BaseHoodieWriteClient.java:
##
@@ -341,15 +341,15 @@ private void saveInternalSchema(HoodieTable table, String 
instantTime, HoodieCom
 FileBasedInternalSchemaStorageManager schemasManager = new 
FileBasedInternalSchemaStorageManager(table.getMetaClient());
 if (!historySchemaStr.isEmpty() || 
Boolean.parseBoolean(config.getString(HoodieCommonConfig.RECONCILE_SCHEMA.key(
 {
   InternalSchema internalSchema;
-  Schema avroSchema = 
HoodieAvroUtils.createHoodieWriteSchema(config.getSchema(), 
config.allowOperationMetadataField());
+  HoodieSchema schema = 
HoodieSchemaUtils.addMetadataFields(HoodieSchema.parse(config.getSchema()), 
config.allowOperationMetadataField());

Review Comment:
   I'd opt for throwing an error like what the original 
`HoodieAvroUtils#createHoodieWriteSchema` does. 






Re: [PR] feat(schema): Phase 18 - HoodieAvroUtils removal (Part 1) [hudi]

2025-12-23 Thread via GitHub


voonhous commented on code in PR #17599:
URL: https://github.com/apache/hudi/pull/17599#discussion_r2643555351


##
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/BaseHoodieWriteClient.java:
##
@@ -341,15 +341,15 @@ private void saveInternalSchema(HoodieTable table, String 
instantTime, HoodieCom
 FileBasedInternalSchemaStorageManager schemasManager = new 
FileBasedInternalSchemaStorageManager(table.getMetaClient());
 if (!historySchemaStr.isEmpty() || 
Boolean.parseBoolean(config.getString(HoodieCommonConfig.RECONCILE_SCHEMA.key(
 {
   InternalSchema internalSchema;
-  Schema avroSchema = 
HoodieAvroUtils.createHoodieWriteSchema(config.getSchema(), 
config.allowOperationMetadataField());
+  HoodieSchema schema = 
HoodieSchemaUtils.addMetadataFields(HoodieSchema.parse(config.getSchema()), 
config.allowOperationMetadataField());

Review Comment:
   I'd opt for throwing an error like what the original 
`HoodieAvroUtils#createHoodieWriteSchema` does. 






Re: [PR] feat(schema): Phase 18 - HoodieAvroUtils removal (Part 1) [hudi]

2025-12-23 Thread via GitHub


voonhous commented on code in PR #17599:
URL: https://github.com/apache/hudi/pull/17599#discussion_r2643546857


##
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/index/HoodieIndexUtils.java:
##
@@ -136,10 +135,10 @@ public static List 
getLatestBaseFilesForPartition(String partiti
* @param tableSchema  table schema
* @return true if each field's data type are supported for secondary index, 
false otherwise
*/
-  static boolean validateDataTypeForSecondaryIndex(List sourceFields, 
Schema tableSchema) {
+  static boolean validateDataTypeForSecondaryIndex(List sourceFields, 
HoodieSchema tableSchema) {
 return sourceFields.stream().allMatch(fieldToIndex -> {
-  Schema schema = getNestedFieldSchemaFromWriteSchema(tableSchema, 
fieldToIndex);
-  return isSecondaryIndexSupportedType(schema);
+  Option> schema = 
HoodieSchemaUtils.getNestedField(tableSchema, fieldToIndex);

Review Comment:
   Yeap, no harm being more defensive here.






Re: [PR] feat(schema): Phase 18 - HoodieAvroUtils removal (Part 1) [hudi]

2025-12-23 Thread via GitHub


voonhous commented on code in PR #17599:
URL: https://github.com/apache/hudi/pull/17599#discussion_r2643540177


##
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/BaseHoodieWriteClient.java:
##
@@ -341,15 +341,15 @@ private void saveInternalSchema(HoodieTable table, String 
instantTime, HoodieCom
 FileBasedInternalSchemaStorageManager schemasManager = new 
FileBasedInternalSchemaStorageManager(table.getMetaClient());
 if (!historySchemaStr.isEmpty() || 
Boolean.parseBoolean(config.getString(HoodieCommonConfig.RECONCILE_SCHEMA.key(
 {
   InternalSchema internalSchema;
-  Schema avroSchema = 
HoodieAvroUtils.createHoodieWriteSchema(config.getSchema(), 
config.allowOperationMetadataField());
+  HoodieSchema schema = 
HoodieSchemaUtils.addMetadataFields(HoodieSchema.parse(config.getSchema()), 
config.allowOperationMetadataField());

Review Comment:
   Yep, definitely!






Re: [PR] feat(schema): Phase 18 - HoodieAvroUtils removal (Part 1) [hudi]

2025-12-23 Thread via GitHub


voonhous commented on code in PR #17599:
URL: https://github.com/apache/hudi/pull/17599#discussion_r2643533136


##
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/transaction/ConcurrentSchemaEvolutionTableSchemaGetter.java:
##
@@ -89,12 +89,16 @@ private HoodieSchema 
handlePartitionColumnsIfNeeded(HoodieSchema schema) {
 return schema;
   }
 
-  public Option getTableAvroSchemaIfPresent(boolean 
includeMetadataFields, Option instant) {
+  public Option getTableSchemaIfPresent(boolean 
includeMetadataFields, Option instant) {
 return getTableAvroSchemaFromTimelineWithCache(instant) // Get table 
schema from schema evolution timeline.
 .map(HoodieSchema::fromAvroSchema)
 .or(this::getTableCreateSchemaWithoutMetaField) // Fall back: read 
create schema from table config.
 .map(tableSchema -> includeMetadataFields ? 
HoodieSchemaUtils.addMetadataFields(tableSchema, false) : 
HoodieSchemaUtils.removeMetadataFields(tableSchema))
-.map(this::handlePartitionColumnsIfNeeded)
+.map(this::handlePartitionColumnsIfNeeded);
+  }
+
+  public Option getTableAvroSchemaIfPresent(boolean 
includeMetadataFields, Option instant) {

Review Comment:
   Will also do an Avro.Schema -> HoodieSchema migration for this class.






Re: [PR] feat(schema): Phase 18 - HoodieAvroUtils removal (Part 1) [hudi]

2025-12-23 Thread via GitHub


hudi-bot commented on PR #17599:
URL: https://github.com/apache/hudi/pull/17599#issuecomment-3686996934

   
   ## CI report:
   
   * 9df0e72b27b6c2aad3ca2976cfbf703dd0ddb7ea UNKNOWN
   * 5f6daa515a0064d20d5f186fd36e7afd402322f8 Azure: 
[FAILURE](https://dev.azure.com/apachehudi/a1a51da7-8592-47d4-88dc-fd67bed336bb/_build/results?buildId=10398)
 
   * b8b088ecf9841a9e6d10a5603ab88a962b241244 Azure: 
[PENDING](https://dev.azure.com/apachehudi/a1a51da7-8592-47d4-88dc-fd67bed336bb/_build/results?buildId=10477)
 
   
   



Re: [PR] feat(schema): Phase 18 - HoodieAvroUtils removal (Part 1) [hudi]

2025-12-23 Thread via GitHub


voonhous commented on code in PR #17599:
URL: https://github.com/apache/hudi/pull/17599#discussion_r2643520096


##
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/transaction/ConcurrentSchemaEvolutionTableSchemaGetter.java:
##
@@ -89,12 +89,16 @@ private HoodieSchema 
handlePartitionColumnsIfNeeded(HoodieSchema schema) {
 return schema;
   }
 
-  public Option getTableAvroSchemaIfPresent(boolean 
includeMetadataFields, Option instant) {
+  public Option getTableSchemaIfPresent(boolean 
includeMetadataFields, Option instant) {
 return getTableAvroSchemaFromTimelineWithCache(instant) // Get table 
schema from schema evolution timeline.
 .map(HoodieSchema::fromAvroSchema)
 .or(this::getTableCreateSchemaWithoutMetaField) // Fall back: read 
create schema from table config.
 .map(tableSchema -> includeMetadataFields ? 
HoodieSchemaUtils.addMetadataFields(tableSchema, false) : 
HoodieSchemaUtils.removeMetadataFields(tableSchema))
-.map(this::handlePartitionColumnsIfNeeded)
+.map(this::handlePartitionColumnsIfNeeded);
+  }
+
+  public Option getTableAvroSchemaIfPresent(boolean 
includeMetadataFields, Option instant) {

Review Comment:
   Yes, will remove this directly.






Re: [PR] feat(schema): Phase 18 - HoodieAvroUtils removal (Part 1) [hudi]

2025-12-23 Thread via GitHub


voonhous commented on code in PR #17599:
URL: https://github.com/apache/hudi/pull/17599#discussion_r2643520096


##
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/transaction/ConcurrentSchemaEvolutionTableSchemaGetter.java:
##
@@ -89,12 +89,16 @@ private HoodieSchema 
handlePartitionColumnsIfNeeded(HoodieSchema schema) {
 return schema;
   }
 
-  public Option getTableAvroSchemaIfPresent(boolean 
includeMetadataFields, Option instant) {
+  public Option getTableSchemaIfPresent(boolean 
includeMetadataFields, Option instant) {
 return getTableAvroSchemaFromTimelineWithCache(instant) // Get table 
schema from schema evolution timeline.
 .map(HoodieSchema::fromAvroSchema)
 .or(this::getTableCreateSchemaWithoutMetaField) // Fall back: read 
create schema from table config.
 .map(tableSchema -> includeMetadataFields ? 
HoodieSchemaUtils.addMetadataFields(tableSchema, false) : 
HoodieSchemaUtils.removeMetadataFields(tableSchema))
-.map(this::handlePartitionColumnsIfNeeded)
+.map(this::handlePartitionColumnsIfNeeded);
+  }
+
+  public Option getTableAvroSchemaIfPresent(boolean 
includeMetadataFields, Option instant) {

Review Comment:
   Yes, will do. 






Re: [PR] feat(schema): Phase 18 - HoodieAvroUtils removal (Part 1) [hudi]

2025-12-23 Thread via GitHub


hudi-bot commented on PR #17599:
URL: https://github.com/apache/hudi/pull/17599#issuecomment-3686897647

   
   ## CI report:
   
   * 9df0e72b27b6c2aad3ca2976cfbf703dd0ddb7ea UNKNOWN
   * 5f6daa515a0064d20d5f186fd36e7afd402322f8 Azure: [FAILURE](https://dev.azure.com/apachehudi/a1a51da7-8592-47d4-88dc-fd67bed336bb/_build/results?buildId=10398)
   * b8b088ecf9841a9e6d10a5603ab88a962b241244 UNKNOWN
   
   
   





Re: [PR] feat(schema): Phase 18 - HoodieAvroUtils removal (Part 1) [hudi]

2025-12-23 Thread via GitHub


voonhous commented on PR #17599:
URL: https://github.com/apache/hudi/pull/17599#issuecomment-3686897886

   Note: this is a stacked PR; its base needs to be changed after 
https://github.com/apache/hudi/pull/17581 is merged.





Re: [PR] feat(schema): Phase 18 - HoodieAvroUtils removal (Part 1) [hudi]

2025-12-21 Thread via GitHub


the-other-tim-brown commented on code in PR #17599:
URL: https://github.com/apache/hudi/pull/17599#discussion_r2638178932


##
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/BaseHoodieWriteClient.java:
##
@@ -341,15 +341,15 @@ private void saveInternalSchema(HoodieTable table, String instantTime, HoodieCom
 FileBasedInternalSchemaStorageManager schemasManager = new FileBasedInternalSchemaStorageManager(table.getMetaClient());
 if (!historySchemaStr.isEmpty() || Boolean.parseBoolean(config.getString(HoodieCommonConfig.RECONCILE_SCHEMA.key()))) {
   InternalSchema internalSchema;
-  Schema avroSchema = HoodieAvroUtils.createHoodieWriteSchema(config.getSchema(), config.allowOperationMetadataField());
+  HoodieSchema schema = HoodieSchemaUtils.addMetadataFields(HoodieSchema.parse(config.getSchema()), config.allowOperationMetadataField());

Review Comment:
   Should we just use the `HoodieSchemaUtils#createHoodieWriteSchema` here?



##
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/index/HoodieIndexUtils.java:
##
@@ -136,10 +135,10 @@ public static List getLatestBaseFilesForPartition(String partiti
* @param tableSchema  table schema
* @return true if each field's data type are supported for secondary index, false otherwise
*/
-  static boolean validateDataTypeForSecondaryIndex(List sourceFields, Schema tableSchema) {
+  static boolean validateDataTypeForSecondaryIndex(List sourceFields, HoodieSchema tableSchema) {
 return sourceFields.stream().allMatch(fieldToIndex -> {
-  Schema schema = getNestedFieldSchemaFromWriteSchema(tableSchema, fieldToIndex);
-  return isSecondaryIndexSupportedType(schema);
+  Option> schema = HoodieSchemaUtils.getNestedField(tableSchema, fieldToIndex);

Review Comment:
   If the option is empty, should we return false here?
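
   A minimal sketch of the "return false on empty" behaviour suggested above. It is not Hudi code: `java.util.Optional` stands in for Hudi's `Option`, a plain `Map` stands in for the table schema, and `lookupType`, `SUPPORTED`, and the class name are hypothetical names used only for illustration.

```java
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.Set;

// Hypothetical stand-in for the secondary-index type check; not Hudi code.
class SecondaryIndexTypeCheck {
  // Types assumed, for illustration only, to be indexable.
  private static final Set<String> SUPPORTED = Set.of("string", "int", "long");

  // Stand-in for a nested-field lookup: empty when the field is not in the schema.
  static Optional<String> lookupType(Map<String, String> schema, String field) {
    return Optional.ofNullable(schema.get(field));
  }

  // A field that cannot be resolved fails validation instead of throwing.
  static boolean validate(List<String> sourceFields, Map<String, String> schema) {
    return sourceFields.stream().allMatch(field ->
        lookupType(schema, field).map(SUPPORTED::contains).orElse(false));
  }

  public static void main(String[] args) {
    Map<String, String> schema = Map.of("id", "long", "name", "string", "tags", "array");
    System.out.println(validate(List.of("id", "name"), schema));    // prints true
    System.out.println(validate(List.of("id", "missing"), schema)); // prints false
    System.out.println(validate(List.of("tags"), schema));          // prints false
  }
}
```

   With `allMatch`, a single unresolved field is enough to make the whole check return false, which is the outcome the question asks about.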



##
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/transaction/ConcurrentSchemaEvolutionTableSchemaGetter.java:
##
@@ -89,12 +89,16 @@ private HoodieSchema handlePartitionColumnsIfNeeded(HoodieSchema schema) {
 return schema;
   }
 
-  public Option getTableAvroSchemaIfPresent(boolean includeMetadataFields, Option instant) {
+  public Option getTableSchemaIfPresent(boolean includeMetadataFields, Option instant) {
 return getTableAvroSchemaFromTimelineWithCache(instant) // Get table schema from schema evolution timeline.
 .map(HoodieSchema::fromAvroSchema)
 .or(this::getTableCreateSchemaWithoutMetaField) // Fall back: read create schema from table config.
 .map(tableSchema -> includeMetadataFields ? HoodieSchemaUtils.addMetadataFields(tableSchema, false) : HoodieSchemaUtils.removeMetadataFields(tableSchema))
-.map(this::handlePartitionColumnsIfNeeded)
+.map(this::handlePartitionColumnsIfNeeded);
+  }
+
+  public Option getTableAvroSchemaIfPresent(boolean includeMetadataFields, Option instant) {

Review Comment:
   Should we mark this as deprecated?
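
   For reference, a sketch of what the deprecation option raised here usually looks like in Java: the old method stays as a thin delegate, annotated with `@Deprecated` and a Javadoc `@deprecated` tag pointing at its replacement. The class below is a hypothetical stand-in (a `String` schema and `java.util.Optional` instead of Hudi's types), not the actual `ConcurrentSchemaEvolutionTableSchemaGetter`.

```java
import java.util.Optional;

// Hypothetical stand-in class; the real getter has a different signature and types.
class SchemaGetterSketch {
  private final String storedSchema; // null models "no schema available"

  SchemaGetterSketch(String storedSchema) {
    this.storedSchema = storedSchema;
  }

  // New entry point that callers should migrate to.
  Optional<String> getTableSchemaIfPresent(boolean includeMetadataFields) {
    return Optional.ofNullable(storedSchema)
        .map(s -> includeMetadataFields ? "meta+" + s : s);
  }

  /**
   * @deprecated use {@link #getTableSchemaIfPresent(boolean)} instead.
   */
  @Deprecated
  Optional<String> getTableAvroSchemaIfPresent(boolean includeMetadataFields) {
    // Delegate so both methods stay behaviourally identical until removal.
    return getTableSchemaIfPresent(includeMetadataFields);
  }
}
```

   Keeping the old method as a delegate lets existing callers compile with a warning; removing it outright forces them to migrate in the same change.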



##
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/index/HoodieIndexUtils.java:
##
@@ -150,50 +149,38 @@ static boolean validateDataTypeForSecondaryIndex(List sourceFields, Sche
* @param tableSchema  table schema
* @return true if each field's data types are supported, false otherwise
*/
-  public static boolean validateDataTypeForSecondaryOrExpressionIndex(List sourceFields, Schema tableSchema) {
+  public static boolean validateDataTypeForSecondaryOrExpressionIndex(List sourceFields, HoodieSchema tableSchema) {
 return sourceFields.stream().anyMatch(fieldToIndex -> {
-  Schema schema = getNestedFieldSchemaFromWriteSchema(tableSchema, fieldToIndex);
-  return schema.getType() != Schema.Type.RECORD && schema.getType() != Schema.Type.ARRAY && schema.getType() != Schema.Type.MAP;
+  Option> nestedFieldOpt = HoodieSchemaUtils.getNestedField(tableSchema, fieldToIndex);

Review Comment:
   Let's throw an exception if the option is not present?
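
   A minimal sketch of the "throw if absent" variant suggested above, again with `java.util.Optional` and a `Map` standing in for Hudi's `Option` and `HoodieSchema`. The class name, `lookupType`, the exception type, and the message are all assumptions made for illustration; the real code may use a Hudi-specific exception.

```java
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical stand-in for the secondary/expression index check; not Hudi code.
class IndexFieldValidation {
  // Stand-in for a nested-field lookup: empty when the field is not in the schema.
  static Optional<String> lookupType(Map<String, String> schema, String field) {
    return Optional.ofNullable(schema.get(field));
  }

  // Throws when a field cannot be resolved; otherwise rejects complex types.
  static boolean validate(List<String> sourceFields, Map<String, String> schema) {
    return sourceFields.stream().anyMatch(field -> {
      String type = lookupType(schema, field).orElseThrow(() ->
          new IllegalArgumentException("Field not found in table schema: " + field));
      return !type.equals("record") && !type.equals("array") && !type.equals("map");
    });
  }
}
```

   Because the quoted method uses `anyMatch`, evaluation stops at the first field that passes, so a missing field listed after a valid one would never be looked up and would not trigger the exception.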



##
hudi-hadoop-common/src/main/java/org/apache/hudi/common/util/AvroOrcUtils.java:
##
@@ -810,74 +812,65 @@ private static Schema getActualSchemaType(Schema unionSchema) {
 }
   }
 
-  public static Schema createAvroSchemaWithDefaultValue(TypeDescription orcSchema, String recordName, String namespace, boolean nullable) {
-Schema avroSchema = createAvroSchemaWithNamespace(orcSchema,recordName,namespace);
-List fields = new ArrayList();
-List fieldList = avroSchema.getFields();
-for (Field field : fieldList) {
-  Schema fieldSchema = field.schema();
-  Schema nullableSchema = Schema.createUnion(Schema.create(Schema.Type.NULL),fieldSchema);
+  public static HoodieSchema createSchemaWithDefaultValue(TypeDescription orcSchema, String recordName, String namespace, boolean nullable) {
+HoodieSchema hoodieSchema = createSchemaWithNamespace(orcSchema,

Re: [PR] feat(schema): Phase 18 - HoodieAvroUtils removal (Part 1) [hudi]

2025-12-20 Thread via GitHub


hudi-bot commented on PR #17599:
URL: https://github.com/apache/hudi/pull/17599#issuecomment-3678048583

   
   ## CI report:
   
   * 9df0e72b27b6c2aad3ca2976cfbf703dd0ddb7ea UNKNOWN
   * 5f6daa515a0064d20d5f186fd36e7afd402322f8 Azure: [FAILURE](https://dev.azure.com/apachehudi/a1a51da7-8592-47d4-88dc-fd67bed336bb/_build/results?buildId=10398)
   
   
   




