the-other-tim-brown commented on code in PR #14340:
URL: https://github.com/apache/hudi/pull/14340#discussion_r2596695958


##########
hudi-common/src/main/java/org/apache/hudi/common/table/log/block/HoodieHFileDataBlock.java:
##########
@@ -115,8 +114,8 @@ protected <T> ClosableIterator<HoodieRecord<T>> deserializeRecords(byte[] conten
         .getReaderFactory(HoodieRecordType.AVRO)
         .getContentReader(ConfigUtils.DEFAULT_HUDI_CONFIG_FOR_READER, pathForReader, HoodieFileFormat.HFILE,
             //TODO boundary to revisit in later pr to use HoodieSchema directly

Review Comment:
   Let's remove this comment along with the other "boundary" comments in this file



##########
hudi-hadoop-common/src/test/java/org/apache/hudi/common/model/TestHoodieAvroIndexedRecord.java:
##########
@@ -36,21 +39,22 @@ class TestHoodieAvroIndexedRecord {
 
   @Test
   void testIsBuiltInDelete() {
-    Schema schema = SchemaBuilder.record("TestRecord")
-        .fields()
-        .optionalBoolean("_hoodie_is_deleted")
-        .requiredString("field2")
-        .endRecord();
-    GenericRecord record1 = new GenericRecordBuilder(schema)
+    HoodieSchema hoodieSchema = HoodieSchema.createRecord("TestRecord", null, null,
+        Arrays.asList(
+            HoodieSchemaField.of("_hoodie_is_deleted",
+                HoodieSchema.createNullable(HoodieSchema.create(HoodieSchemaType.BOOLEAN)), null, HoodieSchema.NULL_VALUE),

Review Comment:
   `createNullable` also takes a `HoodieSchemaType` directly, which would make test code like this less verbose
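
   For instance, a sketch of the two forms (assuming the `HoodieSchemaType` overload of `createNullable` mentioned above; exact signatures are defined by this PR's `HoodieSchema` API):

   ```java
   // Verbose form currently in the diff:
   HoodieSchema.createNullable(HoodieSchema.create(HoodieSchemaType.BOOLEAN))
   // Less verbose, using the HoodieSchemaType overload:
   HoodieSchema.createNullable(HoodieSchemaType.BOOLEAN)
   ```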



##########
hudi-flink-datasource/hudi-flink/src/test/java/org/apache/hudi/table/TestHoodieFileGroupReaderOnFlink.java:
##########
@@ -204,13 +208,14 @@ public void testGetOrderingValue() {
     when(tableConfig.populateMetaFields()).thenReturn(true);
     FlinkRowDataReaderContext readerContext =
         new FlinkRowDataReaderContext(getStorageConf(), () -> InternalSchemaManager.DISABLED, Collections.emptyList(), tableConfig, Option.empty());
-    Schema schema = SchemaBuilder.builder()
-        .record("test")
-        .fields()
-        .requiredString("field1")
-        .optionalString("field2")
-        .optionalLong("ts")
-        .endRecord();
+    HoodieSchema schema = HoodieSchema.createRecord("test", null, null,
+        Arrays.asList(
            HoodieSchemaField.of("field1", HoodieSchema.create(HoodieSchemaType.STRING)),
            HoodieSchemaField.of("field2", HoodieSchema.createUnion(HoodieSchema.create(HoodieSchemaType.NULL), HoodieSchema.create(HoodieSchemaType.STRING)),

Review Comment:
   We could simplify with `createNullable` here?
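
   Something along these lines (a sketch; `createNullable` is the helper referenced elsewhere in this review, and a union of NULL and STRING is the standard Avro encoding of a nullable string):

   ```java
   // Current form in the diff:
   HoodieSchemaField.of("field2", HoodieSchema.createUnion(HoodieSchema.create(HoodieSchemaType.NULL), HoodieSchema.create(HoodieSchemaType.STRING)))
   // Simplified with createNullable:
   HoodieSchemaField.of("field2", HoodieSchema.createNullable(HoodieSchema.create(HoodieSchemaType.STRING)))
   ```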



##########
hudi-common/src/main/java/org/apache/hudi/common/table/read/DeleteContext.java:
##########
@@ -94,15 +95,19 @@ private static Option<Pair<String, String>> getCustomDeleteMarkerKeyValue(Proper
    * @param schema table schema to check
    * @return whether built-in delete field is included in the table schema
    */
-  private static boolean hasBuiltInDeleteField(Schema schema) {
-    return schema.getType() != Schema.Type.NULL && schema.getField(HOODIE_IS_DELETED_FIELD) != null;
+  private static boolean hasBuiltInDeleteField(HoodieSchema schema) {
+    return schema.getType() != HoodieSchemaType.NULL && schema.getField(HOODIE_IS_DELETED_FIELD).isPresent();
   }
 
   /**
    * Returns position of hoodie operation meta field in the schema
    */
-  private static int getHoodieOperationPos(Schema schema) {
-    return Option.ofNullable(schema.getField(HoodieRecord.OPERATION_METADATA_FIELD)).map(Schema.Field::pos).orElse(-1);
+  private static int getHoodieOperationPos(HoodieSchema schema) {
+    return Option.ofNullable(schema.getField(HoodieRecord.OPERATION_METADATA_FIELD))

Review Comment:
   We don't need the initial `Option.ofNullable` here since `getField` now returns an Option instead of a nullable value
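
   Roughly like this (a sketch; the `pos` accessor on `HoodieSchemaField` is assumed here, mirroring the `Schema.Field::pos` reference on the removed line):

   ```java
   private static int getHoodieOperationPos(HoodieSchema schema) {
     // getField already returns Option, so no Option.ofNullable wrapper is needed
     return schema.getField(HoodieRecord.OPERATION_METADATA_FIELD)
         .map(HoodieSchemaField::pos) // assumed accessor, analogous to Schema.Field::pos
         .orElse(-1);
   }
   ```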



##########
hudi-common/src/main/java/org/apache/hudi/common/table/log/block/HoodieDataBlock.java:
##########
@@ -362,24 +363,23 @@ protected abstract <T> ClosableIterator<HoodieRecord<T>> deserializeRecords(
 
   public abstract HoodieLogBlockType getBlockType();
 
-  protected Option<Schema.Field> getKeyField(Schema schema) {
-    return Option.ofNullable(schema.getField(keyFieldName));
-  }
-
   protected Option<String> getRecordKey(HoodieRecord record) {
-    return Option.ofNullable(record.getRecordKey(readerSchema, keyFieldName));
+    return Option.ofNullable(record.getRecordKey(readerSchema.toAvroSchema(), keyFieldName));
   }
 
-  protected Schema getSchemaFromHeader() {
+  protected HoodieSchema getSchemaFromHeader() {
     String schemaStr = getLogBlockHeader().get(HeaderMetadataType.SCHEMA);
     SCHEMA_MAP.computeIfAbsent(schemaStr,
         (schemaString) -> {
           try {
-            return new Schema.Parser().parse(schemaStr);
-          } catch (AvroTypeException e) {
+            return HoodieSchema.parse(schemaStr);
+          } catch (HoodieAvroSchemaException e) {
             // Archived commits from earlier hudi versions fail the schema check
-            // So we retry in this one specific instance.
-            return new Schema.Parser().setValidateDefaults(false).parse(schemaStr);
+            // So we retry in this one specific instance with validation disabled

Review Comment:
   Is there harm in retrying on any HoodieAvroSchemaException?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
