[GitHub] [hudi] umehrot2 commented on a diff in pull request #6155: [HUDI-4435] Fix Avro field not found issue introduced by Avro 1.10.2

GitBox Fri, 22 Jul 2022 13:22:23 -0700


umehrot2 commented on code in PR #6155:
URL: https://github.com/apache/hudi/pull/6155#discussion_r927965296



##########
hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/utils/HoodieRealtimeRecordReaderUtils.java:
##########
@@ -189,7 +190,13 @@ public static Writable avroToArrayWritable(Object value, Schema schema) {
         Writable[] recordValues = new Writable[schema.getFields().size()];
         int recordValueIndex = 0;
         for (Schema.Field field : schema.getFields()) {
-          recordValues[recordValueIndex++] = avroToArrayWritable(record.get(field.name()), field.schema());
+          Object fieldValue = null;
+          try {
+            fieldValue = record.get(field.name());
+          } catch (AvroRuntimeException e) {
+            LOG.debug("Field:" + field.name() + "not found in Schema:" + schema.toString());

Review Comment:
   @yihua As currently implemented, this function is expected to return a 
record with the complete schema. We cannot fail when a field is not found, 
because tolerating missing fields is required for both bootstrap and schema 
evolution scenarios. With bootstrap, the metadata fields may be absent from 
the data file and need to be filled with nulls. Similarly, with schema 
evolution, an old record may lack newly added columns, and historically we 
have returned nulls for those columns.
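
To illustrate the null-filling behavior described above, here is a minimal, 
stdlib-only sketch. It is not the Hudi implementation: `NullFillSketch`, 
`StrictRecord`, and `toRow` are hypothetical names, and `StrictRecord` only 
stands in for a record whose `get(name)` throws when the field is absent 
(the behavior this PR attributes to Avro 1.10.2). `IllegalArgumentException` 
is used here in place of `AvroRuntimeException` so the example has no 
dependencies.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch, not Hudi code: shows one output slot per field of the
// complete schema, with nulls for fields the record does not carry.
class NullFillSketch {

  /** Stand-in for a record whose get(name) throws when the field is absent. */
  static final class StrictRecord {
    private final Map<String, Object> values = new HashMap<>();

    StrictRecord put(String name, Object value) {
      values.put(name, value);
      return this;
    }

    Object get(String name) {
      if (!values.containsKey(name)) {
        // Assumption: mirrors the "field not found" failure discussed in the PR.
        throw new IllegalArgumentException("Not a valid schema field: " + name);
      }
      return values.get(name);
    }
  }

  /** Builds a row covering every field of the full schema, null-filling absent ones. */
  static Object[] toRow(StrictRecord record, List<String> fullSchemaFields) {
    Object[] row = new Object[fullSchemaFields.size()];
    int index = 0;
    for (String field : fullSchemaFields) {
      Object value = null;
      try {
        value = record.get(field);
      } catch (IllegalArgumentException e) {
        // Field is missing from this record (bootstrap metadata column or a
        // column added by schema evolution): leave the slot as null.
      }
      row[index++] = value;
    }
    return row;
  }

  public static void main(String[] args) {
    // An "old" record that predates both the metadata field and new_col.
    StrictRecord oldRecord = new StrictRecord().put("id", 1).put("name", "a");
    List<String> fullSchema =
        Arrays.asList("_hoodie_commit_time", "id", "name", "new_col");
    System.out.println(Arrays.toString(toRow(oldRecord, fullSchema)));
  }
}
```

The point of the sketch is only that the output always has as many slots as 
the full schema has fields, regardless of which fields the record carries.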



