umehrot2 commented on code in PR #6155:
URL: https://github.com/apache/hudi/pull/6155#discussion_r927965296
##########
hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/utils/HoodieRealtimeRecordReaderUtils.java:
##########
@@ -189,7 +190,13 @@ public static Writable avroToArrayWritable(Object value,
Schema schema) {
Writable[] recordValues = new Writable[schema.getFields().size()];
int recordValueIndex = 0;
for (Schema.Field field : schema.getFields()) {
- recordValues[recordValueIndex++] =
avroToArrayWritable(record.get(field.name()), field.schema());
+ Object fieldValue = null;
+ try {
+ fieldValue = record.get(field.name());
+ } catch (AvroRuntimeException e) {
+ LOG.debug("Field:" + field.name() + "not found in Schema:" +
schema.toString());
Review Comment:
@yihua As currently implemented, this function is supposed to return a record
with the complete schema. We cannot fail if a field is not found, because
tolerating missing fields is required for both bootstrap and schema evolution
scenarios. With bootstrap, the metadata fields may not be present in the data
file and need to be filled with nulls. Similarly, with schema evolution we can
hit a scenario like this, and historically we have returned nulls for the new
columns when the old record does not have them.
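
To illustrate the null-filling behavior described above, here is a minimal
sketch using only plain Java collections. It is not the Hudi or Avro code:
`SimpleRecord`, `projectToFullSchema`, and the column names are hypothetical
stand-ins, and `IllegalArgumentException` stands in for the
`AvroRuntimeException` caught in the diff.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class NullFillSketch {

  // Hypothetical stand-in for a record written with an older or partial schema.
  static final class SimpleRecord {
    private final Map<String, Object> values;

    SimpleRecord(Map<String, Object> values) {
      this.values = values;
    }

    // Throws when the field is not part of this record's schema
    // (assumption: mimics the lookup failure handled in the diff).
    Object get(String fieldName) {
      if (!values.containsKey(fieldName)) {
        throw new IllegalArgumentException("Not a valid schema field: " + fieldName);
      }
      return values.get(fieldName);
    }
  }

  // Returns one slot per field of the full schema, leaving null where the
  // record does not carry the field instead of failing.
  static Object[] projectToFullSchema(SimpleRecord record, List<String> fullSchemaFields) {
    Object[] out = new Object[fullSchemaFields.size()];
    int i = 0;
    for (String name : fullSchemaFields) {
      Object value = null;
      try {
        value = record.get(name);
      } catch (IllegalArgumentException e) {
        // Field missing in this record: keep the null placeholder.
      }
      out[i++] = value;
    }
    return out;
  }

  public static void main(String[] args) {
    Map<String, Object> oldRecord = new HashMap<>();
    oldRecord.put("id", 1);
    oldRecord.put("name", "a");

    // Full schema: a metadata column absent from the data file (bootstrap case)
    // and a column added later (schema evolution case).
    List<String> fullSchema = Arrays.asList("_hoodie_commit_time", "id", "name", "new_col");

    // prints [null, 1, a, null]
    System.out.println(Arrays.toString(
        projectToFullSchema(new SimpleRecord(oldRecord), fullSchema)));
  }
}
```

The output always has the full schema's width, so downstream readers can rely
on field positions regardless of which schema version wrote the record.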