pratyakshsharma commented on a change in pull request #2424:
URL: https://github.com/apache/hudi/pull/2424#discussion_r556745181
##########
File path: hudi-common/src/main/java/org/apache/hudi/avro/HoodieAvroUtils.java
##########
@@ -292,53 +292,53 @@ public static GenericRecord stitchRecords(GenericRecord left, GenericRecord righ
 return result;
 }
- /**
- * Given a avro record with a given schema, rewrites it into the new schema while setting fields only from the old
- * schema.
- */
- public static GenericRecord rewriteRecord(GenericRecord record, Schema newSchema) {
- return rewrite(record, getCombinedFieldsToWrite(record.getSchema(), newSchema), newSchema);
- }
-
 /**
 * Given a avro record with a given schema, rewrites it into the new schema while setting fields only from the new
 * schema.
+ * NOTE: Here, the assumption is that you cannot go from an evolved schema (schema with (N) fields)
+ * to an older schema (schema with (N-1) fields). All fields present in the older record schema MUST be present in the
+ * new schema and the default/existing values are carried over.
+ * This particular method does the following things :
+ * a) Create a new empty GenericRecord with the new schema.
+ * b) For GenericRecord, copy over the data from the old schema to the new schema or set default values for all fields of this
+ * transformed schema
+ * c) For SpecificRecord, hoodie_metadata_fields have a special treatment. This is done because for code generated
+ * AVRO classes (HoodieMetadataRecord), the avro record is a SpecificBaseRecord type instead of a GenericRecord.
+ * SpecificBaseRecord throws null pointer exception for record.get(name) if name is not present in the schema of the
+ * record (which happens when converting a SpecificBaseRecord without hoodie_metadata_fields to a new record with it).
+ * In this case, we do NOT set the defaults for the hoodie_metadata_fields explicitly, instead, the new record assumes
+ * the default defined in the AvroSchema itself.
Review comment:
Nit: "AvroSchema" -> "avro schema".
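
For readers following the Javadoc, steps a) and b) can be sketched roughly as below. This is only an illustration under simplifying assumptions: it uses plain Java maps instead of the Avro API, and `RewriteSketch`, `rewrite`, and the map-based "schema" (field name mapped to its default value) are hypothetical names, not Hudi code. The SpecificRecord handling from step c) is not modeled.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in, not Hudi or Avro code: a "schema" is an ordered map of
// field name -> default value, and a "record" is a map of field name -> value.
final class RewriteSketch {

  static Map<String, Object> rewrite(Map<String, Object> oldRecord,
                                     Map<String, Object> newSchemaDefaults) {
    // Assumption stated in the Javadoc: every field of the old record
    // must also be present in the new schema.
    for (String field : oldRecord.keySet()) {
      if (!newSchemaDefaults.containsKey(field)) {
        throw new IllegalArgumentException("Field missing from new schema: " + field);
      }
    }
    // a) Start from an empty record shaped by the new schema.
    Map<String, Object> newRecord = new LinkedHashMap<>();
    // b) Carry over existing values; otherwise fall back to the schema default.
    for (Map.Entry<String, Object> entry : newSchemaDefaults.entrySet()) {
      String name = entry.getKey();
      Object value = oldRecord.containsKey(name) ? oldRecord.get(name) : entry.getValue();
      newRecord.put(name, value);
    }
    return newRecord;
  }
}
```

In this sketch, a record written with fields `id` and `name` and rewritten against a schema that adds `city` keeps its existing values and picks up the default for `city`, while a record carrying a field the new schema lacks is rejected.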