umehrot2 commented on a change in pull request #1427: [HUDI-727]: Copy default
values of fields if not present when rewriting incoming record with new schema
URL: https://github.com/apache/incubator-hudi/pull/1427#discussion_r396877351
##########
File path:
hudi-common/src/test/java/org/apache/hudi/common/util/TestHoodieAvroUtils.java
##########
@@ -57,4 +60,16 @@ public void testPropsPresent() {
}
Assert.assertTrue("column pii_col doesn't show up", piiPresent);
}
+
+ @Test
+ public void testDefaultValue() {
+ GenericRecord rec = new GenericData.Record(new
Schema.Parser().parse(EXAMPLE_SCHEMA));
+ rec.put("_row_key", "key1");
+ rec.put("non_pii_col", "val1");
+ rec.put("pii_col", "val2");
+ rec.put("timestamp", 3.5);
Review comment:
So the issue seems to be that in the original record created this way,
the default values show up as `null`. Even though you have specified `default:
dummy_val`, the field is still `null` in the original record.
Do you know why that is the case? When we have specified the default value,
why doesn't Avro put it in the record when the field is missing?
I tried using the builder, but that expects default values to be specified
for each and every field, or else it throws an exception:
```
GenericRecord rec = new GenericRecordBuilder(new
Schema.Parser().parse(EXAMPLE_SCHEMA)).build();
```
Have you looked further into why Avro behaves this way?
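For reference, here is a minimal sketch comparing the two construction paths discussed above. It assumes a hypothetical two-field schema (`SKETCH_SCHEMA`, standing in for `EXAMPLE_SCHEMA`, whose contents are not shown here) and hypothetical helper names. As far as I understand the Avro API, `new GenericData.Record(schema)` only allocates empty slots and never consults schema defaults, while `GenericRecordBuilder.build()` fills unset fields from their defaults and fails for an unset field that has none. Please verify this against the Avro version used in the project.

```java
import org.apache.avro.AvroRuntimeException;
import org.apache.avro.Schema;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericRecord;
import org.apache.avro.generic.GenericRecordBuilder;

public class AvroDefaultsSketch {

  // Hypothetical stand-in for EXAMPLE_SCHEMA: one field without a default,
  // one field with a default value.
  static final String SKETCH_SCHEMA =
      "{\"type\":\"record\",\"name\":\"sketch\",\"fields\":["
      + "{\"name\":\"_row_key\",\"type\":\"string\"},"
      + "{\"name\":\"pii_col\",\"type\":\"string\",\"default\":\"dummy_val\"}]}";

  static Schema schema() {
    return new Schema.Parser().parse(SKETCH_SCHEMA);
  }

  // Plain constructor: the unset field stays null, the default is not applied.
  static Object viaConstructor() {
    GenericRecord rec = new GenericData.Record(schema());
    rec.put("_row_key", "key1");
    return rec.get("pii_col");
  }

  // Builder: build() copies the schema default into the unset field.
  static String viaBuilder() {
    GenericRecord rec = new GenericRecordBuilder(schema())
        .set("_row_key", "key1")
        .build();
    // String defaults may come back as Utf8, so compare via toString().
    return rec.get("pii_col").toString();
  }

  // Builder with nothing set: _row_key has no default, so build() should throw.
  static boolean builderRejectsMissingField() {
    try {
      new GenericRecordBuilder(schema()).build();
      return false;
    } catch (AvroRuntimeException e) {
      return true;
    }
  }

  public static void main(String[] args) {
    System.out.println("constructor: " + viaConstructor());
    System.out.println("builder: " + viaBuilder());
    System.out.println("builder rejects missing field: " + builderRejectsMissingField());
  }
}
```

If this matches the behavior in the test, it would explain why the record built with `new GenericData.Record(...)` carries `null` for the defaulted field: defaults are applied by the builder (and by schema resolution when reading), not by the record constructor.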