Hi,

Currently we recommend that users evolve their schema in a backwards-compatible
way. One of the most important parts of doing so is defining a default value
for newly added columns, so that records published with a previous schema can
still be consumed properly.

However, just before actually writing a record to a Hudi dataset, we rewrite
the record with a new Avro schema that includes the Hudi metadata columns [1].
In that function, we only fetch values from the record, without considering the
field's default value. As a result, schema validation fails for records
written with the older schema. In essence, this means we are not actually
respecting backwards-compatible schema changes.
IMO, this piece of code should fall back to the field's default value whenever
the field's actual value is null.
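To make the proposal concrete, here is a minimal sketch of the fallback logic.
This is not the actual Hudi/Avro API; the Field class and rewrite method are
hypothetical stand-ins that only illustrate the idea of consulting a declared
default when the incoming record has no value for a field.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class RewriteWithDefaults {

    // Minimal stand-in for an Avro field: a name plus an optional default.
    static final class Field {
        final String name;
        final Object defaultValue; // null means "no default declared"

        Field(String name, Object defaultValue) {
            this.name = name;
            this.defaultValue = defaultValue;
        }
    }

    // Rewrite oldRecord into the new schema's shape, falling back to each
    // field's default when the old record carries no value for it.
    static Map<String, Object> rewrite(Map<String, Object> oldRecord,
                                       List<Field> newSchemaFields) {
        Map<String, Object> rewritten = new HashMap<>();
        for (Field f : newSchemaFields) {
            Object value = oldRecord.get(f.name);
            if (value == null && f.defaultValue != null) {
                // The proposed change: respect the declared default instead
                // of leaving the value null and failing schema validation.
                value = f.defaultValue;
            }
            rewritten.put(f.name, value);
        }
        return rewritten;
    }

    public static void main(String[] args) {
        // Record written with the old schema: it has no "new_col" field.
        Map<String, Object> oldRecord = new HashMap<>();
        oldRecord.put("id", 1);

        // New schema adds "new_col" with a default, per the recommendation.
        List<Field> newSchema = List.of(
                new Field("id", null),
                new Field("new_col", "unknown"));

        Map<String, Object> result = rewrite(oldRecord, newSchema);
        System.out.println(result.get("new_col")); // prints "unknown"
    }
}
```

With this fallback in place, a record published under the previous schema
would pass validation against the evolved schema instead of failing on the
missing column.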

Open to hearing others' thoughts.

[1]
https://github.com/apache/incubator-hudi/blob/078d4825d909b2c469398f31c97d2290687321a8/hudi-common/src/main/java/org/apache/hudi/common/util/HoodieAvroUtils.java#L205
