IIUC, the link between backwards compatibility and default values for fields in the schema is that in Avro you define the new field as an optional field (a union of the type and null). Not sure it has anything to do with default values.
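For reference, here is a minimal Avro sketch of the two shapes in question (the record and field names are made up purely for illustration): a nullable union field versus a newly added column that carries an explicit default, which is what lets records written with the older schema resolve under the evolved one:

import org.apache.avro.Schema;
import org.apache.avro.SchemaBuilder;

public class SchemaEvolutionSketch {
  public static void main(String[] args) {
    // "rider" is optional: a union of null/string with a null default.
    // "city" is a newly added column with an explicit default, so records
    // written with the previous schema still resolve under this evolved schema.
    Schema evolved = SchemaBuilder.record("TripRecord").fields()
        .requiredString("uuid")
        .optionalString("rider")
        .name("city").type().stringType().stringDefault("unknown")
        .endRecord();
    System.out.println(evolved.toString(true));
  }
}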
Nonetheless, we should copy over the default values if that code is not doing so. https://github.com/apache/incubator-hudi/blob/078d4825d909b2c469398f31c97d2290687321a8/hudi-common/src/main/java/org/apache/hudi/common/util/HoodieAvroUtils.java#L124 seems like it is being copied over?

On Mon, Feb 24, 2020 at 4:21 AM Pratyaksh Sharma <[email protected]> wrote:

> Hi,
>
> Currently we recommend users to evolve the schema in a backwards compatible
> way. When evolving the schema in a backwards compatible way, one of the most
> significant things to do is to define default values for newly added columns,
> so that records published with the previous schema can also be consumed
> properly.
>
> However, just before actually writing a record to a Hudi dataset, we try to
> rewrite the record with a new Avro schema which has the Hudi metadata
> columns [1]. In this function, we only try to get the values from the record,
> without considering the field's default value. As a result, schema validation
> fails. In essence, this feels like we are not even respecting backwards
> compatible schema changes.
> IMO, this piece of code should take the default value into account as well,
> in case the field's actual value is null.
>
> Open to hearing others' thoughts.
>
> [1]
> https://github.com/apache/incubator-hudi/blob/078d4825d909b2c469398f31c97d2290687321a8/hudi-common/src/main/java/org/apache/hudi/common/util/HoodieAvroUtils.java#L205
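For illustration, a rough sketch of what the suggested fallback (using the field's declared default when the actual value is null) could look like during the rewrite. This is not the actual HoodieAvroUtils code; it only assumes Avro's Field.defaultVal() and GenericData.getDefaultValue() are available:

import org.apache.avro.Schema;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericRecord;

public class RewriteWithDefaultsSketch {
  // Copy each field of the incoming record into the new (metadata-augmented)
  // schema, falling back to the field's declared default when the record has
  // no value for it.
  public static GenericRecord rewrite(GenericRecord record, Schema newSchema) {
    GenericRecord rewritten = new GenericData.Record(newSchema);
    for (Schema.Field field : newSchema.getFields()) {
      Schema.Field oldField = record.getSchema().getField(field.name());
      Object value = (oldField != null) ? record.get(field.name()) : null;
      if (value == null && field.defaultVal() != null) {
        // Use the default declared in the schema instead of writing null.
        value = GenericData.get().getDefaultValue(field);
      }
      rewritten.put(field.name(), value);
    }
    return rewritten;
  }
}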
