Hi,

We're using Avro as the storage format for database records, and schema evolution is a key feature for us. I have a question about deleting fields from a record when the schema changes.

Let's say field X is present in v1 of the schema without a default value, and is deleted in v2. The database can contain a mix of v1 and v2 records, and there can be a mix of v1 and v2 client apps (apps that use v1 or v2 as both their writer and reader schema).

If a v1 app reads a v2 record (written by a v2 app), an exception is thrown: the reader schema contains field X, the record being deserialized does not, and the reader schema defines no default value for X.

Our conclusion is therefore that every field in a schema must define a default value if we want to be able to delete that field from the schema later. To delete a field that has no default value, the only option would be to upgrade all clients to v2 before any of them starts writing with the v2 schema, which is usually impractical in a large distributed system.

My question is: does this make sense? Have I got it right?

Thanks in advance,
--mark
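
P.S. To make the scenario concrete, here is a minimal Python sketch of the field-resolution rule as I understand it. It is a simplified model and does not use the Avro library: schemas are plain dicts, and the names resolve, ResolutionError, V1_NO_DEFAULT, V1_WITH_DEFAULT and V2 are made up for illustration.

```python
# Simplified model of reader/writer record resolution (NOT the Avro library).
# A schema is a dict: field name -> {"default": value}, or {} when the field
# has no default. All names here are hypothetical.


class ResolutionError(Exception):
    """Raised when the reader needs a field it cannot obtain."""


def resolve(record, writer_fields, reader_fields):
    """Project a record written with writer_fields onto reader_fields."""
    result = {}
    for name, spec in reader_fields.items():
        if name in writer_fields:
            # Field exists in both schemas: take the written value.
            result[name] = record[name]
        elif "default" in spec:
            # Field is missing from the writer schema: use the reader default.
            result[name] = spec["default"]
        else:
            # Missing from the writer schema and no default: cannot resolve.
            raise ResolutionError(
                f"reader field {name!r} is not in the writer schema "
                "and has no default"
            )
    # Writer fields that the reader does not know about are simply dropped.
    return result


V1_NO_DEFAULT = {"id": {}, "X": {}}                 # v1 as described above
V1_WITH_DEFAULT = {"id": {}, "X": {"default": ""}}  # v1 with a default for X
V2 = {"id": {}}                                     # v2: field X deleted

v2_record = {"id": 7}             # written by a v2 app
v1_record = {"id": 7, "X": "a"}   # written by a v1 app
```

With these definitions, resolve(v2_record, V2, V1_NO_DEFAULT) raises ResolutionError, which is the failure described above, while resolve(v2_record, V2, V1_WITH_DEFAULT) succeeds and fills X with the default. The other direction, a v2 app reading a v1 record, works in both cases because the extra writer field is ignored.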
