rockwotj commented on code in PR #13936:
URL: https://github.com/apache/iceberg/pull/13936#discussion_r2324942614

##########
format/spec.md:
##########
@@ -1861,6 +1861,16 @@ Java writes `-1` for "no current snapshot" with V1 and V2 tables and considers t
 Some implementations require that GZIP compressed files have the suffix `.gz.metadata.json` to be read correctly. The Java reference implementation can additionally read GZIP compressed files with the suffix `metadata.json.gz`.
 
+### Schema evolution and writing with old schemas
+
+Writers must write out all fields with the types specified from a schema present in table metadata. Writers should use the latest schema for writing. Not writing out all columns or not using the latest schema can change the semantics of the data written. The following are possible inconsistencies that can be introduced:
+
+* For all null columns, not writing out the column would cause `initial-default` value would be applied on reading instead of `null`.
+* If `write-default` has been changed then using an out-of-date schema would result in the incorrect value being populated.
+* If a `write` is the result of a partial row update (e.g. `update table set col_y = 'xyz'`) an out-of-date schema would silently drop values.

Review Comment:
   Can you clarify this? When could this happen? Is this if the column is dropped?

-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org