rdblue commented on a change in pull request #4301:
URL: https://github.com/apache/iceberg/pull/4301#discussion_r824998794
##########
File path: format/spec.md
##########
@@ -193,10 +193,38 @@ Notes:
For details on how to serialize a schema to JSON, see Appendix C.
+#### Default value
+Default values can be assigned to top-level columns or nested fields. Default
+values are used during schema evolution when adding a new column. The default
+value is used to read rows belonging to the files that lack the column or
+nested field prior to the schema evolution.
Review comment:
Code snippet (3) is an alternative to code snippet (2) that doesn't
expose `setInitialColumnDefault` and demonstrates the same behavior as code
snippet (1). I'm not sure what you mean by saying that this "has the side
effect of not distinguishing between organic 34 and 34 from schema evolution". This is how
you would change the initial default for a column: by dropping and re-creating
the column with a different default. That works because the use case is that
you've made a mistake and need to correct it _before writing any data_. The
point here is that we don't necessarily need to add `setInitialColumnDefault`.
The behavior of `dropColumn` is the standard behavior: the column is no
longer present in the table. Rewrites are not required to keep column values
for dropped columns.
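
To make the drop-and-re-create argument concrete, here is a minimal sketch in Python. It is not the Iceberg API: `ToyTable`, `Column`, `add_column`, `drop_column`, `append_file`, and `scan` are hypothetical names for a small in-memory model, and the values are made up. The model assumes that columns are tracked by a unique field ID that is never reused, and that a file lacking a field ID is read with that column's initial default.

```python
from dataclasses import dataclass
from itertools import count
from typing import Any, Dict, List


@dataclass
class Column:
    field_id: int  # unique ID; never reused after a drop (model assumption)
    name: str
    initial_default: Any = None


class ToyTable:
    """Hypothetical in-memory model for illustration; not the Iceberg API."""

    def __init__(self) -> None:
        self._next_id = count(1)
        self.columns: Dict[str, Column] = {}
        # Each file is a list of rows; each row maps field ID -> stored value.
        self.files: List[List[Dict[int, Any]]] = []

    def add_column(self, name: str, initial_default: Any = None) -> Column:
        column = Column(next(self._next_id), name, initial_default)
        self.columns[name] = column
        return column

    def drop_column(self, name: str) -> None:
        # Only the schema changes; existing files are not rewritten.
        del self.columns[name]

    def append_file(self, rows: List[Dict[str, Any]]) -> None:
        # Values are stored under the field IDs of the current schema.
        self.files.append(
            [{self.columns[n].field_id: v for n, v in row.items()} for row in rows]
        )

    def scan(self) -> List[Dict[str, Any]]:
        # A row whose file lacks a field ID is read with the initial default.
        return [
            {c.name: row.get(c.field_id, c.initial_default)
             for c in self.columns.values()}
            for data_file in self.files
            for row in data_file
        ]


table = ToyTable()
table.add_column("id")
table.append_file([{"id": 1}, {"id": 2}])

# A column is added with a mistaken initial default.
table.add_column("score", initial_default=34)
before_fix = table.scan()

# Correct the mistake before writing any data: drop and re-create the column.
table.drop_column("score")
table.add_column("score", initial_default=35)
after_fix = table.scan()

# If data had been written first, the drop also discards the written values.
table.append_file([{"id": 3, "score": 34}])
table.drop_column("score")
table.add_column("score", initial_default=36)
after_late_drop = table.scan()
```

In this model no separate setter for the initial default is needed: re-creating the column yields a new field ID, so every existing file lacks it and is read with the new default. The last step shows the ordinary drop behavior, where a value written before the drop is no longer visible because its field ID is no longer in the schema.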
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]