rdblue commented on a change in pull request #4301:
URL: https://github.com/apache/iceberg/pull/4301#discussion_r840964022
##########
File path: format/spec.md
##########
@@ -193,6 +193,17 @@ Notes:
For details on how to serialize a schema to JSON, see Appendix C.
+#### Default value
+
+A default value can be assigned to a column when the column is added to an
Iceberg table as part of schema evolution. Defaults are tracked at the level of
a nested field inside a struct, so they can be used for both top-level columns
and nested columns. Iceberg tracks two default values internally:
`initial-default` and `write-default`. The `initial-default` is used to read
rows belonging to files that lack the column (i.e. the files were written
before the column was added); the `write-default` value is used to
automatically populate the column if a user later inserts new rows without
specifying the column.
Review comment:
@wmoustafa, I think that these should _not_ be symmetric because they
have very different behaviors. The initial default can only be set when the
column is created, while the write default can change at any time. Also, the
initial default is explicitly not a read-time default. Although it is applied
at read time, I think that names like `file-to-record` or `read-time` imply
that the default can be "used" by not writing columns into data files, which
is explicitly disallowed.
I'd keep the `initial-default` and `write-default` names for clarity.
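
To illustrate the asymmetry, here is a minimal Python sketch of the two defaults as described above. It is not the Iceberg API: the `Field` class and the `read_row` and `write_row` helpers are hypothetical names made up for this example, and rows are modeled as plain dicts.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class Field:
    """Hypothetical model of a column; not an Iceberg class."""
    name: str
    initial_default: Any = None  # assumed fixed once the column is added
    write_default: Any = None    # may be changed at any later time

    def set_write_default(self, value: Any) -> None:
        # Only the write default is meant to be updated after creation.
        self.write_default = value


def read_row(field: Field, file_columns: set, row: dict) -> Any:
    """Value a reader returns for `field` from one row of a data file."""
    if field.name not in file_columns:
        # The file was written before the column existed.
        return field.initial_default
    return row[field.name]


def write_row(field: Field, row: dict) -> dict:
    """Row as stored in a new data file; the column is always materialized."""
    out = dict(row)
    if field.name not in out:
        out[field.name] = field.write_default
    return out


col = Field("region", initial_default="unknown", write_default="unknown")

# A file written before the column was added only has an `id` column.
old_file_value = read_row(col, {"id"}, {"id": 1})

# Changing the write default affects new rows only.
col.set_write_default("eu")
new_row = write_row(col, {"id": 2})
new_file_value = read_row(col, {"id", "region"}, new_row)
old_file_value_after = read_row(col, {"id"}, {"id": 1})
```

In this sketch, `write_row` always stores a value for the column, so a writer never relies on the default by omitting the column from a new data file, and changing the write default does not alter what is read from older files.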