rdblue commented on a change in pull request #4301:
URL: https://github.com/apache/iceberg/pull/4301#discussion_r840964022



##########
File path: format/spec.md
##########
@@ -193,6 +193,17 @@ Notes:
 
 For details on how to serialize a schema to JSON, see Appendix C.
 
+#### Default value
+
+Default value can be assigned to a column when the column is added to an 
Iceberg table as part of the schema evolution. They are tracked at the level of 
a nested field inside a struct, thus it can be used for both top-level columns 
and nested columns. Iceberg tracks two default values internally: 
`initial-default` and `write-default`. The `initial-default` is used to read 
rows belonging to files that lack the column (i.e. the files were written 
before the column is added); the `write-default` value will be used for the 
automatically populating the column if user later inserts new rows without 
specifying the column.

Review comment:
       @wmoustafa, I think that these should _not_ be symmetric because they 
have very different behaviors. The initial default can only be set when the 
column is created. The write default can change at any time. Also, the initial 
default is explicitly not a read-time default. While it is applied at read 
time, I think that names like `file-to-record` or `read-time` imply that the default can 
be "used" by not writing columns into data files, which is explicitly 
disallowed.
   
   I'd keep the `initial-default` and `write-default` names for clarity.
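
To make the asymmetry concrete, here is a minimal sketch of the two defaults. It is only an illustration under assumptions: `Field`, `write_row`, `read_row`, the `region` column, and the values `"unknown"` and `"emea"` are hypothetical names and are not Iceberg's API. It assumes a data file row is a plain dict and that a missing key means the file was written before the column was added.

```python
from dataclasses import dataclass, replace
from typing import Any, Dict, List


@dataclass(frozen=True)
class Field:
    """Hypothetical field model for illustration, not Iceberg's API."""
    name: str
    initial_default: Any = None  # set once, when the column is added
    write_default: Any = None    # may be changed later


def write_row(record: Dict[str, Any], schema: List[Field]) -> Dict[str, Any]:
    # Every column in the current schema is materialized in the data file.
    # A column the writer omitted is filled with write-default.
    return {f.name: record.get(f.name, f.write_default) for f in schema}


def read_row(stored: Dict[str, Any], schema: List[Field]) -> Dict[str, Any]:
    # A column missing from the stored row means the file predates the column,
    # so initial-default stands in for it.
    return {f.name: stored.get(f.name, f.initial_default) for f in schema}


id_field = Field("id")
old_file_row = write_row({"id": 1}, [id_field])  # written before "region" exists

# The column is added with both defaults set.
region = Field("region", initial_default="unknown", write_default="unknown")

# Later, only write-default is changed; initial-default stays as it was.
region_v2 = replace(region, write_default="emea")
schema_v2 = [id_field, region_v2]
new_file_row = write_row({"id": 2}, schema_v2)

old_read = read_row(old_file_row, schema_v2)
new_read = read_row(new_file_row, schema_v2)
```

In this sketch the new file stores `region` explicitly, so the write default is never "used" by leaving the column out of the file. Only rows from files older than the column fall back to the initial default.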




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
