wmoustafa commented on a change in pull request #4301:
URL: https://github.com/apache/iceberg/pull/4301#discussion_r824020847
##########
File path: format/spec.md
##########
@@ -193,10 +193,38 @@ Notes:
For details on how to serialize a schema to JSON, see Appendix C.
+#### Default value
+Default values can be assigned to top-level columns or nested fields. Default
values are used during schema evolution when adding a new column. The default
value is used to read rows belonging to the files that lack the column or
nested field prior to the schema evolution.
Review comment:
This situation arises when engines want to support `INSERT INTO` with a
subset of columns. Note that the columns must already exist. The `schema
evolution` default value for a column omitted from `INSERT INTO` may or may
not exist: it exists if the column was added after the table was created
(i.e., schema evolution), and it does not exist if the column has been there
since the table was first created (i.e., no schema evolution).
If the `schema evolution` default value exists: I think it is fair to
_reuse the value_ to fill in the missing column values in `INSERT INTO`.
If the `schema evolution` default value does not exist: I think it is fair
to _reuse the current place in the metadata_ to define this default value
(upon table creation this time, and not upon schema evolution). This will never
conflict with the `schema evolution` default value of the same column because
the column _already exists_.
Going by the above, I think it is fine to use the same concept for the
default value to serve both use cases.
For the current spec, we may state that `INSERT INTO` with a subset of
columns is not supported anyway, and hence the default value cannot be
leveraged on the write path. It can be a future extension, though.
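To make the reasoning above concrete, here is a minimal sketch, assuming a
single default slot per field that serves both use cases. `Field`,
`read_value`, and `fill_insert_row` are hypothetical names for illustration
only, not Iceberg APIs, and reading a missing column without a default as
null is also an assumption. The write-path helper models the possible future
extension, not behavior of the current spec.

```python
from dataclasses import dataclass
from typing import Any, Dict, List

# Sentinel so that "no default defined" differs from "default is None".
_MISSING = object()


@dataclass
class Field:
    """Hypothetical field model with one default slot shared by both paths."""
    name: str
    default: Any = _MISSING


def _default_or_null(field: Field) -> Any:
    # Assumption: a column without a default falls back to null (None).
    return None if field.default is _MISSING else field.default


def read_value(field: Field, file_row: Dict[str, Any]) -> Any:
    """Read path: files written before the column existed lack the key."""
    if field.name in file_row:
        return file_row[field.name]
    return _default_or_null(field)


def fill_insert_row(fields: List[Field], partial_row: Dict[str, Any]) -> Dict[str, Any]:
    """Write path (future extension): fill columns omitted by INSERT INTO."""
    return {f.name: partial_row.get(f.name, _default_or_null(f)) for f in fields}
```

With fields `id` (no default) and `country` (default `"US"`), an old file row
that lacks `country` and an `INSERT INTO` that omits `country` would both
resolve to the same value, which is the point of reusing one place in the
metadata.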