Verest commented on code in PR #14052:
URL: https://github.com/apache/iceberg/pull/14052#discussion_r2342260052


##########
api/src/test/java/org/apache/iceberg/TestSchema.java:
##########
@@ -213,7 +213,7 @@ public void testSupportedInitialDefault(int formatVersion) {
   @ParameterizedTest
   @FieldSource("org.apache.iceberg.TestHelpers#ALL_VERSIONS")
   public void testSupportedWriteDefault(int formatVersion) {
-    // only the initial default is a forward-incompatible change
+    // TODO: need to clarify if this needs to be updated to block for <v3
     assertThatCode(() -> Schema.checkCompatibility(WRITE_DEFAULT_SCHEMA, 
formatVersion))
         .doesNotThrowAnyException();
   }

Review Comment:
   I don't understand why validation was not added for throwing if a 
`write-default` was detected within a v1/v2 table's schema.  There is some 
reasoning around forward compatibility listed 
[here](https://github.com/apache/iceberg/pull/11434) for `initial-default`.
   
   I understand the forward-incompatible comment with respect to reads, but 
isn't the `write-default` forward-incompatible with respect to writes?  E.g. a 
v2 engine will not correctly write a query result for a non-provided field that 
should default to an existing `write-default`.
   
   Related from the spec: "The write-default is a forward-compatible change 
because it is only used at write time. Old writers will fail because the field 
is missing."  Do old writers not just default the value to null?  E.g. `INSERT 
INTO <table> (id) VALUES (1)` will put `null` for any non-id column, which in 
this case would be incorrect if a `write-default` was set for those columns.
   - Edit: this is not the behavior for spark, it throws for missing columns.
   
   Am I correct in that `write-default` validation should be added?  If not, 
does `Schema.checkCompatibility` only consider reads, am I missing context, etc?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to