RussellSpitzer commented on code in PR #16263:
URL: https://github.com/apache/iceberg/pull/16263#discussion_r3222213526


##########
core/src/main/java/org/apache/iceberg/ManifestReader.java:
##########
@@ -417,14 +417,9 @@ public ManifestEntry<F> apply(ManifestEntry<F> entry) {
         }
       };
     } else {
-      // data file's first_row_id is null when the manifest's first_row_id is null
-      return entry -> {
-        if (entry.file() instanceof BaseFile) {
-          ((BaseFile<?>) entry.file()).setFirstRowId(null);
-        }
-
-        return entry;
-      };
+      // Preserve the source entry’s first row ID even if the manifest hasn’t assigned one since it
+      // may be EXISTING
+      return Function.identity();

Review Comment:
   +1 on two assigners.
   
   I agree we should preserve the defensive mode for the committed-manifest 
path, but I don't think it should be a silent null assignment. If the invariant 
is "a committed v3 manifest with firstRowId != null shouldn't contain an entry 
with a non-null first_row_id that wasn't already there," we should encode that 
as a precondition / checkState and fail loudly. Silent re-assignment masks 
exactly the class of bug we're fixing here. If there's a real case where 
re-assigning is necessary, that seems like it should be opt-in behind a config 
flag rather than the default.
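
To make the suggestion concrete, here is a rough sketch of what "two assigners, with a loud check on the committed path" could look like. It is not the actual `ManifestReader` code: `Entry`, `FirstRowIdChecks`, `preserving`, `forCommittedManifest`, and the `allowReassign` flag are all hypothetical stand-ins, and it uses a plain `IllegalStateException` instead of a `Preconditions.checkState` helper so that it compiles on its own. It also assumes one reading of the invariant, namely that on the committed path, with no manifest-level first row ID, an entry carrying its own first row ID is unexpected.

```java
import java.util.function.Function;

final class FirstRowIdChecks {
  private FirstRowIdChecks() {}

  // Hypothetical, simplified stand-in for a manifest entry and its data file.
  static final class Entry {
    private final String path;
    private final Long firstRowId;

    Entry(String path, Long firstRowId) {
      this.path = path;
      this.firstRowId = firstRowId;
    }

    String path() {
      return path;
    }

    Long firstRowId() {
      return firstRowId;
    }
  }

  // Path where the source entry's first row ID must survive (for example an
  // EXISTING entry that is being rewritten): pass the entry through untouched.
  static Function<Entry, Entry> preserving() {
    return Function.identity();
  }

  // Committed-manifest path. By default an unexpected first row ID is an error;
  // clearing it is only done when the caller explicitly opts in.
  static Function<Entry, Entry> forCommittedManifest(boolean allowReassign) {
    return entry -> {
      if (entry.firstRowId() == null) {
        return entry;
      }

      if (!allowReassign) {
        // Fail loudly instead of silently nulling the value.
        throw new IllegalStateException(
            String.format(
                "Unexpected first_row_id %d for %s: manifest has no first_row_id",
                entry.firstRowId(), entry.path()));
      }

      // Opt-in legacy behavior: drop the entry's first row ID.
      return new Entry(entry.path(), null);
    };
  }
}
```

With this shape, a caller that really needs the old behavior has to ask for it via `forCommittedManifest(true)`, while the default turns a violated invariant into an immediate failure rather than a dropped row ID.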



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

