rdblue commented on code in PR #16263:
URL: https://github.com/apache/iceberg/pull/16263#discussion_r3228815902


##########
core/src/main/java/org/apache/iceberg/ManifestReader.java:
##########
@@ -417,14 +417,9 @@ public ManifestEntry<F> apply(ManifestEntry<F> entry) {
         }
       };
     } else {
-      // data file's first_row_id is null when the manifest's first_row_id is null
-      return entry -> {
-        if (entry.file() instanceof BaseFile) {
-          ((BaseFile<?>) entry.file()).setFirstRowId(null);
-        }
-
-        return entry;
-      };
+      // Preserve the source entry’s first row ID even if the manifest hasn’t assigned one since it
+      // may be EXISTING
+      return Function.identity();

Review Comment:
   > I do, but I think I'm currently overruled by the majority. I don't like it when we change objects to "fix" them
   
   I think this is reasonable. I'm okay with a precondition if that's what others think we should do. My argument for the opposite is that when we _can_ read correctly, we usually should. We know that when the manifest has no `first_row_id`, the data file's `first_row_id` should be null. Is it worth failing a read?
   
   It might be. Russell is correct that it signals a problem somewhere. It probably doesn't matter much either way, since this should be rare.
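
To make the trade-off concrete, here is a minimal sketch of the three behaviors under discussion: clearing the value (the removed code), failing with a precondition, and passing the entry through unchanged (the proposed `Function.identity()`). `FileStub`, `FirstRowIdPolicies`, and the method names are hypothetical stand-ins for illustration only; they are not Iceberg's `BaseFile` or `ManifestEntry`, and the exception type and message are assumptions.

```java
import java.util.function.Function;

// Hypothetical illustration; none of these types are part of Iceberg.
final class FirstRowIdPolicies {
  // Stand-in for a data file that may carry a first row ID.
  static final class FileStub {
    Long firstRowId;

    FileStub(Long firstRowId) {
      this.firstRowId = firstRowId;
    }
  }

  // Lenient read: mutate the file so its first row ID is always null.
  static Function<FileStub, FileStub> clearFirstRowId() {
    return file -> {
      file.firstRowId = null;
      return file;
    };
  }

  // Strict read: reject a file that unexpectedly carries a first row ID.
  static Function<FileStub, FileStub> requireNullFirstRowId() {
    return file -> {
      if (file.firstRowId != null) {
        throw new IllegalStateException(
            "Unexpected first row ID without a manifest first row ID: " + file.firstRowId);
      }
      return file;
    };
  }

  // Pass-through read: keep whatever the source entry recorded.
  static Function<FileStub, FileStub> preserveFirstRowId() {
    return Function.identity();
  }

  public static void main(String[] args) {
    System.out.println(clearFirstRowId().apply(new FileStub(5L)).firstRowId);
    System.out.println(preserveFirstRowId().apply(new FileStub(5L)).firstRowId);
    try {
      requireNullFirstRowId().apply(new FileStub(5L));
    } catch (IllegalStateException e) {
      System.out.println("rejected: " + e.getMessage());
    }
  }
}
```

Only the strict variant surfaces the upstream problem to the caller. The lenient variant hides it by mutating the object, and the pass-through variant leaves the decision to whoever consumes the entry.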



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

