danielcweeks commented on code in PR #14264:
URL: https://github.com/apache/iceberg/pull/14264#discussion_r2417576263


##########
core/src/main/java/org/apache/iceberg/BaseIncrementalChangelogScan.java:
##########
@@ -71,6 +80,12 @@ protected CloseableIterable<ChangelogScanTask> doPlanFiles(
             .filter(manifest -> changelogSnapshotIds.contains(manifest.snapshotId()))
             .toSet();
 
+    // Build delete file index for existing deletes (before the start snapshot)
+    DeleteFileIndex existingDeleteIndex = buildExistingDeleteIndex(fromSnapshotIdExclusive);

Review Comment:
   @pvary I'm not sure the case you describe is accurate, because we would only 
be producing changes for S3, not for the snapshots prior to it.  Since this scan 
is incremental, it should report only the changes observed within the range, 
not changes made before it.  Deletes only affect prior data, so the existing 
deletes would have no effect on the results of the scan.
   
   You are correct that equality deletes do not apply to newer data, so even an 
equality delete would apply only to older data that is not part of the scan.
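
   To make the "deletes only affect prior data" point concrete, here is a 
minimal sketch of the sequence-number rule from the table spec.  The class and 
method names are hypothetical, not Iceberg APIs, and the partition/spec 
matching that also gates applicability is left out on purpose.

```java
// Hypothetical illustration only; these helpers are not part of Iceberg.
public class DeleteApplicability {

  // Per the table spec, an equality delete applies to a data file only when the
  // data file's data sequence number is strictly less than the delete's.
  static boolean equalityDeleteApplies(long deleteSeq, long dataSeq) {
    return dataSeq < deleteSeq;
  }

  // A position delete applies when the data file's data sequence number is
  // less than or equal to the delete's (it may target a file from the same commit).
  static boolean positionDeleteApplies(long deleteSeq, long dataSeq) {
    return dataSeq <= deleteSeq;
  }

  public static void main(String[] args) {
    long existingDeleteSeq = 2; // assumed: a delete committed before the scan range
    long newDataSeq = 3;        // assumed: a data file added inside the scan range

    System.out.println(equalityDeleteApplies(existingDeleteSeq, newDataSeq));
    System.out.println(positionDeleteApplies(existingDeleteSeq, newDataSeq));
  }
}
```

   Under these assumptions, neither kind of delete written before the range 
can match a data file added inside it.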



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

