rdblue commented on a change in pull request #1080:
URL: https://github.com/apache/iceberg/pull/1080#discussion_r434004579



##########
File path: core/src/main/java/org/apache/iceberg/BaseRewriteManifests.java
##########
@@ -174,13 +173,13 @@ private ManifestFile copyManifest(ManifestFile manifest) {
 
     validateFilesCounts();
 
-    // TODO: add sequence numbers here
     Iterable<ManifestFile> newManifestsWithMetadata = Iterables.transform(
         Iterables.concat(newManifests, addedManifests, rewrittenAddedManifests),
         manifest -> GenericManifestFile.copyOf(manifest).withSnapshotId(snapshotId()).build());
 
     // put new manifests at the beginning
-    List<ManifestFile> apply = new ArrayList<>();
+    List<ManifestFile> apply = Lists.newArrayList();
+    apply.addAll(base.currentSnapshot().deleteManifests());

Review comment:
We should probably update the comment to cover delete manifest handling. We put
new manifests at the front of the list because they are the most likely to
contain data for a query when writes align with reads (recent hours are read
more often than data that is months old).

I guess we don't really need the delete manifests at the start of the list.
We could put them at the end, since they get split out into a separate list
anyway. What matters is scanning the recent manifests first during job
planning, so that engines like Presto, which run planning and query execution
concurrently, get data faster.
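
A rough sketch of the ordering described above. It uses hypothetical stand-in classes (`ManifestOrderSketch`, `Manifest`, `Content`) rather than Iceberg's real `ManifestFile` API, and it assumes delete manifests are later filtered into their own list, so their position in the combined list does not change the order in which data manifests are scanned:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for illustration only; not Iceberg classes.
class ManifestOrderSketch {
  enum Content { DATA, DELETES }

  static final class Manifest {
    final String path;
    final Content content;

    Manifest(String path, Content content) {
      this.path = path;
      this.content = content;
    }
  }

  // Put new manifests first so planning scans recent data first,
  // then the kept existing manifests, then delete manifests at the end.
  static List<Manifest> apply(
      List<Manifest> newManifests,
      List<Manifest> keptManifests,
      List<Manifest> deleteManifests) {
    List<Manifest> result = new ArrayList<>();
    result.addAll(newManifests);
    result.addAll(keptManifests);
    result.addAll(deleteManifests);
    return result;
  }

  // Split the combined list by content type, preserving relative order.
  static List<String> paths(List<Manifest> all, Content content) {
    List<String> out = new ArrayList<>();
    for (Manifest m : all) {
      if (m.content == content) {
        out.add(m.path);
      }
    }
    return out;
  }
}
```

Because the split preserves relative order, moving the delete manifests to the end leaves the data manifest scan order unchanged: new manifests still come first.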






