rdblue commented on code in PR #15006:
URL: https://github.com/apache/iceberg/pull/15006#discussion_r2719132780
##########
core/src/main/java/org/apache/iceberg/MergingSnapshotProducer.java:
##########
@@ -1073,6 +1088,125 @@ private List<ManifestFile> newDeleteFilesAsManifests() {
return cachedNewDeleteManifests;
}
+ // Merge duplicates, internally takes care of updating newDeleteFilesBySpec
to remove
+ // duplicates and add the newly merged DV
+ private void mergeDVsAndWrite() {
Review Comment:
I think this method should return a `List<DeleteFile>` that represents all
of the DVs after merging has been done. It should probably also use a cache to
ensure that multiple calls produce the same list and reuse as much work as
possible, although this could be done in a follow up and I could be convinced
to cache at a different level (like delete manifests).
Then the caller should take the list produced here, combine it with the v2
`DeleteFile` list, group by spec, and write delete manifests. That would
require no side-effects other than caching.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]