rdblue commented on a change in pull request #524: respect commit.manifest.min.count
URL: https://github.com/apache/incubator-iceberg/pull/524#discussion_r334699362

##########
File path: core/src/main/java/org/apache/iceberg/MergingSnapshotProducer.java
##########
@@ -595,6 +596,9 @@ private Evaluator extractInclusiveDeleteExpression(ManifestReader reader) {
       if (bin.contains(cachedNewManifest) && bin.size() < minManifestsCountToMerge) {
         // not enough to merge, add all manifest files to the output list
         outputManifests.addAll(bin);
+      } else if ((!Collections.disjoint(bin, appendManifests)) && bin.size() < minManifestsCountToMerge) {

Review comment:
I don't think the logic is quite right.

The check above, `bin.contains(cachedNewManifest)` is intended to catch the last bin. Only the last bin is left unmerged, so that it can accumulate more manifests and isn't merged every time. But the bin before the last can be merged if it is full.

This logic would prevent any bin with an appended manifest from being merged, even if it isn't the last bin. Maybe we should come up with a better way to detect the last bin, since the `cachedNewManifest` may not be present.
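
One possible reading of "a better way to detect the last bin" is to identify it by position rather than by which manifests it contains. The sketch below is illustrative only and is not code from `MergingSnapshotProducer`: the class `LastBinSketch`, the method `planOutputs`, and the use of plain strings for manifests are all hypothetical. It also assumes that the still-accumulating bin is the final element of the bin list and that every earlier bin is already full; whether that holds depends on how the packer orders its bins.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: names and types are illustrative, not Iceberg's API.
final class LastBinSketch {

  // Returns the planned outputs for the given bins. A merged bin is shown as
  // "merged(a+b)"; manifests that are passed through keep their own names.
  // Assumption: the final element of `bins` is the bin still accumulating.
  static List<String> planOutputs(List<List<String>> bins, int minCountToMerge) {
    List<String> outputs = new ArrayList<>();
    for (int i = 0; i < bins.size(); i++) {
      List<String> bin = bins.get(i);
      boolean isLastBin = i == bins.size() - 1;
      if (bin.size() == 1) {
        // a single manifest has nothing to merge with
        outputs.add(bin.get(0));
      } else if (isLastBin && bin.size() < minCountToMerge) {
        // not enough to merge yet: keep the last bin as-is so it can accumulate
        outputs.addAll(bin);
      } else {
        // earlier bins are merged even if they hold an appended manifest
        outputs.add("merged(" + String.join("+", bin) + ")");
      }
    }
    return outputs;
  }

  public static void main(String[] args) {
    // "new0" and "new1" stand in for manifests appended by the current commit.
    List<List<String>> bins = List.of(List.of("m1", "new0"), List.of("new1", "m5"));
    System.out.println(planOutputs(bins, 3));
  }
}
```

With this shape, the first bin is merged although it contains an appended manifest, while the last bin stays unmerged until it reaches the minimum count, and no check on `cachedNewManifest` is needed.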