manishmalhotrawork commented on a change in pull request #524: respect
commit.manifest.min.count
URL: https://github.com/apache/incubator-iceberg/pull/524#discussion_r341456261
##########
File path: core/src/main/java/org/apache/iceberg/MergingSnapshotProducer.java
##########
@@ -595,6 +596,9 @@ private Evaluator extractInclusiveDeleteExpression(ManifestReader reader) {
         if (bin.contains(cachedNewManifest) && bin.size() < minManifestsCountToMerge) {
           // not enough to merge, add all manifest files to the output list
           outputManifests.addAll(bin);
+        } else if ((!Collections.disjoint(bin, appendManifests)) && bin.size() < minManifestsCountToMerge) {
Review comment:
Thanks @rdblue for the explanation, and sorry for the delayed reply!
> The check above, `bin.contains(cachedNewManifest)`, is intended to catch the
> last bin. Only the last bin is left unmerged, so that it can accumulate more
> manifests and isn't merged every time. But the bin before the last can be
> merged if it is full.
So if `bin.contains(cachedNewManifest)` is true, then this is the last bin,
because the most recently added manifests/files have to be in the last bin?
To handle the appended-manifest case, we could maintain one more variable,
`cachedNewAppenedManifest`, initialized from the new `ManifestFile`
supplied to `appendManifests(ManifestFile manifestFile)`.
The condition could then be:
```
else if (bin.contains(cachedNewAppenedManifest) &&
    bin.size() < minManifestsCountToMerge) {
  // not enough to merge, add all manifest files to the output list
  outputManifests.addAll(bin);
}
```
This means that if `cachedNewAppenedManifest` (the most recently appended
manifest file) is present in a bin, then that bin must be the last one.
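
To make the intent concrete, here is a minimal standalone sketch of the combined
check. It is only an illustration, not the actual `MergingSnapshotProducer` code:
the class `LastBinCheck` and the helper `keepUnmerged` are hypothetical names, and
plain strings stand in for `ManifestFile` instances.

```java
import java.util.Arrays;
import java.util.List;

public class LastBinCheck {
  // Hypothetical helper (not part of Iceberg): returns true when a bin should be
  // copied to the output as-is instead of being merged.
  // Assumption: the bin holding the latest new or latest appended manifest is the last bin.
  public static <T> boolean keepUnmerged(
      List<T> bin, T cachedNewManifest, T cachedNewAppenedManifest, int minManifestsCountToMerge) {
    boolean holdsLatest =
        bin.contains(cachedNewManifest) || bin.contains(cachedNewAppenedManifest);
    // only a small last bin is left unmerged so it can accumulate more manifests
    return holdsLatest && bin.size() < minManifestsCountToMerge;
  }

  public static void main(String[] args) {
    // Strings stand in for ManifestFile objects in this sketch.
    List<String> lastBin = Arrays.asList("m3", "appended");
    List<String> earlierBin = Arrays.asList("m1", "m2");

    // last bin, below the threshold: left unmerged
    System.out.println(keepUnmerged(lastBin, "new", "appended", 4));
    // earlier bin, does not hold the latest manifest: eligible for merging
    System.out.println(keepUnmerged(earlierBin, "new", "appended", 4));
    // last bin, but already at the threshold: merged
    System.out.println(keepUnmerged(lastBin, "new", "appended", 2));
  }
}
```

Unlike `!Collections.disjoint(bin, appendManifests)`, which matches any bin holding
any appended manifest, this check only matches the bin that holds the single most
recently appended one.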
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
With regards,
Apache Git Services