rdblue commented on a change in pull request #631: Add mechanism to expire old
metadata versions
URL: https://github.com/apache/incubator-iceberg/pull/631#discussion_r346013219
##########
File path: core/src/main/java/org/apache/iceberg/TableMetadataParser.java
##########
@@ -266,8 +278,20 @@ static TableMetadata fromJson(TableOperations ops, InputFile file, JsonNode node
       }
     }
+    SortedSet<MetadataLogEntry> metadataEntries =
+        Sets.newTreeSet(Comparator.comparingLong(MetadataLogEntry::timestampMillis));
+    if (node.has(METADATA_LOG)) {
+      Iterator<JsonNode> logIterator = node.get(METADATA_LOG).elements();
+      while (logIterator.hasNext()) {
+        JsonNode entryNode = logIterator.next();
+        metadataEntries.add(new MetadataLogEntry(
+            JsonUtil.getLong(TIMESTAMP_MS, entryNode), JsonUtil.getString(METADATA_FILE, entryNode)));
+      }
+    }
+
     return new TableMetadata(ops, file, uuid, location,
         lastUpdatedMillis, lastAssignedColumnId, schema, defaultSpecId, specs, properties,
-        currentVersionId, snapshots, ImmutableList.copyOf(entries.iterator()));
+        currentVersionId, snapshots, ImmutableList.copyOf(entries.iterator()),
+        ImmutableList.copyOf(metadataEntries.iterator()), null);
Review comment:
I don't think that `TableMetadata` should change when it is serialized and
deserialized. That's what happens if `TableMetadata` is used to track the old
metadata locations that were removed.
I'd prefer to change this so that only the previous metadata entries are
tracked on `TableMetadata`, and then the `commit` method should delete the
entries that are in `baseMetadata.previousMetadataFiles()` but no longer in
`newMetadata.previousMetadataFiles()`, roughly
`Sets.newHashSet(baseMetadata.previousMetadataFiles()).removeAll(newMetadata.previousMetadataFiles())`.
Does that make sense?
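
To make the suggestion concrete, here is a minimal sketch of the set difference that `commit` would compute. It is only an illustration under assumptions: `MetadataCleanupSketch` and `removedMetadataFiles` are hypothetical names, metadata file locations are modeled as plain strings rather than `MetadataLogEntry` objects, and the actual file deletion is left out. Note that `removeAll` mutates the set and returns a boolean, so the copy has to be held in a variable before the call.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of the PR: computes which previous metadata
// files were dropped between the base metadata and the new metadata.
final class MetadataCleanupSketch {
  private MetadataCleanupSketch() {
  }

  // Returns the locations present in basePrevious but absent from newPrevious.
  static Set<String> removedMetadataFiles(List<String> basePrevious, List<String> newPrevious) {
    // Copy first so the caller's list is not modified.
    Set<String> removed = new HashSet<>(basePrevious);
    // removeAll returns a boolean; the difference is what remains in the set.
    removed.removeAll(newPrevious);
    return removed;
  }

  public static void main(String[] args) {
    // Illustrative file names only (assumed, not taken from the PR).
    List<String> base = Arrays.asList("v1.metadata.json", "v2.metadata.json", "v3.metadata.json");
    List<String> updated = Arrays.asList("v2.metadata.json", "v3.metadata.json", "v4.metadata.json");

    // Each location left in the set is a candidate for deletion after a successful commit.
    for (String location : removedMetadataFiles(base, updated)) {
      System.out.println("would delete: " + location);
    }
  }
}
```

With this shape, `TableMetadata` only carries the retained entries, so it round-trips through serialization unchanged, and the list of files to delete is derived at commit time from the base and new metadata.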