aokolnychyi commented on a change in pull request #2564:
URL: https://github.com/apache/iceberg/pull/2564#discussion_r629841356
##########
File path: core/src/main/java/org/apache/iceberg/MetadataLocationUtils.java
##########
@@ -58,17 +53,22 @@ private static void miscMetadataFiles(String
metadataFileLocation, Set<String> m
if (metadataFileLocation == null) {
return;
}
- try {
- TableMetadata metadata = TableMetadataParser.read(io,
metadataFileLocation);
- List<TableMetadata.MetadataLogEntry> metadataLogEntries =
metadata.previousFiles();
- List<String> previousMetadataFiles =
-
metadataLogEntries.stream().map(TableMetadata.MetadataLogEntry::file).collect(Collectors.toList());
+ TableMetadata metadata = TableMetadataParser.read(io,
metadataFileLocation);
+ List<TableMetadata.MetadataLogEntry> metadataLogEntries =
metadata.previousFiles();
+ List<String> previousMetadataFiles =
+ metadataLogEntries.stream().map(TableMetadata.MetadataLogEntry::file)
+ .collect(Collectors.toList());
+ if (previousMetadataFiles.size() > 0) {
metaFiles.addAll(previousMetadataFiles);
- if (isRecursive && previousMetadataFiles.size() > 0) {
- miscMetadataFiles(previousMetadataFiles.get(0), metaFiles, io,
isRecursive);
+ if (isRecursive) {
Review comment:
I think we can try to load previous metadata files but give up if we
cannot do that. It should be okay to leave those files too. In case if Iceberg
owns the location, we will drop the entire location after we clean up whatever
we could reach.
Listing may not be an option as the same location may contain files for
multiple tables. In HiveCatalog, we have table UUID as part of the file name
while we don't have that in HadoopCatalog for example.
If it is too complicated, we can drop the recursion too and I say we only
clean up files currently referenced. Shouldn't be too bad.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]