RussellSpitzer commented on code in PR #12228:
URL: https://github.com/apache/iceberg/pull/12228#discussion_r2078339580
##########
core/src/main/java/org/apache/iceberg/BaseMetastoreCatalog.java:
##########
@@ -71,23 +70,35 @@ public Table loadTable(TableIdentifier identifier) {
   }
   @Override
-  public Table registerTable(TableIdentifier identifier, String metadataFileLocation) {
+  public Table registerTable(
+      TableIdentifier identifier, String metadataFileLocation, boolean overwrite) {
     Preconditions.checkArgument(
         identifier != null && isValidIdentifier(identifier), "Invalid identifier: %s", identifier);
     Preconditions.checkArgument(
         metadataFileLocation != null && !metadataFileLocation.isEmpty(),
         "Cannot register an empty metadata file location as a table");
-    // Throw an exception if this table already exists in the catalog.
-    if (tableExists(identifier)) {
+    // If the table already exists and overwriting is disabled, throw an exception.
+    if (tableExists(identifier) && !overwrite) {
       throw new AlreadyExistsException("Table already exists: %s", identifier);
     }
     TableOperations ops = newTableOps(identifier);
-    InputFile metadataFile = ops.io().newInputFile(metadataFileLocation);
-    TableMetadata metadata = TableMetadataParser.read(ops.io(), metadataFile);
-    ops.commit(null, metadata);
-
+    TableMetadata newMetadata =
+        TableMetadataParser.read(ops.io(), ops.io().newInputFile(metadataFileLocation));
+
+    TableMetadata existing = ops.current();
+    if (existing != null && overwrite) {
+      if (existing.metadataFileLocation().equals(metadataFileLocation)) {
+        LOG.info(
+            "The requested metadata matches the existing metadata. No changes will be committed.");
+        return new BaseTable(ops, fullTableName(name(), identifier), metricsReporter());
+      }
+      dropTable(identifier, false /* Keep all data and metadata files */);
Review Comment:
It may make sense to move this into an explicit table operation API. We
essentially want something like
ops.setMetadata(newMetadata)
which skips validations and swaps the metadata transactionally. That may be
cleaner than the drop/create we are currently doing. It is essentially what any
REST catalog could do, and it would fix @stevenzwu's atomicity issues by letting
each catalog implementation decide whether to make the swap atomic.
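
A rough sketch of the idea, to make the shape of the proposal concrete. Every name here (`SetMetadataSketch`, `MetadataSwapOps`, `AtomicSwapOps`, `DropCreateOps`, and `setMetadata` itself) is hypothetical and not an existing Iceberg API, and metadata is reduced to a plain location string so the example stays self-contained. It only illustrates how `registerTable` could delegate the swap and leave atomicity to each implementation.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical sketch: none of these types exist in Iceberg.
final class SetMetadataSketch {

  /** Hypothetical extension point; each catalog decides how the swap happens. */
  interface MetadataSwapOps {
    /** Returns the current metadata location, or null if no table is registered. */
    String currentLocation();

    /** Points the table at new metadata without the usual commit validations. */
    void setMetadata(String newLocation);
  }

  /** A catalog that can replace the pointer in a single atomic step. */
  static final class AtomicSwapOps implements MetadataSwapOps {
    private final AtomicReference<String> pointer = new AtomicReference<>();

    @Override
    public String currentLocation() {
      return pointer.get();
    }

    @Override
    public void setMetadata(String newLocation) {
      pointer.set(newLocation); // readers see either the old or the new location
    }
  }

  /** A catalog without an atomic swap can still fall back to drop then create. */
  static final class DropCreateOps implements MetadataSwapOps {
    private String pointer;

    @Override
    public String currentLocation() {
      return pointer;
    }

    @Override
    public void setMetadata(String newLocation) {
      pointer = null; // drop: the table briefly does not exist
      pointer = newLocation; // create: register the new metadata
    }
  }

  /** Mirrors the control flow in the diff, with the drop/create replaced by setMetadata. */
  static String registerTable(MetadataSwapOps ops, String metadataFileLocation, boolean overwrite) {
    String existing = ops.currentLocation();
    if (existing != null && !overwrite) {
      throw new IllegalStateException("Table already exists");
    }
    if (metadataFileLocation.equals(existing)) {
      return existing; // requested metadata matches; nothing to commit
    }
    ops.setMetadata(metadataFileLocation);
    return ops.currentLocation();
  }
}
```

With this shape, the caller in `registerTable` no longer needs to know whether the catalog swaps atomically; that choice lives behind the hypothetical `setMetadata` call.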