rymurr commented on a change in pull request #3257:
URL: https://github.com/apache/iceberg/pull/3257#discussion_r751066816



##########
File path: build.gradle
##########
@@ -551,12 +551,16 @@ project(':iceberg-nessie') {
     implementation project(':iceberg-core')
     implementation project(path: ':iceberg-bundled-guava', configuration: 
'shadow')
     implementation "org.projectnessie:nessie-client"
+    implementation "com.fasterxml.jackson.core:jackson-databind"

Review comment:
       is this a new dependency or just being explicit about an existing one? 
If new does this change the size of the runtime jars?

##########
File path: nessie/src/main/java/org/apache/iceberg/nessie/NessieCatalog.java
##########
@@ -322,12 +324,12 @@ public Configuration getConf() {
     return config;
   }
 
-  TreeApi getTreeApi() {
-    return client.getTreeApi();
+  public NessieApiV1 getApi() {

Review comment:
       why is this public? I don't think we should be exposing this

##########
File path: core/src/main/java/org/apache/iceberg/TableMetadata.java
##########
@@ -717,6 +717,19 @@ public TableMetadata removeSnapshotLogEntries(Set<Long> 
snapshotIds) {
         snapshots, newSnapshotLog, addPreviousFile(file, lastUpdatedMillis));
   }
 
+  public TableMetadata withUpdatedSnapshotAndSchema(long snapshotId, int 
schemaId, int sortOrderId, int specId) {

Review comment:
       I think these would be more useful as separate methods. There is no way 
to only update `snapshotId` for example and all 4 args have to be given valid 
values or it throws. Seems overly restrictive and not very composable.

##########
File path: nessie/src/main/java/org/apache/iceberg/nessie/NessieCatalog.java
##########
@@ -170,23 +174,23 @@ public boolean dropTable(TableIdentifier identifier, 
boolean purge) {
       return false;
     }
 
-    Operations contents = ImmutableOperations.builder()
-        
.addOperations(ImmutableDelete.builder().key(NessieUtil.toKey(identifier)).build())
-        .commitMeta(NessieUtil.buildCommitMetadata(String.format("iceberg 
delete table '%s'", identifier),
+    CommitMultipleOperationsBuilder commitBuilderBase = 
api.commitMultipleOperations()

Review comment:
       given that we ignore the purge operation (as it would interfere with 
branch aware gc) we should log out that it was set to true but we are ignoring 
it.

##########
File path: 
core/src/main/java/org/apache/iceberg/BaseMetastoreTableOperations.java
##########
@@ -184,13 +184,17 @@ protected void refreshFromMetadataLocation(String 
newLocation, Predicate<Excepti
             "Table UUID does not match: current=%s != refreshed=%s", 
currentMetadata.uuid(), newUUID);
       }
 
-      this.currentMetadata = newMetadata.get();
+      this.currentMetadata = validateRefreshedMetadata(newMetadata.get());
       this.currentMetadataLocation = newLocation;
       this.version = parseVersion(newLocation);
     }
     this.shouldRefresh = false;
   }
 
+  protected TableMetadata validateRefreshedMetadata(TableMetadata metadata) {

Review comment:
       agreed. I would go further and say this additional protected method 
feels fragile and not a great use of class hierarchies. I would prefer 
`NessieTableOperations` just reimplements the above 
`refreshFromMetadataLocation` completely rather than adding this shim

##########
File path: nessie/src/main/java/org/apache/iceberg/nessie/NessieCatalog.java
##########
@@ -338,21 +340,32 @@ String currentRefName() {
     return reference.getName();
   }
 
+  FileIO fileIO() {

Review comment:
       what is the point of this addition?

##########
File path: 
nessie/src/main/java/org/apache/iceberg/nessie/NessieTableOperations.java
##########
@@ -127,14 +155,32 @@ protected void doCommit(TableMetadata base, TableMetadata 
metadata) {
       delete = false;
       throw new CommitStateUnknownException(ex);
     } catch (NessieNotFoundException ex) {
-      throw new RuntimeException(String.format("Commit failed: Reference %s no 
longer exist", reference.getName()), ex);
+      throw new RuntimeException(
+          String.format("Commit failed: Reference %s no longer exist", 
reference.getName()), ex);
     } finally {
       if (delete) {
         io().deleteFile(newMetadataLocation);
       }
     }
   }
 
+  private ImmutableCommitMeta.Builder buildCommitMessage(TableMetadata base, 
TableMetadata metadata) {
+    Builder commitMeta = ImmutableCommitMeta.builder();
+    String commitMessage;
+    Snapshot snapshot = metadata.currentSnapshot();
+    if (snapshot != null && (base == null || base.currentSnapshot() == null ||
+        snapshot.snapshotId() != base.currentSnapshot().snapshotId())) {
+      commitMessage = String.format("Iceberg %s against %s", 
snapshot.operation(), tableName());
+      commitMeta.putProperties("iceberg.operation", snapshot.operation());
+    } else if (base != null && metadata.currentSchemaId() != 
base.currentSchemaId()) {
+      commitMessage = String.format("Iceberg schema change against %s", 
tableName());
+    } else {
+      commitMessage = String.format("Iceberg commit against %s", tableName());
+    }
+
+    return commitMeta.message(commitMessage);

Review comment:
       I also find this rather confusing to read, some rather complicated 
booleans floating around. This seems like complexity in the name of removing 
code duplication




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to