Blazer-007 commented on code in PR #4145:
URL: https://github.com/apache/gobblin/pull/4145#discussion_r2432607092


##########
gobblin-data-management/src/main/java/org/apache/gobblin/data/management/copy/iceberg/IcebergTable.java:
##########
@@ -315,4 +317,29 @@ protected void overwritePartition(List<DataFile> dataFiles, String partitionColN
     log.info("~{}~ SnapshotId after overwrite: {}", tableId, accessTableMetadata().currentSnapshot().snapshotId());
   }
 
+  /**
+   * update table's schema to the provided {@link Schema}
+   * @param updatedSchema the updated schema to be set on the table.
+   * @throws TableNotFoundException if the table does not exist.
+   */
+  public void updateSchema(Schema updatedSchema) throws TableNotFoundException {
+    TableMetadata currentTableMetadata = accessTableMetadata();
+    Schema currentSchema = currentTableMetadata.schema();
+
+    if (currentSchema.sameSchema(updatedSchema)) {
+      log.info("~{}~ schema is already up-to-date", tableId);
+      return;
+    }
+
+    log.info("~{}~ updating schema from {} to {}", tableId, currentSchema, updatedSchema);
+
+    TableMetadata updatedTableMetadata = currentTableMetadata.updateSchema(updatedSchema, updatedSchema.highestFieldId());
+    Preconditions.checkArgument(updatedTableMetadata.schema().sameSchema(updatedSchema), "Schema mismatch after update, please check destination table");
+
+    tableOps.commit(currentTableMetadata, updatedTableMetadata);
+    tableOps.refresh();
+
+    log.info("~{}~ schema updated successfully", tableId);

Review Comment:
   Also, the schema update and the data-file commit should be done in a single transaction.
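
To illustrate the single-transaction idea, here is a minimal, self-contained sketch. It is a toy model only: `SingleCommitSketch`, `ToyTable`, `ToyTransaction`, and `Metadata` are hypothetical names, not Gobblin or Iceberg classes, and schemas and files are plain strings. The point it tries to show is that both changes are staged against one base metadata and published with one atomic swap, so readers never observe a new schema without the matching data files. Iceberg does provide a `Transaction` API (`Table.newTransaction()` followed by `commitTransaction()`) that may serve this purpose in the real code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical toy model; none of these types exist in Gobblin or Iceberg.
public class SingleCommitSketch {

  /** Immutable table state: a schema plus the data files written with it. */
  static final class Metadata {
    final String schema;
    final List<String> dataFiles;

    Metadata(String schema, List<String> dataFiles) {
      this.schema = schema;
      this.dataFiles = Collections.unmodifiableList(new ArrayList<>(dataFiles));
    }
  }

  /** Table whose current metadata is replaced by a single compare-and-swap. */
  static final class ToyTable {
    private final AtomicReference<Metadata> current;

    ToyTable(String schema) {
      this.current = new AtomicReference<>(new Metadata(schema, new ArrayList<>()));
    }

    Metadata metadata() {
      return current.get();
    }

    ToyTransaction newTransaction() {
      return new ToyTransaction(this, current.get());
    }

    /** Succeeds only if nobody else committed since 'base' was read. */
    boolean commit(Metadata base, Metadata updated) {
      return current.compareAndSet(base, updated);
    }
  }

  /** Stages a schema change and a file overwrite, then publishes both at once. */
  static final class ToyTransaction {
    private final ToyTable table;
    private final Metadata base;
    private String schema;
    private final List<String> dataFiles;

    ToyTransaction(ToyTable table, Metadata base) {
      this.table = table;
      this.base = base;
      this.schema = base.schema;
      this.dataFiles = new ArrayList<>(base.dataFiles);
    }

    ToyTransaction updateSchema(String newSchema) {
      this.schema = newSchema; // staged only, not yet visible to readers
      return this;
    }

    ToyTransaction overwriteFiles(List<String> files) {
      dataFiles.clear(); // staged only, not yet visible to readers
      dataFiles.addAll(files);
      return this;
    }

    /** One atomic swap covers the schema and the files together. */
    boolean commit() {
      return table.commit(base, new Metadata(schema, dataFiles));
    }
  }

  public static void main(String[] args) {
    ToyTable table = new ToyTable("id:int");
    boolean committed = table.newTransaction()
        .updateSchema("id:int,name:string")
        .overwriteFiles(Arrays.asList("part-0.parquet"))
        .commit();
    System.out.println("committed=" + committed
        + " schema=" + table.metadata().schema
        + " files=" + table.metadata().dataFiles);
  }
}
```

With two separate commits, as in the diff above, a failure between them could leave the destination table on the new schema without the corresponding data. Staging both changes in one commit avoids that intermediate state; if the commit loses a race, nothing is published and the caller can retry from fresh metadata.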



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
