[ 
https://issues.apache.org/jira/browse/GOBBLIN-1994?focusedWorklogId=904192&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-904192
 ]

ASF GitHub Bot logged work on GOBBLIN-1994:
-------------------------------------------

                Author: ASF GitHub Bot
            Created on: 08/Feb/24 01:36
            Start Date: 08/Feb/24 01:36
    Worklog Time Spent: 10m 
      Work Description: Will-Lo commented on code in PR #3876:
URL: https://github.com/apache/gobblin/pull/3876#discussion_r1482310566


##########
gobblin-data-management/src/main/java/org/apache/gobblin/data/management/copy/iceberg/IcebergRegisterStep.java:
##########
@@ -65,26 +67,40 @@ public boolean isCompleted() throws IOException {
 
   @Override
   public void execute() throws IOException {
-    IcebergTable destIcebergTable = 
IcebergDatasetFinder.createIcebergCatalog(this.properties, 
IcebergDatasetFinder.CatalogLocation.DESTINATION)
-        .openTable(TableIdentifier.parse(destTableIdStr));
-    // NOTE: solely by-product of probing table's existence: metadata recorded 
just prior to reading from source catalog is what's actually used
-    TableMetadata unusedNowCurrentDestMetadata = null;
+    IcebergTable destIcebergTable = 
createDestinationCatalog().openTable(TableIdentifier.parse(destTableIdStr));
     try {
-      unusedNowCurrentDestMetadata = destIcebergTable.accessTableMetadata(); 
// probe... (first access could throw)
-      log.info("~{}~ (destination) (using) TableMetadata: {} - {} {}= 
(current) TableMetadata: {} - {}",
-          destTableIdStr,
+      TableMetadata currentDestMetadata = 
destIcebergTable.accessTableMetadata(); // probe... (first access could throw)
+      // CRITICAL: verify current dest-side metadata remains the same as 
observed just prior to first loading source catalog table metadata
+      boolean isJustPriorDestMetadataStillCurrent = 
currentDestMetadata.uuid().equals(justPriorDestTableMetadata.uuid())
+          && 
currentDestMetadata.metadataFileLocation().equals(justPriorDestTableMetadata.metadataFileLocation());
+      String determinationMsg = String.format("(just prior) TableMetadata: {} 
- {} {}= (current) TableMetadata: {} - {}",
           justPriorDestTableMetadata.uuid(), 
justPriorDestTableMetadata.metadataFileLocation(),
-          
unusedNowCurrentDestMetadata.uuid().equals(justPriorDestTableMetadata.uuid()) ? 
"=" : "!",
-          unusedNowCurrentDestMetadata.uuid(), 
unusedNowCurrentDestMetadata.metadataFileLocation());
+          isJustPriorDestMetadataStillCurrent ? "=" : "!",
+          currentDestMetadata.uuid(), 
currentDestMetadata.metadataFileLocation());
+      log.info("~{}~ [destination] {}", destTableIdStr, determinationMsg);
+
+      // NOTE: we originally expected the dest-side catalog to enforce this 
match, but the client-side `BaseMetastoreTableOperations.commit`
+      // uses `==`, rather than `.equals` (value-cmp), and that invariably 
leads to:
+      //   org.apache.iceberg.exceptions.CommitFailedException: Cannot commit: 
stale table metadata
+      if (!isJustPriorDestMetadataStillCurrent) {
+        throw new IOException("error: likely concurrent writing to 
destination: " + determinationMsg);

Review Comment:
   What are the scenarios that can lead to this exception? Does it ever make 
sense to have a retryer here or if it's recoverable at all?





Issue Time Tracking
-------------------

    Worklog Id:     (was: 904192)
    Time Spent: 1h 10m  (was: 1h)

> Ensure iceberg-distcp replication consistency
> ---------------------------------------------
>
>                 Key: GOBBLIN-1994
>                 URL: https://issues.apache.org/jira/browse/GOBBLIN-1994
>             Project: Apache Gobblin
>          Issue Type: Bug
>          Components: gobblin-core
>            Reporter: Kip Kohn
>            Assignee: Abhishek Tiwari
>            Priority: Major
>          Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> Ensure iceberg-distcp consistency by using same `TableMetadata` for both WU 
> planning and final commit



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to