[ 
https://issues.apache.org/jira/browse/GOBBLIN-1802?focusedWorklogId=853593&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-853593
 ]

ASF GitHub Bot logged work on GOBBLIN-1802:
-------------------------------------------

                Author: ASF GitHub Bot
            Created on: 29/Mar/23 07:33
            Start Date: 29/Mar/23 07:33
    Worklog Time Spent: 10m 
      Work Description: phet commented on code in PR #3663:
URL: https://github.com/apache/gobblin/pull/3663#discussion_r1151498974


##########
gobblin-data-management/src/main/java/org/apache/gobblin/data/management/copy/iceberg/IcebergDatasetFinder.java:
##########
@@ -98,20 +107,42 @@ public Iterator<IcebergDataset> getDatasetsIterator() throws IOException {
     return findDatasets().iterator();
   }
 
-  protected IcebergDataset createIcebergDataset(String dbName, String tblName, IcebergCatalog icebergCatalog, Properties properties, FileSystem fs) {
-    IcebergTable icebergTable = icebergCatalog.openTable(dbName, tblName);
-    return new IcebergDataset(dbName, tblName, icebergTable, properties, fs);
+  /**
+   * Requires both source and destination catalogs to connect to their respective {@link IcebergTable}
+   * Note: the destination side {@link IcebergTable} should be present before initiating replication
+   * @return {@link IcebergDataset} with its corresponding source and destination {@link IcebergTable}
+   */
+  protected IcebergDataset createIcebergDataset(String dbName, String tblName, IcebergCatalog sourceIcebergCatalog, IcebergCatalog destinationIcebergCatalog, Properties properties, FileSystem fs) throws IOException {
+    IcebergTable srcIcebergTable = sourceIcebergCatalog.openTable(dbName, tblName);
+    IcebergTable destIcebergTable = destinationIcebergCatalog.openTable(dbName, tblName);
+    Preconditions.checkArgument(destinationIcebergCatalog.tableAlreadyExists(TableIdentifier.of(dbName, tblName)), "Missing Destination Iceberg Table!");

Review Comment:
   Be sure to name the table in the exception message, to highlight a potentially incorrect config.
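
   A minimal sketch of what naming the table in the message could look like. The class and method names (`DestinationTableCheck`, `missingTableMessage`, `requireDestinationTable`) are hypothetical and only illustrate the idea; the sketch uses a plain `IllegalArgumentException` so that it stays self-contained.

```java
// Hypothetical helper, not part of the PR: shows an error message that names the table.
final class DestinationTableCheck {

  // Builds a message that includes db and table, so a mistyped config value is visible in the log.
  static String missingTableMessage(String dbName, String tblName) {
    return String.format("Missing destination Iceberg table: %s.%s", dbName, tblName);
  }

  // Throws IllegalArgumentException (the same type Guava's checkArgument throws) when the table is absent.
  static void requireDestinationTable(boolean exists, String dbName, String tblName) {
    if (!exists) {
      throw new IllegalArgumentException(missingTableMessage(dbName, tblName));
    }
  }
}
```

   With Guava, the same effect should be available through the templated overload, e.g. `Preconditions.checkArgument(exists, "Missing destination Iceberg table: %s.%s", dbName, tblName)`.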



##########
gobblin-data-management/src/main/java/org/apache/gobblin/data/management/copy/iceberg/BaseIcebergCatalog.java:
##########
@@ -17,7 +17,9 @@
 
 package org.apache.gobblin.data.management.copy.iceberg;
 
+import java.io.IOException;

Review Comment:
   likely to be a lint error
   
   (FYI in other files too)



##########
gobblin-data-management/src/main/java/org/apache/gobblin/data/management/copy/iceberg/IcebergHiveCatalog.java:
##########
@@ -54,4 +56,9 @@ public String getCatalogUri() {
   protected TableOperations createTableOperations(TableIdentifier tableId) {
     return hc.newTableOps(tableId);
   }
+
+  @Override
+  public boolean tableAlreadyExists(TableIdentifier tableId) {

Review Comment:
   I see why it would be difficult to make `tableAlreadyExists()` a method on `IcebergTable` (as that would require the table to store a pointer back to its catalog), but why not better encapsulate it, to avoid introducing `TableIdentifier` into the API?  E.g., two forms:
   * `IcebergCatalog.tableAlreadyExists(String dbName, String tableName)` - that parallels `openTable`'s signature
   * `IcebergCatalog.tableAlreadyExists(IcebergTable table)` - for this, use your new getter on `IcebergTable.tableId`
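
   The two suggested forms could be layered over a single catalog-specific check, roughly as sketched here. Every type name in the sketch (`TableIdSketch`, `TableSketch`, `CatalogSketch`, `InMemoryCatalogSketch`) is a hypothetical stand-in rather than a Gobblin or Iceberg class, and `getTableId()` is an assumed name for the new getter.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in for Iceberg's TableIdentifier (not the real class).
final class TableIdSketch {
  final String dbName;
  final String tblName;

  TableIdSketch(String dbName, String tblName) {
    this.dbName = dbName;
    this.tblName = tblName;
  }
}

// Hypothetical stand-in for IcebergTable, reduced to the table-id getter.
final class TableSketch {
  private final TableIdSketch tableId;

  TableSketch(TableIdSketch tableId) {
    this.tableId = tableId;
  }

  TableIdSketch getTableId() {
    return tableId;
  }
}

abstract class CatalogSketch {
  // Catalog-specific check; protected, so the identifier type stays out of the public API.
  protected abstract boolean tableExists(TableIdSketch tableId);

  // Form 1: parallels the (dbName, tblName) signature of openTable.
  public boolean tableAlreadyExists(String dbName, String tblName) {
    return tableExists(new TableIdSketch(dbName, tblName));
  }

  // Form 2: takes the table itself and reads its id through the getter.
  public boolean tableAlreadyExists(TableSketch table) {
    return tableExists(table.getTableId());
  }
}

// Toy catalog backed by a set of "db.table" keys, only to exercise the two public forms.
final class InMemoryCatalogSketch extends CatalogSketch {
  private final Set<String> known = new HashSet<>();

  void register(String dbName, String tblName) {
    known.add(dbName + "." + tblName);
  }

  @Override
  protected boolean tableExists(TableIdSketch tableId) {
    return known.contains(tableId.dbName + "." + tableId.tblName);
  }
}
```

   In this arrangement callers only ever pass names or a table object, while each concrete catalog implements the one protected hook.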





Issue Time Tracking
-------------------

    Worklog Id:     (was: 853593)
    Time Spent: 2.5h  (was: 2h 20m)

> Register iceberg table metadata update with destination side catalog
> --------------------------------------------------------------------
>
>                 Key: GOBBLIN-1802
>                 URL: https://issues.apache.org/jira/browse/GOBBLIN-1802
>             Project: Apache Gobblin
>          Issue Type: New Feature
>            Reporter: Meeth Gala
>            Priority: Major
>          Time Spent: 2.5h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)
