[ 
https://issues.apache.org/jira/browse/GOBBLIN-1786?focusedWorklogId=847066&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-847066
 ]

ASF GitHub Bot logged work on GOBBLIN-1786:
-------------------------------------------

                Author: ASF GitHub Bot
            Created on: 22/Feb/23 22:15
            Start Date: 22/Feb/23 22:15
    Worklog Time Spent: 10m 
      Work Description: meethngala commented on code in PR #3643:
URL: https://github.com/apache/gobblin/pull/3643#discussion_r1115023866


##########
gobblin-data-management/src/main/java/org/apache/gobblin/data/management/copy/iceberg/IcebergCatalog.java:
##########
@@ -17,10 +17,25 @@
 
 package org.apache.gobblin.data.management.copy.iceberg;
 
+import org.apache.iceberg.catalog.Catalog;
+
 
 /**
  * Any catalog from which to access {@link IcebergTable}s.
  */
 public interface IcebergCatalog {
   IcebergTable openTable(String dbName, String tableName);
+  String getCatalogUri();
+
+  /**
+   * Adding a sub interface to help us provide an association between {@link Catalog} and {@link IcebergCatalog}.
+   * This helps us resolve to the Catalog to its concrete implementation class
+   * Primarily needed to access `newTableOps` method which only certain {@link Catalog} derived classes open for public access
+   */
+  interface CatalogSpecifier {

Review Comment:
   @ZihanLi58 @Will-Lo The interface is introduced primarily to bridge the gap
between `IcebergCatalog` (which exists in the Gobblin world as part of distcp)
and the concrete catalog implementation, e.g. `HiveCatalog`, which implements
`org.apache.iceberg.catalog.Catalog`. We need `newTableOps` to interact with
the Iceberg tables in order to perform distcp. But since `newTableOps` is
protected, and only certain `Catalog`-derived classes open it for public
access, we need to resolve `Catalog` to its concrete implementation and also
pair `Catalog` (Iceberg world) with `IcebergCatalog` (Gobblin world) to perform
Iceberg-based distcp.
   
   We could do this through configs as well, but essentially only one config is
needed, i.e. the specifier, and we can resolve everything else from the
specific catalog type. This avoids the hassle of adding extra props to the
configs/job templates and also leaves room to extend the functionality in the
future, when we support other catalog types and some functionality may be
specific to a given catalog type. Let me know if this helps!
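
   To make the idea above concrete, here is a minimal, self-contained sketch
of the specifier pattern. It is not the code in this PR: every `Sketch*` name
is hypothetical, plain strings stand in for Iceberg's table operations, and
the stand-in classes only mimic the "protected in the base class, public in
the concrete catalog" shape described above.

```java
import java.util.Locale;
import java.util.function.Function;

// Hypothetical stand-in for an Iceberg-side base catalog:
// the table-ops factory is protected at this level.
abstract class SketchBaseCatalog {
  protected abstract String newTableOps(String tableId);
}

// Hypothetical stand-in for a concrete catalog that widens the method to public.
class SketchHiveCatalog extends SketchBaseCatalog {
  private final String uri;

  SketchHiveCatalog(String uri) {
    this.uri = uri;
  }

  @Override
  public String newTableOps(String tableId) {
    return "ops[" + uri + "/" + tableId + "]";
  }
}

// Hypothetical stand-in for the Gobblin-side catalog abstraction.
interface SketchGobblinCatalog {
  String openTable(String dbName, String tableName);

  String getCatalogUri();
}

// Gobblin-side wrapper that holds the concrete type, so the public method is reachable.
class SketchGobblinHiveCatalog implements SketchGobblinCatalog {
  private final String uri;
  private final SketchHiveCatalog delegate;

  SketchGobblinHiveCatalog(String uri) {
    this.uri = uri;
    this.delegate = new SketchHiveCatalog(uri);
  }

  @Override
  public String openTable(String dbName, String tableName) {
    return delegate.newTableOps(dbName + "." + tableName);
  }

  @Override
  public String getCatalogUri() {
    return uri;
  }
}

// The single configured value: each constant knows how to build its own pairing.
enum SketchCatalogSpecifier {
  HIVE(SketchGobblinHiveCatalog::new);

  private final Function<String, SketchGobblinCatalog> factory;

  SketchCatalogSpecifier(Function<String, SketchGobblinCatalog> factory) {
    this.factory = factory;
  }

  SketchGobblinCatalog create(String catalogUri) {
    return factory.apply(catalogUri);
  }

  // Resolves the one config value (case-insensitive) to a specifier.
  static SketchCatalogSpecifier fromConfig(String value) {
    return valueOf(value.trim().toUpperCase(Locale.ROOT));
  }
}
```

   With this shape, a job would only need to set one property (say, `hive`)
and call `SketchCatalogSpecifier.fromConfig(value).create(uri)`; supporting
another catalog type would mean adding one more enum constant rather than more
props.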





Issue Time Tracking
-------------------

    Worklog Id:     (was: 847066)
    Time Spent: 3h 20m  (was: 3h 10m)

> Support Other Catalog Types for Iceberg Distcp
> ----------------------------------------------
>
>                 Key: GOBBLIN-1786
>                 URL: https://issues.apache.org/jira/browse/GOBBLIN-1786
>             Project: Apache Gobblin
>          Issue Type: Improvement
>            Reporter: Meeth Gala
>            Priority: Major
>          Time Spent: 3h 20m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)
