[
https://issues.apache.org/jira/browse/GOBBLIN-1786?focusedWorklogId=847066&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-847066
]
ASF GitHub Bot logged work on GOBBLIN-1786:
-------------------------------------------
Author: ASF GitHub Bot
Created on: 22/Feb/23 22:15
Start Date: 22/Feb/23 22:15
Worklog Time Spent: 10m
Work Description: meethngala commented on code in PR #3643:
URL: https://github.com/apache/gobblin/pull/3643#discussion_r1115023866
##########
gobblin-data-management/src/main/java/org/apache/gobblin/data/management/copy/iceberg/IcebergCatalog.java:
##########
@@ -17,10 +17,25 @@
package org.apache.gobblin.data.management.copy.iceberg;
+import org.apache.iceberg.catalog.Catalog;
+
/**
* Any catalog from which to access {@link IcebergTable}s.
*/
public interface IcebergCatalog {
IcebergTable openTable(String dbName, String tableName);
+ String getCatalogUri();
+
+ /**
+ * Adding a sub interface to help us provide an association between {@link Catalog} and {@link IcebergCatalog}.
+ * This helps us resolve to the Catalog to its concrete implementation class
+ * Primarily needed to access `newTableOps` method which only certain {@link Catalog} derived classes open for public access
+ */
+ interface CatalogSpecifier {
Review Comment:
@ZihanLi58 @Will-Lo The interface is introduced primarily to bridge the gap
between `IcebergCatalog` (which exists in the Gobblin world as part of distcp)
and the concrete catalog implementation, e.g. `HiveCatalog`, which implements
`org.apache.iceberg.catalog.Catalog`. We need `newTableOps` to interact with
the Iceberg tables in order to perform distcp. But since `newTableOps` is
protected, and only certain `Catalog`-derived classes open it for public
access, we need to resolve `Catalog` to its concrete implementation and also
pair that `Catalog` (Iceberg world) with its `IcebergCatalog` (Gobblin world)
to perform Iceberg-based distcp.

We could do this through configs as well, but essentially only one config is
needed, i.e. the specifier; everything else can be resolved from the specific
catalog type. This avoids the hassle of adding extra props to the configs/job
templates and leaves room to extend the functionality in the future, when we
support other catalog types and some functionality is specific to a given
catalog type. Let me know if this helps!
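
To make the idea concrete, here is a minimal, self-contained sketch of the
pairing, not the code in this PR. Everything in it is a stand-in: the nested
`Catalog` and `FakeHiveCatalog` types only mimic the Iceberg side, the config
key `iceberg.catalog.specifier` is hypothetical, and the specifier is modeled
as an enum purely to keep the example short (the PR declares it as an
interface).

```java
import java.util.Collections;
import java.util.Locale;
import java.util.Map;

public class CatalogSpecifierSketch {

  /** Stand-in for org.apache.iceberg.catalog.Catalog (the real interface is much larger). */
  public interface Catalog {
    String name();
  }

  /** Stand-in for a concrete catalog that opens newTableOps for public access. */
  public static class FakeHiveCatalog implements Catalog {
    @Override
    public String name() {
      return "hive";
    }

    // The real method returns table operations; a String keeps the sketch dependency-free.
    public String newTableOps(String dbName, String tableName) {
      return "ops:" + dbName + "." + tableName;
    }
  }

  /** Gobblin-side abstraction, reduced to a single method for the sketch. */
  public interface IcebergCatalog {
    String openTable(String dbName, String tableName);
  }

  /** Gobblin-side wrapper that holds the concrete type, so newTableOps is reachable. */
  public static class FakeIcebergHiveCatalog implements IcebergCatalog {
    private final FakeHiveCatalog hiveCatalog;

    public FakeIcebergHiveCatalog(FakeHiveCatalog hiveCatalog) {
      this.hiveCatalog = hiveCatalog;
    }

    @Override
    public String openTable(String dbName, String tableName) {
      return hiveCatalog.newTableOps(dbName, tableName);
    }
  }

  /** Hypothetical specifier: one value selects both the Iceberg and the Gobblin side. */
  public enum CatalogSpecifier {
    HIVE {
      @Override
      public IcebergCatalog create() {
        return new FakeIcebergHiveCatalog(new FakeHiveCatalog());
      }
    };

    public abstract IcebergCatalog create();
  }

  /** Resolves everything from a single (hypothetical) config entry, defaulting to HIVE. */
  public static IcebergCatalog fromConfig(Map<String, String> config) {
    String specifier = config.getOrDefault("iceberg.catalog.specifier", "HIVE");
    return CatalogSpecifier.valueOf(specifier.toUpperCase(Locale.ROOT)).create();
  }

  public static void main(String[] args) {
    IcebergCatalog catalog =
        fromConfig(Collections.singletonMap("iceberg.catalog.specifier", "hive"));
    System.out.println(catalog.openTable("db", "tbl"));
  }
}
```

With this shape, supporting another catalog type would mean adding one more
specifier value with its own pairing, rather than new props in every job
template.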
Issue Time Tracking
-------------------
Worklog Id: (was: 847066)
Time Spent: 3h 20m (was: 3h 10m)
> Support Other Catalog Types for Iceberg Distcp
> ----------------------------------------------
>
> Key: GOBBLIN-1786
> URL: https://issues.apache.org/jira/browse/GOBBLIN-1786
> Project: Apache Gobblin
> Issue Type: Improvement
> Reporter: Meeth Gala
> Priority: Major
> Time Spent: 3h 20m
> Remaining Estimate: 0h
>