phet commented on code in PR #3643:
URL: https://github.com/apache/gobblin/pull/3643#discussion_r1122365020
##########
gobblin-data-management/src/main/java/org/apache/gobblin/data/management/copy/iceberg/IcebergHiveCatalog.java:
##########
@@ -17,25 +17,44 @@
package org.apache.gobblin.data.management.copy.iceberg;
-import lombok.AllArgsConstructor;
-import lombok.extern.slf4j.Slf4j;
+import java.util.Map;
+import org.apache.hadoop.conf.Configuration;
+import org.apache.iceberg.CatalogProperties;
+import org.apache.iceberg.TableOperations;
import org.apache.iceberg.catalog.TableIdentifier;
import org.apache.iceberg.hive.HiveCatalog;
+import lombok.extern.slf4j.Slf4j;
+
/**
* Hive-Metastore-based {@link IcebergCatalog}.
*/
@Slf4j
-@AllArgsConstructor
-public class IcebergHiveCatalog implements IcebergCatalog {
+
+public class IcebergHiveCatalog extends BaseIcebergCatalog {
+ public static final String HIVE_CATALOG_NAME = "HiveCatalog";
// NOTE: specifically necessitates `HiveCatalog`, as `BaseMetastoreCatalog.newTableOps` is `protected`!
- private final HiveCatalog hc;
+ private HiveCatalog hc;
+
+ public IcebergHiveCatalog() {
+ super(HIVE_CATALOG_NAME, HiveCatalog.class);
+ }
@Override
- public IcebergTable openTable(String dbName, String tableName) {
- TableIdentifier tableId = TableIdentifier.of(dbName, tableName);
- return new IcebergTable(tableId, hc.newTableOps(tableId));
+ public void initialize(Map<String, String> properties, Configuration configuration) {
+ hc = (HiveCatalog) createCompanionCatalog(properties, configuration);
}
+
+ @Override
+ public String getCatalogUri() {
+ return hc.getConf().get(CatalogProperties.URI, "");
Review Comment:
To avoid head-scratching, should we consider a default other than `""`, such
as `"<<not set>>"`? If it's valid to have no URI, then keep as-is, but if a
missing URI is generally a mistake or omission, let's use a placeholder.
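A minimal sketch of the placeholder idea, for illustration only. It uses a plain `Map` as a stand-in for the Hadoop `Configuration` returned by `hc.getConf()`; the class name `CatalogUriDefault` and the constant `URI_NOT_SET` are hypothetical and not part of the PR.

```java
import java.util.Map;

// Hypothetical illustration; not code from the PR.
final class CatalogUriDefault {
  // Assumed placeholder, taken from the suggestion in the review comment.
  static final String URI_NOT_SET = "<<not set>>";
  // Stand-in for Iceberg's CatalogProperties.URI, whose value is "uri".
  static final String URI_KEY = "uri";

  // Returns the configured catalog URI, or a visible placeholder when none is set,
  // so that logs and descriptors never show a silently empty string.
  static String getCatalogUri(Map<String, String> conf) {
    return conf.getOrDefault(URI_KEY, URI_NOT_SET);
  }
}
```

With the real `Configuration`, the equivalent change would presumably be `hc.getConf().get(CatalogProperties.URI, URI_NOT_SET)`, since `Configuration.get(String, String)` already accepts a default value.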
##########
gobblin-data-management/src/main/java/org/apache/gobblin/data/management/copy/iceberg/IcebergDataset.java:
##########
@@ -319,11 +320,11 @@ protected static Optional<URI> getAsOptionalURI(Properties props, String key) {
}
protected DatasetDescriptor getSourceDataset(FileSystem sourceFs) {
- return getDatasetDescriptor(sourceCatalogMetastoreURI, sourceFs);
+ return getDatasetDescriptor(sourceCatalogURI, sourceFs);
Review Comment:
Is this a dataset-level descriptor? It seems it would be the same for all
datasets from the same catalog, rather than distinct for each. Is that really
sufficient?
Perhaps `IcebergTable` deserves a `getDatasetDescriptor(FileSystem)`
method.
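A rough sketch of what a per-table descriptor could look like, under stated assumptions. `Descriptor` here is a hypothetical stand-in for Gobblin's `DatasetDescriptor`, the `"iceberg"` platform string is assumed, and the `FileSystem` argument is omitted to keep the example self-contained; none of this is the PR's actual code.

```java
// Hypothetical illustration; not code from the PR.
final class TableDescriptorSketch {
  // Minimal stand-in for a dataset descriptor: platform, catalog URI, dataset name.
  static final class Descriptor {
    final String platform;
    final String catalogUri;
    final String name;

    Descriptor(String platform, String catalogUri, String name) {
      this.platform = platform;
      this.catalogUri = catalogUri;
      this.name = name;
    }
  }

  // Builds a descriptor that is distinct per table by including the table
  // identity ("db.table") alongside the catalog URI shared by all its tables.
  static Descriptor describe(String catalogUri, String dbName, String tableName) {
    return new Descriptor("iceberg", catalogUri, dbName + "." + tableName);
  }
}
```

Two tables from the same catalog then share `catalogUri` but differ in `name`, which is the distinction the comment above is asking about.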