phet commented on code in PR #3663:
URL: https://github.com/apache/gobblin/pull/3663#discussion_r1151498974
##########
gobblin-data-management/src/main/java/org/apache/gobblin/data/management/copy/iceberg/IcebergDatasetFinder.java:
##########
@@ -98,20 +107,42 @@ public Iterator<IcebergDataset> getDatasetsIterator()
throws IOException {
return findDatasets().iterator();
}
- protected IcebergDataset createIcebergDataset(String dbName, String tblName,
IcebergCatalog icebergCatalog, Properties properties, FileSystem fs) {
- IcebergTable icebergTable = icebergCatalog.openTable(dbName, tblName);
- return new IcebergDataset(dbName, tblName, icebergTable, properties, fs);
+ /**
+ * Requires both source and destination catalogs to connect to their
respective {@link IcebergTable}
+ * Note: the destination side {@link IcebergTable} should be present before
initiating replication
+ * @return {@link IcebergDataset} with its corresponding source and
destination {@link IcebergTable}
+ */
+ protected IcebergDataset createIcebergDataset(String dbName, String tblName,
IcebergCatalog sourceIcebergCatalog, IcebergCatalog destinationIcebergCatalog,
Properties properties, FileSystem fs) throws IOException {
+ IcebergTable srcIcebergTable = sourceIcebergCatalog.openTable(dbName,
tblName);
+ IcebergTable destIcebergTable =
destinationIcebergCatalog.openTable(dbName, tblName);
+
Preconditions.checkArgument(destinationIcebergCatalog.tableAlreadyExists(TableIdentifier.of(dbName,
tblName)), "Missing Destination Iceberg Table!");
Review Comment:
be sure to name the table in the exception message (to highlight potentially
incorrect config)
##########
gobblin-data-management/src/main/java/org/apache/gobblin/data/management/copy/iceberg/BaseIcebergCatalog.java:
##########
@@ -17,7 +17,9 @@
package org.apache.gobblin.data.management.copy.iceberg;
+import java.io.IOException;
Review Comment:
likely to be a lint error
(FYI in other files too)
##########
gobblin-data-management/src/main/java/org/apache/gobblin/data/management/copy/iceberg/IcebergHiveCatalog.java:
##########
@@ -54,4 +56,9 @@ public String getCatalogUri() {
protected TableOperations createTableOperations(TableIdentifier tableId) {
return hc.newTableOps(tableId);
}
+
+ @Override
+ public boolean tableAlreadyExists(TableIdentifier tableId) {
Review Comment:
I see why it would be difficult to make `tableAlreadyExists()` a method on
`IcebergTable` (as that would require the table to store a pointer back to its
catalog), but why not better encapsulate it, to avoid introducing
`TableIdentifier` into the API? e..g two forms:
* `IcebergCatalog.tableAlreadyExists(String dbName, String tableName)` -
that parallels `openTable`'s signature
* `IcebergCatalog.tableAlreadyExists(IcebergTable table)` - for this, use
your new getter on `IcebergTable.tableId`
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]