meethngala commented on code in PR #3663:
URL: https://github.com/apache/gobblin/pull/3663#discussion_r1154756226
##########
gobblin-data-management/src/main/java/org/apache/gobblin/data/management/copy/iceberg/IcebergDatasetFinder.java:
##########
@@ -98,20 +104,43 @@ public Iterator<IcebergDataset> getDatasetsIterator()
throws IOException {
return findDatasets().iterator();
}
- protected IcebergDataset createIcebergDataset(String dbName, String tblName, IcebergCatalog icebergCatalog, Properties properties, FileSystem fs) {
- IcebergTable icebergTable = icebergCatalog.openTable(dbName, tblName);
- return new IcebergDataset(dbName, tblName, icebergTable, properties, fs);
+ /**
+ * Requires both source and destination catalogs to connect to their respective {@link IcebergTable}
+ * Note: the destination side {@link IcebergTable} should be present before initiating replication
Review Comment:
It was a conscious decision, based on the idea that we only register table
metadata updates for tables that already exist. The motivation was sinks that
serve purely as secondary reads for disaster recovery purposes. I have added a
TODO in case we want to rethink the approach and also allow creating tables
when they are absent.
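
To make the policy concrete, here is a rough, self-contained sketch of "update only if the destination table already exists". Everything in it is hypothetical and for illustration only: `DestinationTablePolicy`, `InMemoryCatalog`, `registerUpdate`, and the metadata file names are made up for this example and are not the Gobblin `IcebergCatalog` / `IcebergTable` API or the code in this PR.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical illustration; none of these names come from Gobblin or Iceberg.
class DestinationTablePolicy {

  /** Toy stand-in for a destination catalog: maps "db.table" to a metadata location. */
  static class InMemoryCatalog {
    private final Map<String, String> metadataByTable = new HashMap<>();

    void setMetadata(String dbName, String tblName, String metadataLocation) {
      metadataByTable.put(dbName + "." + tblName, metadataLocation);
    }

    boolean tableExists(String dbName, String tblName) {
      return metadataByTable.containsKey(dbName + "." + tblName);
    }

    Optional<String> currentMetadata(String dbName, String tblName) {
      return Optional.ofNullable(metadataByTable.get(dbName + "." + tblName));
    }
  }

  /**
   * Registers a metadata update only when the destination table is already present.
   * An absent table is rejected rather than created.
   */
  static void registerUpdate(InMemoryCatalog dest, String dbName, String tblName,
      String newMetadataLocation) {
    if (!dest.tableExists(dbName, tblName)) {
      // TODO: revisit if we decide to create absent destination tables instead.
      throw new IllegalStateException(
          "Destination table " + dbName + "." + tblName + " must exist before replication");
    }
    dest.setMetadata(dbName, tblName, newMetadataLocation);
  }

  public static void main(String[] args) {
    InMemoryCatalog dest = new InMemoryCatalog();
    dest.setMetadata("db", "present", "v1.metadata.json");

    // Existing table: the update is registered.
    registerUpdate(dest, "db", "present", "v2.metadata.json");
    System.out.println(dest.currentMetadata("db", "present"));

    // Absent table: the update is rejected and nothing is created.
    try {
      registerUpdate(dest, "db", "absent", "v1.metadata.json");
    } catch (IllegalStateException e) {
      System.out.println(e.getMessage());
    }
  }
}
```

Failing fast keeps a disaster-recovery sink from silently acquiring tables nobody provisioned there; relaxing that would mean replacing the exception branch with a create-then-register path.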