[
https://issues.apache.org/jira/browse/GOBBLIN-1709?focusedWorklogId=810963&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-810963
]
ASF GitHub Bot logged work on GOBBLIN-1709:
-------------------------------------------
Author: ASF GitHub Bot
Created on: 21/Sep/22 23:32
Start Date: 21/Sep/22 23:32
Worklog Time Spent: 10m
Work Description: phet commented on code in PR #3560:
URL: https://github.com/apache/gobblin/pull/3560#discussion_r977062551
##########
gobblin-data-management/src/main/java/org/apache/gobblin/data/management/copy/iceberg/IcebergDatasetFinder.java:
##########
@@ -40,18 +39,20 @@
* and creates a {@link IcebergDataset} for each one.
*/
@Slf4j
-@AllArgsConstructor
public class IcebergDatasetFinder implements IterableDatasetFinder<IcebergDataset> {
public static final String ICEBERG_DATASET_PREFIX = DatasetConstants.PLATFORM_ICEBERG + ".dataset";
public static final String ICEBERG_HIVE_CATALOG_METASTORE_URI_KEY = ICEBERG_DATASET_PREFIX + ".hive.metastore.uri";
public static final String ICEBERG_DB_NAME = ICEBERG_DATASET_PREFIX + ".database.name";
public static final String ICEBERG_TABLE_NAME = ICEBERG_DATASET_PREFIX + ".table.name";
- private String dbName;
- private String tblName;
private final Properties properties;
- protected final FileSystem fs;
+ protected final FileSystem sourceFs;
+
+ public IcebergDatasetFinder(FileSystem fs, Properties properties) {
Review Comment:
note: equivalent to `@lombok.RequiredArgsConstructor`
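For context on the note above: Lombok's `@RequiredArgsConstructor` generates a constructor with one parameter per uninitialized `final` field, in field declaration order. The plain-Java sketch below writes such a constructor by hand so it runs without Lombok. `FinderSketch` and `FakeFileSystem` are hypothetical stand-ins, not Gobblin or Hadoop classes, and the accessors exist only for illustration. With the fields declared as in the diff (`properties` first, then `sourceFs`), the generated parameter order would presumably be `(properties, sourceFs)`, so call sites using `(fs, properties)` would need to be checked before swapping in the annotation.

```java
import java.util.Properties;

public class FinderSketch {
  /** Hypothetical stand-in for org.apache.hadoop.fs.FileSystem. */
  public static class FakeFileSystem {
    private final String scheme;

    public FakeFileSystem(String scheme) {
      this.scheme = scheme;
    }

    public String getScheme() {
      return scheme;
    }
  }

  // Same declaration order as the fields in the diff.
  private final Properties properties;
  protected final FakeFileSystem sourceFs;

  // Hand-written approximation of what @RequiredArgsConstructor would generate:
  // one parameter per uninitialized final field, in declaration order.
  public FinderSketch(Properties properties, FakeFileSystem sourceFs) {
    this.properties = properties;
    this.sourceFs = sourceFs;
  }

  // Accessors added only so the sketch can be exercised.
  public Properties getProperties() {
    return properties;
  }

  public FakeFileSystem getSourceFs() {
    return sourceFs;
  }

  public static void main(String[] args) {
    Properties props = new Properties();
    props.setProperty("example.key", "example-value"); // illustrative key, not a Gobblin config key
    FinderSketch finder = new FinderSketch(props, new FakeFileSystem("hdfs"));
    System.out.println(finder.getSourceFs().getScheme());
  }
}
```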
##########
gobblin-data-management/src/main/java/org/apache/gobblin/data/management/copy/iceberg/IcebergDataset.java:
##########
@@ -185,31 +185,28 @@ protected Map<Path, FileStatus> getFilePathsToFileStatus() throws IOException {
for (String pathString : pathsToCopy) {
Path path = new Path(pathString);
- result.put(path, this.fs.getFileStatus(path));
+ result.put(path, this.sourceFs.getFileStatus(path));
}
return result;
}
- DatasetDescriptor getSourceDataset() {
- return getDatasetDescriptor(sourceMetastoreURI);
+ DatasetDescriptor getSourceDataset(FileSystem sourceFs) {
+ return getDatasetDescriptor(sourceMetastoreURI, sourceFs);
}
- DatasetDescriptor getDestinationDataset() {
- return getDatasetDescriptor(targetMetastoreURI);
+ DatasetDescriptor getDestinationDataset(FileSystem targetFs) {
Review Comment:
Shouldn't these be either `@VisibleForTesting` or `protected`?
Issue Time Tracking
-------------------
Worklog Id: (was: 810963)
Time Spent: 11h 20m (was: 11h 10m)
> Create work units for Hive Catalog based Iceberg Datasets to support Distcp for Iceberg
> ---------------------------------------------------------------------------------------
>
> Key: GOBBLIN-1709
> URL: https://issues.apache.org/jira/browse/GOBBLIN-1709
> Project: Apache Gobblin
> Issue Type: New Feature
> Components: distcp-ng
> Reporter: Meeth Gala
> Assignee: Issac Buenrostro
> Priority: Major
> Time Spent: 11h 20m
> Remaining Estimate: 0h
>
> We want to support Distcp for Iceberg based datasets.
> As a pilot, we are starting with Hive Catalog and will expand the
> functionality to cover all Iceberg based datasets.
>