[ 
https://issues.apache.org/jira/browse/GOBBLIN-1222?focusedWorklogId=468011&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-468011
 ]

ASF GitHub Bot logged work on GOBBLIN-1222:
-------------------------------------------

                Author: ASF GitHub Bot
            Created on: 07/Aug/20 20:47
            Start Date: 07/Aug/20 20:47
    Worklog Time Spent: 10m 
      Work Description: autumnust commented on a change in pull request #3070:
URL: https://github.com/apache/incubator-gobblin/pull/3070#discussion_r467259277



##########
File path: 
gobblin-data-management/src/main/java/org/apache/gobblin/data/management/copy/CopySource.java
##########
@@ -195,6 +199,8 @@
           datasetFinder instanceof IterableDatasetFinder ? (IterableDatasetFinder<CopyableDatasetBase>) datasetFinder
               : new IterableDatasetFinderImpl<>(datasetFinder);
 
+

Review comment:
       Please remove the additional blank lines.

##########
File path: 
gobblin-data-management/src/main/java/org/apache/gobblin/data/management/copy/hive/HiveDataset.java
##########
@@ -125,6 +128,8 @@ public HiveDataset(FileSystem fs, HiveMetastoreClientPool clientPool, Table tabl
         Optional.fromNullable(this.table.getDataLocation());
 
     this.tableIdentifier = this.table.getDbName() + "." + this.table.getTableName();
+    this.datasetStagingDir = properties.getProperty(DATASET_PREFIX_TOBEREPLACED) + "/" + this.table.getDbName() + "/" + this.table.getTableName();

Review comment:
       This will not preserve the casing of the HDFS path.
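
To illustrate the casing concern: the Hive metastore typically stores database and table names in lower case, so a path assembled from those names can differ from the directory that actually exists on HDFS. The following is a minimal sketch using plain strings rather than the Gobblin or Hive classes. `StagingPathCasing`, both helper methods, and the example names are hypothetical, and deriving the suffix from the table's data location is an assumed alternative, not something proposed in this review.

```java
import java.util.Locale;

public class StagingPathCasing {

  // Mirrors the concatenation in the diff: prefix + "/" + db + "/" + table.
  static String fromNames(String prefix, String dbName, String tableName) {
    return prefix + "/" + dbName + "/" + tableName;
  }

  // Hypothetical alternative: reuse the last two segments of the table's
  // data location, which keep whatever casing the directory really has.
  static String fromDataLocation(String prefix, String dataLocation) {
    String[] segments = dataLocation.split("/");
    int n = segments.length;
    if (n < 2) {
      throw new IllegalArgumentException("Expected at least two path segments: " + dataLocation);
    }
    return prefix + "/" + segments[n - 2] + "/" + segments[n - 1];
  }

  public static void main(String[] args) {
    // Assumed example location; not taken from the pull request.
    String dataLocation = "/data/TrackingDB/PageViewEvent";

    // Assumption: the metastore hands the names back lower-cased.
    String dbName = "TrackingDB".toLowerCase(Locale.ROOT);
    String tableName = "PageViewEvent".toLowerCase(Locale.ROOT);

    // Prints /staging/trackingdb/pageviewevent (casing lost).
    System.out.println(fromNames("/staging", dbName, tableName));
    // Prints /staging/TrackingDB/PageViewEvent (casing kept).
    System.out.println(fromDataLocation("/staging", dataLocation));
  }
}
```

On a case-sensitive file system the two results name different directories, which is why the name-based concatenation may point at a staging path that does not line up with the dataset's real location.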

##########
File path: 
gobblin-data-management/src/main/java/org/apache/gobblin/data/management/copy/hive/HiveDataset.java
##########
@@ -85,6 +85,8 @@
   public static final String DATASET_NAME_PATTERN_KEY = "hive.datasetNamePattern";
   public static final String DATABASE = "Database";
   public static final String TABLE = "Table";
+  public static final String DATASET_STAGING_PATH = "dataset.staging.path";
+  public static final String DATASET_PREFIX_TOBEREPLACED = "hive.dataset.copy.target.table.prefixToBeReplaced";

Review comment:
       Isn't the same thing already defined somewhere else?

##########
File path: 
gobblin-utility/src/main/java/org/apache/gobblin/util/WriterUtils.java
##########
@@ -100,6 +111,10 @@ public static Path getWriterStagingDir(State state, int numBranches, int branchI
     return new Path(getWriterStagingDir(state, numBranches, branchId), attemptId);
   }
 
+  public static Path getHiveDatasetWriterStagingDir(State state, int numBranches, int branchId, String attemptId) {

Review comment:
       Why do you need this method, given that there is another 
`getHiveDatasetWriterStagingDir`? 

##########
File path: 
gobblin-data-management/src/main/java/org/apache/gobblin/data/management/copy/hive/HiveDataset.java
##########
@@ -330,4 +335,8 @@ private boolean canCopyTable() {
     }
     return true;
   }
+
+  public Properties getProperties() {

Review comment:
       You can use a Lombok annotation to get rid of this method. See how the 
"@Getter" annotation is used elsewhere in the code base. 

##########
File path: 
gobblin-data-management/src/main/java/org/apache/gobblin/data/management/copy/CopySource.java
##########
@@ -366,6 +372,9 @@ public Void call() {
               workUnit.setProp(ConfigurationKeys.COPY_EXPECTED_SCHEMA, ((ConfigBasedDataset) this.copyableDataset).getExpectedSchema());
             }
           }
+          if ((this.copyableDataset instanceof HiveDataset) && (state.getPropAsBoolean(ConfigurationKeys.DATASET_STAGING_DIR,false))) {

Review comment:
       Sorry if we have discussed this before, but why couldn't you set 
`DATASET_STAGING_DIR_PATH` in a Hive-specific construct such as 
`HiveDataset`? 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]


Issue Time Tracking
-------------------

    Worklog Id:     (was: 468011)
    Time Spent: 0.5h  (was: 20m)

> Create right abstraction to assemble dataset staging dir for Hive dataset 
> finder
> --------------------------------------------------------------------------------
>
>                 Key: GOBBLIN-1222
>                 URL: https://issues.apache.org/jira/browse/GOBBLIN-1222
>             Project: Apache Gobblin
>          Issue Type: Bug
>            Reporter: Vaibhav Arya
>            Priority: Major
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)
