[ 
https://issues.apache.org/jira/browse/GOBBLIN-1602?focusedWorklogId=722199&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-722199
 ]

ASF GitHub Bot logged work on GOBBLIN-1602:
-------------------------------------------

                Author: ASF GitHub Bot
            Created on: 07/Feb/22 19:15
            Start Date: 07/Feb/22 19:15
    Worklog Time Spent: 10m 
      Work Description: ZihanLi58 commented on a change in pull request #3459:
URL: https://github.com/apache/gobblin/pull/3459#discussion_r800968438



##########
File path: .github/workflows/build_and_test.yaml
##########
@@ -124,6 +124,7 @@ jobs:
           echo -e "$(ip addr show eth0 | grep "inet\b" | awk '{print $2}' | cut -d/ -f1)\t$(hostname -f) $(hostname -s)" | sudo tee -a /etc/hosts
       - name: Verify mysql connection
         run: |
+            sudo apt-get --fix-missing update

Review comment:
       Does this change belong to another PR?

##########
File path: gobblin-data-management/src/main/java/org/apache/gobblin/data/management/copy/hive/HiveUtils.java
##########
@@ -167,4 +169,19 @@ private static Configuration getHadoopConfiguration() {
   public static boolean isPartitioned(Table table) {
     return table.isPartitioned();
   }
+
+  /**
+   * @param fs User configured filesystem of the target table
+   * @param userSpecifiedPath user specified path of the copy table location or partition
+   * @param existingTablePath path of an already registered Hive table or partition
+   * @return true if the filesystem resolves them to be equivalent, false otherwise
+   */
+  public static boolean areTablePathsEquivalent(FileSystem fs, Path userSpecifiedPath, Path existingTablePath) throws IOException {
+    try {
+      return fs.resolvePath(existingTablePath).equals(fs.resolvePath(userSpecifiedPath));

Review comment:
       So after this change, do we require the fs to be a virtual FileSystem?
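
To make the question concrete, here is a minimal, self-contained sketch of the resolve-then-compare idea. `MountTableResolver`, `addMount`, `resolve`, and the example URIs are hypothetical stand-ins made up for illustration; this is not Hadoop's `FileSystem` API and not the code in the PR.

```java
import java.net.URI;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical toy model of a mount-table-aware filesystem (an assumption,
// not Hadoop's FileSystem): view prefixes map to backing-store prefixes.
final class MountTableResolver {
    private final Map<String, String> mounts = new LinkedHashMap<>();

    // Registers a mapping such as viewfs://cluster/data -> hdfs://nn1/data.
    void addMount(String viewPrefix, String targetPrefix) {
        mounts.put(viewPrefix, targetPrefix);
    }

    // Rewrites a path under a known mount prefix to its backing location.
    // Paths outside every mount are returned unchanged.
    URI resolve(URI path) {
        String s = path.toString();
        for (Map.Entry<String, String> e : mounts.entrySet()) {
            if (s.startsWith(e.getKey())) {
                return URI.create(e.getValue() + s.substring(e.getKey().length()));
            }
        }
        return path;
    }

    // Two paths are treated as equivalent when they resolve to the same URI.
    boolean areEquivalent(URI userSpecified, URI existing) {
        return resolve(existing).equals(resolve(userSpecified));
    }
}
```

In this toy model a path outside any mount passes through unchanged, so the check reduces to plain equality on a non-virtual filesystem. Whether `fs.resolvePath` behaves comparably there is exactly what the comment above asks.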

##########
File path: gobblin-data-management/src/main/java/org/apache/gobblin/data/management/copy/hive/HivePartitionFileSet.java
##########
@@ -99,7 +99,7 @@ public HivePartitionFileSet(HiveCopyEntityHelper hiveCopyEntityHelper, Partition
               hiveCopyEntityHelper.getExistingEntityPolicy() != HiveCopyEntityHelper.ExistingEntityPolicy.REPLACE_TABLE_AND_PARTITIONS) {
             log.error("Source and target partitions are not compatible. Aborting copy of partition " + this.partition,
                 ioe);
-            return Lists.newArrayList();
+            throw ioe;

Review comment:
       Just wondering: if we trace up the call stack, is there a way for us to 
control whether this exception fails the whole job? Should we instead collect 
the exceptions without failing the job and, at the end of the job, use that 
information to decide whether to fail it? (This can be done in another PR, 
though.)
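
One possible shape for the collect-then-decide approach suggested above, as a rough sketch only. `CopyFailureCollector`, its `maxTolerated` threshold, and the method names are hypothetical and do not exist in Gobblin; how a real job would wire this in is left open.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper (not part of Gobblin): gathers per-partition failures
// so the decision to fail the job can be deferred until the end.
final class CopyFailureCollector {
    private final List<IOException> failures = new ArrayList<>();
    private final int maxTolerated; // assumed policy knob: failures allowed before the job fails

    CopyFailureCollector(int maxTolerated) {
        this.maxTolerated = maxTolerated;
    }

    // Called where the code would otherwise rethrow immediately.
    void record(IOException e) {
        failures.add(e);
    }

    List<IOException> failures() {
        return Collections.unmodifiableList(failures);
    }

    boolean shouldFailJob() {
        return failures.size() > maxTolerated;
    }

    // Called once at the end of the job; the individual causes are attached
    // as suppressed exceptions so none of them is lost.
    void failIfNeeded() throws IOException {
        if (!shouldFailJob()) {
            return;
        }
        IOException summary =
            new IOException(failures.size() + " partition(s) failed compatibility checks");
        for (IOException e : failures) {
            summary.addSuppressed(e);
        }
        throw summary;
    }
}
```

With a threshold of 0 this fails the job on any recorded failure, but only after every partition has been examined, unlike an immediate `throw ioe`, which stops at the first one.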




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


Issue Time Tracking
-------------------

    Worklog Id:     (was: 722199)
    Time Spent: 1h 40m  (was: 1.5h)

> Handle hive table mismatch when paths are equivalent in the underlying FS
> -------------------------------------------------------------------------
>
>                 Key: GOBBLIN-1602
>                 URL: https://issues.apache.org/jira/browse/GOBBLIN-1602
>             Project: Apache Gobblin
>          Issue Type: Task
>          Components: gobblin-core
>            Reporter: William Lo
>            Assignee: Abhishek Tiwari
>            Priority: Major
>          Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> In scenarios where the paths are equivalent in the underlying FS, Hive copy 
> should not treat them as different paths merely because the user-provided URI 
> does not match the Hive-registered URI.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
