[ 
https://issues.apache.org/jira/browse/HIVE-24936?focusedWorklogId=599722&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-599722
 ]

ASF GitHub Bot logged work on HIVE-24936:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 20/May/21 10:31
            Start Date: 20/May/21 10:31
    Worklog Time Spent: 10m 
      Work Description: harishjp commented on a change in pull request #2120:
URL: https://github.com/apache/hive/pull/2120#discussion_r635977406



##########
File path: ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java
##########
@@ -1093,40 +1093,69 @@ public static void rename(FileSystem fs, Path src, Path dst) throws IOException,
     }
   }
 
-  private static void moveFile(FileSystem fs, FileStatus file, Path dst) throws IOException,
+  private static void moveFileOrDir(FileSystem fs, FileStatus file, Path dst) throws IOException,

Review comment:
       renameOrMoveFiles calls this method and is the public API, so the 
tests for renameOrMoveFiles should cover the cases here.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]


Issue Time Tracking
-------------------

    Worklog Id:     (was: 599722)
    Time Spent: 1h  (was: 50m)

> Fix file name parsing and copy file move.
> -----------------------------------------
>
>                 Key: HIVE-24936
>                 URL: https://issues.apache.org/jira/browse/HIVE-24936
>             Project: Hive
>          Issue Type: Bug
>          Components: HiveServer2
>            Reporter: Harish JP
>            Assignee: Harish JP
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 1h
>  Remaining Estimate: 0h
>
> The taskId and taskAttemptId are not extracted correctly for copy files 
> (e.g. 00001_02_copy_3), and when moving an incompatible copy file the 
> rename utility generates wrong file names. Ex: 00001_02_copy_3 is renamed to 
> 00001_02_copy_3_1 if 00001_02_copy_3 already exists; ideally it should be 
> 00001_02_copy_N.
>  
> Incompatible files should always be renamed using the current task; otherwise 
> they can get deleted when the file name conflicts with another task's output 
> file. Ex: if the input file name for a task is 00005_01 and the file is 
> incompatible, then moving it as-is makes it look like an output file for 
> task id 5, attempt 1. If that task attempt exists, it will try to generate 
> the same file and fail, and another attempt will be made. There will then be 
> 2 files, 00005_01 and 00005_02, and the deduping code will remove 00005_01, 
> resulting in data loss. There are other scenarios where the same can happen.
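
The naming problem described above can be illustrated with a minimal sketch. This is not the code from the pull request: the class CopyFileNames, its methods parse and nextFreeName, and the regular expression are all hypothetical, and the sketch assumes names of the form <taskId>_<attemptId>[_copy_<N>] with no prefix or file extension. It only shows the intended behaviour: extract the three parts, then bump the copy index instead of appending another suffix.

```java
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Hive's Utilities class.
final class CopyFileNames {
    // Assumed layout: <taskId>_<attemptId>[_copy_<N>], e.g. 00001_02_copy_3.
    private static final Pattern NAME =
        Pattern.compile("^(\\d+)_(\\d+)(?:_copy_(\\d+))?$");

    /** Returns {taskId, attemptId, copyIndex-or-null}, or null if the name does not match. */
    static String[] parse(String fileName) {
        Matcher m = NAME.matcher(fileName);
        if (!m.matches()) {
            return null;
        }
        return new String[] {m.group(1), m.group(2), m.group(3)};
    }

    /**
     * Returns fileName if it is unused; otherwise increments the copy index
     * until a name that is not in 'existing' is found.
     */
    static String nextFreeName(String fileName, Set<String> existing) {
        String[] parts = parse(fileName);
        if (parts == null) {
            throw new IllegalArgumentException("Unrecognized file name: " + fileName);
        }
        String base = parts[0] + "_" + parts[1];
        int copyIndex = parts[2] == null ? 0 : Integer.parseInt(parts[2]);
        String candidate = fileName;
        while (existing.contains(candidate)) {
            copyIndex++;
            candidate = base + "_copy_" + copyIndex;
        }
        return candidate;
    }
}
```

Under these assumptions, nextFreeName("00001_02_copy_3", ...) with 00001_02_copy_3 already present yields 00001_02_copy_4 rather than 00001_02_copy_3_1. The sketch does not address the second paragraph of the description (renaming incompatible files under the current task).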



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
