SourabhBadhya commented on code in PR #4520:
URL: https://github.com/apache/hive/pull/4520#discussion_r1277251858


##########
ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java:
##########
@@ -1039,7 +1039,7 @@ public static List<String> mergeUniqElems(List<String> src, List<String> dest) {
     return src;
   }
 
-  private static final String tmpPrefix = "_tmp.";
+  private static final String tmpPrefix = "-tmp.";
   private static final String taskTmpPrefix = "_task_tmp.";

Review Comment:
   I knew this comment would come :)
   The problem here is that an optimisation (HIVE-25990) avoids copying files from the _tmp directory to the -ext directory.
   Unfortunately, the merge task was not configured after this optimisation was introduced, so it was never invoked and execution proceeded directly to the move task.
   
   Making the merge task work on the _tmp directory is not possible, since Hadoop does not allow split generation for hidden paths (like _tmp). Hence I could think of only 2 options:
   1. Make the merge task work by changing this prefix.
   2. Disable the optimisation when a merge task is present.
   
   I went with the first option, hence this change.
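
A minimal sketch of the hidden-path rule referred to above, for illustration only. It assumes the usual Hadoop convention that a path whose final component starts with `_` or `.` is treated as hidden and skipped when input is listed for split generation. The class `HiddenPathSketch`, its methods, and the `/warehouse/t/...-ext-10002` paths are hypothetical and are not taken from Hive or Hadoop.

```java
// Illustrative only: mimics the "_" / "." hidden-name convention.
// This is NOT Hive or Hadoop code; all names here are hypothetical.
final class HiddenPathSketch {

    // Returns the last component of a slash-separated path.
    static String lastComponent(String path) {
        String trimmed = (path.endsWith("/") && path.length() > 1)
                ? path.substring(0, path.length() - 1)
                : path;
        int idx = trimmed.lastIndexOf('/');
        return idx < 0 ? trimmed : trimmed.substring(idx + 1);
    }

    // A path is considered visible when its last component does not
    // start with "_" or "." (the assumed hidden-path rule).
    static boolean isVisible(String path) {
        String name = lastComponent(path);
        return !name.startsWith("_") && !name.startsWith(".");
    }

    public static void main(String[] args) {
        String[] candidates = {
            "/warehouse/t/_tmp.-ext-10002",  // old prefix: hidden under this rule
            "/warehouse/t/-tmp.-ext-10002"   // new prefix: visible under this rule
        };
        for (String p : candidates) {
            System.out.println(p + " visible=" + isVisible(p));
        }
    }
}
```

Under this assumed rule, a directory named with the `_tmp.` prefix is filtered out before any splits are produced, while the same directory with the `-tmp.` prefix is not, which is why changing the prefix lets the merge task read it.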



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

