SourabhBadhya commented on code in PR #4520: URL: https://github.com/apache/hive/pull/4520#discussion_r1277251858
##########
ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java:
##########
@@ -1039,7 +1039,7 @@ public static List<String> mergeUniqElems(List<String> src, List<String> dest) {
     return src;
   }

-  private static final String tmpPrefix = "_tmp.";
+  private static final String tmpPrefix = "-tmp.";
   private static final String taskTmpPrefix = "_task_tmp.";

Review Comment:
   I knew this comment would come :)
   The problem here is that an earlier optimisation (HIVE-25990) avoids copying files from _tmp to -ext. Unfortunately, the merge task was not adjusted after this optimisation went in, so the merge task was never invoked and execution went straight on to the move task.
   Making the merge task work on the _tmp directory is not possible, since Hadoop does not allow split generation for hidden paths (like _tmp). Hence I could think of only 2 options -
   1. Make the merge task work by changing this prefix.
   2. Disable the optimisation when a merge task is present.

   I went with the first option, hence this change.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
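
For readers unfamiliar with the hidden-path constraint mentioned in the review comment, here is a minimal sketch of the naming rule. It assumes the usual Hadoop convention that `FileInputFormat` skips paths whose name starts with `_` or `.`; the class `HiddenPathSketch`, its helper methods, and the example directory names are hypothetical and are not taken from Hive or from this PR.

```java
public class HiddenPathSketch {

    // Assumed rule, modelled on Hadoop's hidden-file convention:
    // a name starting with '_' or '.' is treated as hidden and is
    // skipped when input files are listed for split generation.
    public static boolean isHidden(String name) {
        return name.startsWith("_") || name.startsWith(".");
    }

    // Returns the last component of a slash-separated path.
    public static String lastComponent(String path) {
        int slash = path.lastIndexOf('/');
        return slash < 0 ? path : path.substring(slash + 1);
    }

    public static void main(String[] args) {
        // Hypothetical directory names built from the old and new prefixes.
        String[] candidates = {
            "/warehouse/t/_tmp.-ext-10000",
            "/warehouse/t/-tmp.-ext-10000"
        };
        for (String path : candidates) {
            System.out.println(path + " hidden=" + isHidden(lastComponent(path)));
        }
    }
}
```

Under this assumption, a directory created with the old `_tmp.` prefix is invisible to split generation, while one created with the `-tmp.` prefix is not. The unchanged `_task_tmp.` prefix would still count as hidden.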