SourabhBadhya commented on code in PR #4520:
URL: https://github.com/apache/hive/pull/4520#discussion_r1298406560
##########
ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java:
##########
@@ -1039,7 +1039,7 @@ public static List<String> mergeUniqElems(List<String> src, List<String> dest) {
     return src;
   }
-  private static final String tmpPrefix = "_tmp.";
+  private static final String tmpPrefix = "-tmp.";
   private static final String taskTmpPrefix = "_task_tmp.";

Review Comment:
   Thanks for the comments @dengzhhu653

   > I'm wondering if there is a case: the Utilities.java finalising the files before all the Tez tasks finishes: though the job is marked as success, but some Tez attempts are slowly to terminate due to heavy GC, or load, etc. The Utilities.java has kept all valid files into filesKept, I would suggest reading them from the manifest when resolving the merge task.

   Done. The merge task now takes its input from the manifest file itself.

   > For CTAS with union select operation, actually should be union all, only union all will change the temp directory to -ext-10000/_tmp.HIVE_UNION_SUBDIR_1/dynamic_partition-dir/HIVE_UNION_SUBDIR_1

   Added tests for ORC and non-ORC formats with a UNION ALL clause. Interestingly, no manifest file is generated when UNION ALL is used, so these queries do not go through the code path added in this patch.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org