steveloughran commented on code in PR #5251:
URL: https://github.com/apache/hadoop/pull/5251#discussion_r1104876447


##########
hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/mapred/CopyCommitter.java:
##########
@@ -152,9 +152,18 @@ public void abortJob(JobContext jobContext,
   }
 
   private void cleanupTempFiles(JobContext context) {
-    try {
-      Configuration conf = context.getConfiguration();
+    Configuration conf = context.getConfiguration();
+
+    final boolean directWrite = conf.getBoolean(
+        DistCpOptionSwitch.DIRECT_WRITE.getConfigLabel(), false);
+    final boolean append = conf.getBoolean(
+        DistCpOptionSwitch.APPEND.getConfigLabel(), false);
+    final boolean useTempTarget = !append && !directWrite;
+    if (!useTempTarget) {

Review Comment:
   Reviewing this, I think the cleanup should only be skipped on direct *not* 
append.
   
   why so, append will choose between overwrite and append depending on the 
state of the destination file -a decision made case by case. when doing distcp 
with -append, new files or those whose dest checksum doesn't match that of the 
source for the same #of bytes will use a temp file unless -direct is set.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to