steveloughran commented on code in PR #5603:
URL: https://github.com/apache/hadoop/pull/5603#discussion_r1181517673


##########
hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/util/DistCpUtils.java:
##########
@@ -592,10 +592,14 @@ public static CopyMapper.ChecksumComparison checksumsAreEqual(
     // comparison that took place and return not compatible.
     // else if matched, return compatible with the matched result.
     if (sourceChecksum == null || targetChecksum == null) {
+      LOG.error("Checksum incompatible. Source checksum: {}, target checksum: {}",

Review Comment:
   This is not unusual against object stores: every store that lacks
HDFS-compatible checksums disables them so that distcp to cloud stores doesn't
blow up. Logging at error here would produce an entry for every single copy.
   
   Proposal: use a LogExactlyOnce at info to say that the source or target fs
doesn't support checksums and that users should pass -skipCrc.
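   
   A minimal sketch of the log-once idea, assuming nothing about the real
helper's API: `OnceLogger`, `compare` and the local `ChecksumComparison` enum
are hypothetical stand-ins written for illustration, not the Hadoop classes,
and the message text is only a suggestion.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Consumer;

/** Illustrative sketch only; none of these types are the Hadoop ones. */
class LogOnceSketch {

  /** Hypothetical stand-in: forwards the first message, drops all later ones. */
  static final class OnceLogger {
    private final Consumer<String> sink;
    private final AtomicBoolean logged = new AtomicBoolean(false);

    OnceLogger(Consumer<String> sink) {
      this.sink = sink;
    }

    void info(String message) {
      // compareAndSet succeeds for exactly one caller, even across threads.
      if (logged.compareAndSet(false, true)) {
        sink.accept(message);
      }
    }
  }

  /** Local stand-in for the comparison result used in the diff. */
  enum ChecksumComparison { TRUE, FALSE, INCOMPATIBLE }

  /** A null checksum means the store does not expose one; report it once. */
  static ChecksumComparison compare(Object source, Object target, OnceLogger once) {
    if (source == null || target == null) {
      once.info("Source or target filesystem does not support checksums;"
          + " consider -skipCrc");
      return ChecksumComparison.INCOMPATIBLE;
    }
    return source.equals(target) ? ChecksumComparison.TRUE : ChecksumComparison.FALSE;
  }

  public static void main(String[] args) {
    List<String> lines = new ArrayList<>();
    OnceLogger once = new OnceLogger(lines::add);
    // Simulate many copies against a store without checksums.
    for (int i = 0; i < 1000; i++) {
      compare(null, "crc", once);
    }
    System.out.println("messages logged: " + lines.size());
  }
}
```

   With a shared instance per job, a run of a thousand files against such a
store yields one informational line instead of a thousand error entries.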



##########
hadoop-tools/hadoop-distcp/src/test/java/org/apache/hadoop/tools/mapred/TestCopyCommitter.java:
##########
@@ -569,6 +570,7 @@ private void testCommitWithChecksumMismatch(boolean skipCrc)
                 fs, new Path(sourceBase + srcFilename), null,
                 fs, new Path(targetBase + srcFilename),
                 sourceCurrStatus.getLen()));
+        assertTrue(log.getOutput().contains("Checksum not equal"));

Review Comment:
   Use AssertJ's assertThat(log.getOutput()).contains(...) so that a failure
reports a better message.
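   
   To illustrate why the message matters: a bare assertTrue on
String.contains() fails without showing what the log actually held. The
sketch below is a hand-rolled stand-in, not the assertion library itself;
`assertContains` is a hypothetical helper that mimics the kind of failure
text a contains-style assertion gives.

```java
/** Illustrative sketch only; a stand-in for a contains-style assertion. */
class ContainsAssertSketch {

  /** Fails with a message that includes both the expected and actual text. */
  static void assertContains(String actual, String expected) {
    if (actual == null || !actual.contains(expected)) {
      throw new AssertionError(
          "Expecting log output to contain \"" + expected
              + "\" but was: \"" + actual + "\"");
    }
  }

  public static void main(String[] args) {
    String output = "Checksum not equal. Source checksum: a, target checksum: b";
    // Passes silently because the expected fragment is present.
    assertContains(output, "Checksum not equal");
    try {
      assertContains("copy completed", "Checksum not equal");
    } catch (AssertionError e) {
      // The failure text carries the captured output, which aids debugging.
      System.out.println(e.getMessage());
    }
  }
}
```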
   



##########
hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/util/DistCpUtils.java:
##########
@@ -592,10 +592,14 @@ public static CopyMapper.ChecksumComparison checksumsAreEqual(
     // comparison that took place and return not compatible.
     // else if matched, return compatible with the matched result.
     if (sourceChecksum == null || targetChecksum == null) {
+      LOG.error("Checksum incompatible. Source checksum: {}, target checksum: {}",
+          sourceChecksum, targetChecksum);
       return CopyMapper.ChecksumComparison.INCOMPATIBLE;
     } else if (sourceChecksum.equals(targetChecksum)) {
       return CopyMapper.ChecksumComparison.TRUE;
     }
+    LOG.error("Checksum not equal. Source checksum: {}, target checksum: {}",
+        sourceChecksum, targetChecksum);

Review Comment:
   Maybe log the checksums at debug.
   
   Anyway, doesn't a checksum mismatch just mean "source file needs copying
again"? If so, the thing to log is not the mismatch but why the copy is taking
place.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

