I've raised this as an issue: https://issues.apache.org/jira/browse/HDFS-10338
On Wednesday, 27 April 2016, Elliot West <[email protected]> wrote:

> Hello,
>
> We are using DistCp V2 to replicate data between two HDFS file systems. We
> were working on the assumption that we could rely on CRC checks to ensure
> that the data was replicated correctly. However, after examining the DistCp
> source code it seems that there are edge cases where the CRCs could differ
> and yet the copy succeeds even when we are not skipping CRC checks.
>
> I'm wondering whether this is by design and, if so, what the reasoning
> behind it is. If this is a bug, I'd like to raise an issue to fix it. If it
> is by design, I'd like to propose the introduction of an option for stricter
> CRC checks.
>
> The code in question is contained in the method:
>
>     org.apache.hadoop.tools.util.DistCpUtils#checksumsAreEqual(...)
>
> which can be seen here:
>
> https://github.com/apache/hadoop/blob/release-2.7.1/hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/util/DistCpUtils.java#L457
>
> Specifically, this code block suggests that if there is a failure when
> trying to read the source or target checksum then the method will return
> 'true', implying that the check succeeded. In actual fact we just failed to
> obtain the checksum and could perform no check.
>
>     try {
>       sourceChecksum = sourceChecksum != null ? sourceChecksum : sourceFS
>           .getFileChecksum(source);
>       targetChecksum = targetFS.getFileChecksum(target);
>     } catch (IOException e) {
>       LOG.error("Unable to retrieve checksum for " + source + " or " +
>           target, e);
>     }
>     return (sourceChecksum == null || targetChecksum == null ||
>         sourceChecksum.equals(targetChecksum));
>
> Ideally I'd like to be able to configure a check where we require that
> both the source and target CRCs are retrieved and compared, and if for any
> reason either of the CRC retrievals fails then an exception is thrown. I do
> appreciate that some FileSystems cannot return CRCs, but these could still
> be handled correctly as they would simply return null and not throw an
> exception (I assume).
>
> I'd appreciate any thoughts on this matter.
>
> Elliot.
>
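
The stricter check proposed in the quoted message could be sketched roughly as follows. This is only an illustration under stated assumptions, not the DistCp code and not the fix tracked in HDFS-10338: the class name, the ChecksumSupplier interface and the 'strict' flag are all hypothetical. ChecksumSupplier stands in for a call such as FileSystem#getFileChecksum(Path), which may throw an IOException or return null when the file system has no checksum to offer.

```java
import java.io.IOException;

// Hypothetical sketch; none of these names exist in Hadoop or DistCp.
class StrictChecksumCheck {

    /** Stand-in for a call such as FileSystem#getFileChecksum(Path). */
    @FunctionalInterface
    interface ChecksumSupplier {
        // May return null when the file system cannot provide a checksum.
        Object get() throws IOException;
    }

    /**
     * Compares two checksums. When 'strict' is true, a failure to retrieve
     * either checksum is rethrown instead of being treated as a match.
     * When 'strict' is false, the lenient behaviour of the quoted code is
     * mirrored: a retrieval failure yields 'true'.
     */
    static boolean checksumsAreEqual(ChecksumSupplier source,
                                     ChecksumSupplier target,
                                     boolean strict) throws IOException {
        Object sourceChecksum;
        Object targetChecksum;
        try {
            sourceChecksum = source.get();
            targetChecksum = target.get();
        } catch (IOException e) {
            if (strict) {
                throw new IOException(
                    "Unable to retrieve checksum for source or target", e);
            }
            return true; // lenient mode: no check could be performed
        }
        // A null checksum means "not supported", so there is nothing to compare.
        if (sourceChecksum == null || targetChecksum == null) {
            return true;
        }
        return sourceChecksum.equals(targetChecksum);
    }

    public static void main(String[] args) throws IOException {
        ChecksumSupplier failing = () -> { throw new IOException("simulated failure"); };

        System.out.println(checksumsAreEqual(() -> "abc", () -> "abc", true));
        System.out.println(checksumsAreEqual(() -> "abc", () -> "xyz", true));
        System.out.println(checksumsAreEqual(() -> "abc", failing, false));
        try {
            checksumsAreEqual(() -> "abc", failing, true);
        } catch (IOException e) {
            System.out.println("strict mode rejected the copy: " + e.getMessage());
        }
    }
}
```

In this sketch a file system that cannot return checksums still passes in strict mode, because it returns null rather than throwing. Only a genuine retrieval failure turns into an exception.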
