Gera Shegalov created HADOOP-12326:
--------------------------------------
Summary: Implement ChecksumFileSystem#getFileChecksum equivalent
to HDFS for easy check
Key: HADOOP-12326
URL: https://issues.apache.org/jira/browse/HADOOP-12326
Project: Hadoop Common
Issue Type: Improvement
Components: fs
Affects Versions: 2.7.1
Reporter: Gera Shegalov
Assignee: Gera Shegalov
When the same file exists both locally and on HDFS (for example, after a
download or an upload), getFileChecksum can provide a quick check that the two
copies are consistent. To make the checksums comparable, we can switch the local
filesystem to CRC32C. The difference in block sizes does not matter, because on
the local filesystem the block size is only a logical parameter.
{code}
$ hadoop fs -Dfs.local.block.size=134217728 -checksum file:${PWD}/part-m-00000 part-m-00000
15/08/15 13:30:02 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
file:///Users/gshegalov/workspace/hadoop-common/part-m-00000  MD5-of-262144MD5-of-512CRC32C  000002000000000000040000e84fb07f8c9d4ef3acb5d1983a7e2a68
part-m-00000  MD5-of-262144MD5-of-512CRC32C  000002000000000000040000e84fb07f8c9d4ef3acb5d1983a7e2a68
{code}
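To see why fs.local.block.size=134217728 lines up with the HDFS result, the hex string in the output above can be decoded. The following is only a sketch: it assumes the byte layout used by Hadoop's MD5MD5CRC32FileChecksum (a 4-byte big-endian bytesPerCRC, an 8-byte big-endian crcPerBlock, then a 16-byte MD5), and the class name ChecksumFields and its methods are hypothetical helpers, not part of Hadoop.

```java
import java.nio.ByteBuffer;

// Hypothetical helper (not a Hadoop class): splits the hex string printed by
// "hadoop fs -checksum" into its three fields, assuming the layout
// [int bytesPerCRC][long crcPerBlock][16-byte MD5].
public class ChecksumFields {
    final int bytesPerCrc;
    final long crcPerBlock;
    final String md5Hex;

    ChecksumFields(int bytesPerCrc, long crcPerBlock, String md5Hex) {
        this.bytesPerCrc = bytesPerCrc;
        this.crcPerBlock = crcPerBlock;
        this.md5Hex = md5Hex;
    }

    static ChecksumFields parse(String hex) {
        // 4 + 8 + 16 bytes = 28 bytes = 56 hex characters
        if (hex.length() != 56) {
            throw new IllegalArgumentException("expected 56 hex characters, got " + hex.length());
        }
        byte[] raw = new byte[28];
        for (int i = 0; i < raw.length; i++) {
            raw[i] = (byte) Integer.parseInt(hex.substring(2 * i, 2 * i + 2), 16);
        }
        ByteBuffer buf = ByteBuffer.wrap(raw); // ByteBuffer is big-endian by default
        int bytesPerCrc = buf.getInt();
        long crcPerBlock = buf.getLong();
        // The remaining 16 bytes are the MD5 digest; keep them as hex.
        return new ChecksumFields(bytesPerCrc, crcPerBlock, hex.substring(24));
    }

    // Block size implied by the checksum parameters.
    long impliedBlockSize() {
        return bytesPerCrc * crcPerBlock;
    }

    public static void main(String[] args) {
        ChecksumFields f = parse("000002000000000000040000e84fb07f8c9d4ef3acb5d1983a7e2a68");
        System.out.println("bytesPerCRC=" + f.bytesPerCrc
            + " crcPerBlock=" + f.crcPerBlock
            + " impliedBlockSize=" + f.impliedBlockSize());
    }
}
```

For the checksum shown above this yields bytesPerCRC=512 and crcPerBlock=262144, whose product is 134217728. Under the assumed layout, both parameters are part of the compared value, so the local and HDFS checksums can only agree when the logical local block size matches the HDFS block size.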
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)