Need a distributed file checksum algorithm for HDFS
---------------------------------------------------
Key: HADOOP-3981
URL: https://issues.apache.org/jira/browse/HADOOP-3981
Project: Hadoop Core
Issue Type: New Feature
Components: dfs
Reporter: Tsz Wo (Nicholas), SZE
Traditional message digest algorithms, such as MD5 and SHA-1, require reading
the entire input message sequentially at a central location. HDFS supports
large files spanning multiple terabytes, so the overhead of reading an entire
file this way is huge. A distributed file checksum algorithm is needed for HDFS.
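One way to make checksumming distributable is to digest each block independently, so every datanode can hash only its local blocks in parallel, and then combine the per-block digests into a single file checksum ("digest of digests"). The sketch below illustrates that idea only; the class name, the tiny block size, and the MD5-of-MD5 combination scheme are illustrative assumptions, not the algorithm this issue proposes.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

public class DistributedChecksumSketch {
    // Hypothetical block size; real HDFS blocks are tens of megabytes.
    static final int BLOCK_SIZE = 4;

    static byte[] md5(byte[] data, int off, int len) throws NoSuchAlgorithmException {
        MessageDigest md = MessageDigest.getInstance("MD5");
        md.update(data, off, len);
        return md.digest();
    }

    // Combine per-block digests into one file checksum. In a distributed
    // setting, each block digest could be computed by the datanode holding
    // that block; only the small digests travel to the combiner.
    static byte[] fileChecksum(byte[] file) throws NoSuchAlgorithmException {
        MessageDigest combiner = MessageDigest.getInstance("MD5");
        for (int off = 0; off < file.length; off += BLOCK_SIZE) {
            int len = Math.min(BLOCK_SIZE, file.length - off);
            combiner.update(md5(file, off, len)); // digest of digests
        }
        return combiner.digest();
    }

    public static void main(String[] args) throws Exception {
        byte[] data = "hello distributed checksum".getBytes(StandardCharsets.UTF_8);
        byte[] c1 = fileChecksum(data);
        // Deterministic: recomputing over the same content gives the same value.
        assert Arrays.equals(c1, fileChecksum(data));
        // A one-byte change in any block changes the file checksum.
        byte[] other = data.clone();
        other[0] ^= 1;
        assert !Arrays.equals(c1, fileChecksum(other));
    }
}
```

Note that such a combined checksum differs from a plain sequential MD5 of the file, and its value depends on the block boundaries; comparing checksums across systems therefore requires agreeing on the block size.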