[
https://issues.apache.org/jira/browse/HDFS-8430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15080938#comment-15080938
]
Kai Zheng commented on HDFS-8430:
---------------------------------
Thanks for clarifying in detail.
bq. It sounds good. We may add a parameter to the getFileChecksum(..) methods
for passing the algorithm name.
Thanks for confirming. So would we need to put the {{cell}} size into the
algorithm name, or can it be deduced from the algorithm? Another way is to pass
the cell size value instead, if we have no other info to pass, for example when
the algorithm is fixed.
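To make the first option concrete, here is a minimal sketch of carrying the cell size inside the algorithm name. The name format {{MD5-of-<cellSize>cell-CRC32}} and the class {{ChecksumAlgorithmName}} are hypothetical, made up only for illustration; nothing here is an existing HDFS API.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: encodes/decodes a cell size in an algorithm name.
// The format "<digest>-of-<cellSize>cell-CRC32" is an assumption for this sketch.
final class ChecksumAlgorithmName {
    private static final Pattern FORMAT =
        Pattern.compile("([A-Za-z0-9]+)-of-(\\d+)cell-CRC32");

    // Builds a name such as "MD5-of-65536cell-CRC32".
    static String encode(String digest, int cellSize) {
        return digest + "-of-" + cellSize + "cell-CRC32";
    }

    // Recovers the cell size from a name produced by encode().
    static int cellSizeOf(String algorithmName) {
        Matcher m = FORMAT.matcher(algorithmName);
        if (!m.matches()) {
            throw new IllegalArgumentException("Unrecognized name: " + algorithmName);
        }
        return Integer.parseInt(m.group(2));
    }
}
```

With the second option (passing the cell size directly), no parsing is needed, but the method signature then only fits algorithms that have a cell size.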
bq. cell checksum (MD5, CRC64 or other code): use all the CRC32s of a cell to
compute a checksum
This is what I was, and still am, not clear about. I guess you mean stripe when
saying cell, because for a cell we would have only one CRC32.
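A rough sketch of the stripe reading, assuming one CRC32 per cell as stated above: the CRC32s of all cells in a stripe are fed, in cell order, into an MD5 digest. The class and method names are hypothetical and this is only an illustration of the idea, not the proposed implementation.

```java
import java.nio.ByteBuffer;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.zip.CRC32;

// Hypothetical sketch: one CRC32 per cell, one MD5 per stripe.
final class StripeChecksumSketch {
    // CRC32 of a single cell, truncated to its 32 significant bits.
    static int crc32Of(byte[] cell) {
        CRC32 crc = new CRC32();
        crc.update(cell, 0, cell.length);
        return (int) crc.getValue();
    }

    // MD5 over the concatenated CRC32s of the cells, in cell order.
    static byte[] stripeChecksum(byte[][] cells) {
        MessageDigest md5;
        try {
            md5 = MessageDigest.getInstance("MD5");
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
        ByteBuffer buf = ByteBuffer.allocate(4);
        for (byte[] cell : cells) {
            buf.clear();
            buf.putInt(crc32Of(cell));
            md5.update(buf.array());
        }
        return md5.digest();
    }
}
```

Under this reading the result depends on cell order within the stripe, so the same data laid out with a different cell size would give a different checksum.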
> Erasure coding: update DFSClient.getFileChecksum() logic for stripe files
> -------------------------------------------------------------------------
>
> Key: HDFS-8430
> URL: https://issues.apache.org/jira/browse/HDFS-8430
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Affects Versions: HDFS-7285
> Reporter: Walter Su
> Assignee: Kai Zheng
> Attachments: HDFS-8430-poc1.patch
>
>
> HADOOP-3981 introduces a distributed file checksum algorithm. It's designed
> for replicated blocks.
> {{DFSClient.getFileChecksum()}} needs some updates so it can work for striped
> block groups.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)