[ 
https://issues.apache.org/jira/browse/HDFS-8430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15072313#comment-15072313
 ] 

Walter Su commented on HDFS-8430:
---------------------------------

Sorry for the late reply.

bq. Ideally, the checksums computed from an EC file and a replicated file with the 
same data should be the same.
It's hard to achieve that and keep it fast at the same time. The reason is well 
explained in HADOOP-3981.

The logic for a contiguous file is 
{noformat}
checksum = MD5(
  xxxMD5-of-yyyCRC32(blk0_repl0),  // will try repl1 if repl0 is unavailable.
  xxxMD5-of-yyyCRC32(blk1_repl0),
  xxxMD5-of-yyyCRC32(blk2_repl0),
  ...);
{noformat}
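As a minimal, single-process sketch of the formula above: the class below computes a CRC32 per chunk, an MD5 over each block's CRCs, and an MD5 over the per-block MD5s. The class name, method names, and the {{blockSize}} / {{bytesPerCrc}} parameters are illustrative assumptions, not the DFSClient code; in HDFS the per-block MD5 is computed by the DataNodes, and the exact bytes fed into each digest may differ from this sketch.

```java
import java.nio.ByteBuffer;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.zip.CRC32;

/** Hypothetical sketch of MD5-of-MD5-of-CRC32 for a contiguous file held in memory. */
final class ContiguousChecksumSketch {

    static MessageDigest md5() {
        try {
            return MessageDigest.getInstance("MD5");
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    /** MD5 over the CRC32 of every bytesPerCrc-sized chunk of one block. */
    static byte[] blockMd5OfCrcs(byte[] data, int off, int len, int bytesPerCrc) {
        MessageDigest md = md5();
        for (int pos = off; pos < off + len; pos += bytesPerCrc) {
            CRC32 crc = new CRC32();
            crc.update(data, pos, Math.min(bytesPerCrc, off + len - pos));
            // Feed the 4-byte CRC value into the block-level MD5.
            md.update(ByteBuffer.allocate(4).putInt((int) crc.getValue()).array());
        }
        return md.digest();
    }

    /** File checksum: MD5 over the per-block MD5s, taken in block order. */
    static byte[] fileChecksum(byte[] file, int blockSize, int bytesPerCrc) {
        MessageDigest md = md5();
        for (int off = 0; off < file.length; off += blockSize) {
            int len = Math.min(blockSize, file.length - off);
            md.update(blockMd5OfCrcs(file, off, len, bytesPerCrc));
        }
        return md.digest();
    }

    public static void main(String[] args) {
        byte[] file = new byte[10_000];
        for (int i = 0; i < file.length; i++) {
            file[i] = (byte) (i * 31);
        }
        byte[] sum = fileChecksum(file, 4096, 512);
        System.out.println("checksum bytes: " + sum.length); // MD5 digests are 16 bytes
    }
}
```

Because block boundaries decide which CRCs are grouped under each inner MD5, the result of this sketch changes when {{blockSize}} changes, even for identical file contents.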
I thought the logic for a striped file should be
{noformat}
checksum = MD5(
  xxxMD5-of-yyyCRC32(bg0_idx0),
  xxxMD5-of-yyyCRC32(bg0_idx1),
  xxxMD5-of-yyyCRC32(bg0_idx2),   // skip parity blocks
  ...
  xxxMD5-of-yyyCRC32(bg1_idx0),
  xxxMD5-of-yyyCRC32(bg1_idx1),
  xxxMD5-of-yyyCRC32(bg1_idx2),
  ...);
{noformat}
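To illustrate the striped variant, here is a rough sketch for a single block group. It distributes cells round-robin over the data blocks, then applies the same MD5-of-MD5-of-CRC32 over the internal data blocks (parity is not modelled). The class and method names, the number of data blocks {{k}}, and {{cellSize}} are assumptions chosen for illustration, not the HDFS implementation.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;
import java.util.zip.CRC32;

/** Hypothetical sketch: checksum over the internal data blocks of one block group. */
final class StripedChecksumSketch {

    static MessageDigest md5() {
        try {
            return MessageDigest.getInstance("MD5");
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    /** MD5 over the CRC32 of every bytesPerCrc-sized chunk of one block's bytes. */
    static byte[] md5OfCrcs(byte[] block, int bytesPerCrc) {
        MessageDigest md = md5();
        for (int pos = 0; pos < block.length; pos += bytesPerCrc) {
            CRC32 crc = new CRC32();
            crc.update(block, pos, Math.min(bytesPerCrc, block.length - pos));
            md.update(ByteBuffer.allocate(4).putInt((int) crc.getValue()).array());
        }
        return md.digest();
    }

    /** Lay the logical bytes out round-robin, one cell at a time, over k data blocks. */
    static byte[][] stripe(byte[] data, int k, int cellSize) {
        ByteArrayOutputStream[] out = new ByteArrayOutputStream[k];
        for (int i = 0; i < k; i++) {
            out[i] = new ByteArrayOutputStream();
        }
        for (int off = 0, cell = 0; off < data.length; off += cellSize, cell++) {
            out[cell % k].write(data, off, Math.min(cellSize, data.length - off));
        }
        byte[][] blocks = new byte[k][];
        for (int i = 0; i < k; i++) {
            blocks[i] = out[i].toByteArray();
        }
        return blocks;
    }

    /** MD5 over the per-block MD5s, in the order the blocks are given. */
    static byte[] checksum(byte[][] blocks, int bytesPerCrc) {
        MessageDigest md = md5();
        for (byte[] b : blocks) {
            md.update(md5OfCrcs(b, bytesPerCrc));
        }
        return md.digest();
    }

    public static void main(String[] args) {
        byte[] data = new byte[6000];
        for (int i = 0; i < data.length; i++) {
            data[i] = (byte) (i * 31 + 7);
        }
        byte[] striped = checksum(stripe(data, 3, 1024), 512);
        byte[] contiguous = checksum(new byte[][] { data }, 512);
        System.out.println("same checksum: " + Arrays.equals(striped, contiguous));
    }
}
```

In this sketch the same logical bytes end up in different per-block byte sequences under the two layouts, so the striped and contiguous results should not be expected to match.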
However, it's very likely that some internal block is missing. We could 
1. throw a retryable exception, so the caller can try again until the missing 
blocks are recovered. DistCp can wait and retry, or ignore the failure and copy 
the file (which could be very big).
2. read some of the other internal blocks and decode the missing blocks, then 
compute the checksum. 
3. pre-compute the checksum instead of computing it on demand (HADOOP-3981 
discusses both approaches). This would need a new design.
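Option 1 could look roughly like the caller-side sketch below. Everything in it is hypothetical: {{BlockMissingRetriableException}}, {{getWithRetry}}, and the attempt/wait parameters are made-up names for illustration and are not existing HDFS or DistCp APIs.

```java
import java.util.function.Supplier;

/** Hypothetical caller-side retry loop for a checksum call that may fail temporarily. */
final class ChecksumRetrySketch {

    /** Made-up exception: an internal block is missing but may be recovered later. */
    static final class BlockMissingRetriableException extends RuntimeException {
        BlockMissingRetriableException(String message) {
            super(message);
        }
    }

    /**
     * Calls checksumCall up to maxAttempts times, sleeping waitMillis between
     * failed attempts. Rethrows the last retriable failure if all attempts fail.
     */
    static byte[] getWithRetry(Supplier<byte[]> checksumCall, int maxAttempts, long waitMillis) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        BlockMissingRetriableException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return checksumCall.get();
            } catch (BlockMissingRetriableException e) {
                last = e;
                if (attempt < maxAttempts) {
                    sleep(waitMillis); // give block recovery some time
                }
            }
        }
        throw last;
    }

    private static void sleep(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting to retry", e);
        }
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Simulated checksum source: fails twice, then succeeds.
        byte[] sum = getWithRetry(() -> {
            if (++calls[0] < 3) {
                throw new BlockMissingRetriableException("internal block missing");
            }
            return new byte[16];
        }, 5, 10L);
        System.out.println("attempts: " + calls[0] + ", checksum bytes: " + sum.length);
    }
}
```

A caller such as DistCp would still have to decide what to do once the attempts are exhausted, for example copying the file anyway.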

I would really appreciate any advice or discussion.

> Erasure coding: update DFSClient.getFileChecksum() logic for stripe files
> -------------------------------------------------------------------------
>
>                 Key: HDFS-8430
>                 URL: https://issues.apache.org/jira/browse/HDFS-8430
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>    Affects Versions: HDFS-7285
>            Reporter: Walter Su
>            Assignee: Walter Su
>
> HADOOP-3981 introduces a distributed file checksum algorithm. It's designed 
> for replicated blocks.
> {{DFSClient.getFileChecksum()}} needs some updates so that it can work for 
> striped block groups.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
