[jira] [Commented] (HDFS-8430) Erasure coding: update DFSClient.getFileChecksum() logic for stripe files

Kai Zheng (JIRA) Mon, 04 Jan 2016 03:49:28 -0800

    [ https://issues.apache.org/jira/browse/HDFS-8430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15081045#comment-15081045 ]


Kai Zheng commented on HDFS-8430:
---------------------------------

I see. So using cell-level checksums instead of block-level ones would produce 
even more intermediate checksum data. Either way, as Walter said above, we 
would still need to have all the cell/block checksum results in hand so that we 
can sum them in the desired order to generate the same final result, even if 
the code used is linear. You mentioned we could consider using a DataNode to 
compute the final result, but that looks like a bit of overkill for this 
functionality, or am I missing something here? Is there any existing facility 
that supports doing it this way, where the client instructs the other 5 
DataNodes to send some data to the lead DataNode, and the lead aggregates the 
results and responds to the client's request? 
Thanks.
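
To illustrate the ordering concern, here is a minimal sketch in Python. It is 
not the HDFS implementation: it assumes the final checksum is an MD5 taken over 
the per-block digests, and the helper names (block_digest, combine_in_order) 
and the sample data are hypothetical. It only shows that partial results which 
arrive out of order must all be collected and combined by index to reproduce 
the same final value.

```python
import hashlib


def block_digest(data: bytes) -> bytes:
    """Hypothetical per-block (or per-cell) checksum: an MD5 digest of the data."""
    return hashlib.md5(data).digest()


def combine_in_order(partials: dict) -> str:
    """Combine partial digests keyed by block index, always in index order.

    Assumption: the final checksum is MD5 over the concatenated partial digests.
    """
    md5 = hashlib.md5()
    for index in sorted(partials):
        md5.update(partials[index])
    return md5.hexdigest()


# Sample data standing in for three blocks of one file.
blocks = [b"block-0", b"block-1", b"block-2"]

# Replies from the DataNodes may arrive in any order.
arrival_order = [2, 0, 1]
received = {i: block_digest(blocks[i]) for i in arrival_order}

# Reference value: digests combined in logical block order.
expected = hashlib.md5(b"".join(block_digest(b) for b in blocks)).hexdigest()

# Combining by index reproduces the reference value.
in_order = combine_in_order(received)

# Combining in arrival order does not.
naive = hashlib.md5(
    b"".join(block_digest(blocks[i]) for i in arrival_order)
).hexdigest()

print(in_order == expected)
print(naive == expected)
```

Whether the combining step runs on the client or on a lead DataNode, the same 
constraint applies: every partial result has to be available before the final 
digest can be produced.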


> Erasure coding: update DFSClient.getFileChecksum() logic for stripe files
> -------------------------------------------------------------------------
>
>                 Key: HDFS-8430
>                 URL: https://issues.apache.org/jira/browse/HDFS-8430
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>    Affects Versions: HDFS-7285
>            Reporter: Walter Su
>            Assignee: Kai Zheng
>         Attachments: HDFS-8430-poc1.patch
>
>
> HADOOP-3981 introduces a distributed file checksum algorithm. It's designed 
> for replicated blocks.
> {{DFSClient.getFileChecksum()}} needs some updates so that it can work for 
> striped block groups.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
