[
https://issues.apache.org/jira/browse/HDFS-9833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15299991#comment-15299991
]
Rakesh R commented on HDFS-9833:
--------------------------------
Thanks a lot [~drankye] for reviewing the patch and for the offline discussions.
I've uploaded a new patch addressing the comments, except the 2nd point. Also, I
added a new test case to verify the checksum after node decommissioning (block
locations will be duplicated after the decommission operation).
bq. HashMap here might be little heavy, an array should work instead.
The checksum logic uses {{namenode.getBlockLocations(src, start, length)}} to
get the block locations. This list does not guarantee any order and also
contains duplicated block info (index and its source node). While computing the
block checksum, we need to skip blocks that were already considered. With a
{{HashMap}} all of these cases are handled internally (it removes duplicate
indices and maintains ascending order), which I feel keeps the logic simple.
Also, this map is used locally and contains only very few entries. With an
array, we would need extra logic to skip the duplicate nodes and may also need
sorting logic. What's your opinion on keeping the existing {{HashMap}}?
Below is a sample block indices list after the decommissioning operation; {{'}}
marks the index reported by a decommissioned node. Note that this list contains
duplicated blocks and does not maintain any order (a sketch of the map-based
deduplication follows the sample).
{code}
0, 2, 3, 4, 5, 6, 7, 8, 1, 1'
{code}
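For illustration only, a minimal, self-contained Java sketch of the map-based
deduplication (hypothetical names, not the patch code). It uses a {{TreeMap}} so
the ascending iteration over block indices is explicit; the checksum loop can
then walk the map without any extra "already seen" bookkeeping.
{code}
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for the real HDFS types: each entry pairs a striped
// block index with the datanode chosen as its source.
public class BlockIndexDedupSketch {

  public static void main(String[] args) {
    // Indices as reported after decommissioning: unordered, with block 1
    // reported twice (live replica plus the decommissioned node's copy).
    int[] blockIndices = {0, 2, 3, 4, 5, 6, 7, 8, 1, 1};
    String[] sourceNodes = {"dn0", "dn2", "dn3", "dn4", "dn5",
                            "dn6", "dn7", "dn8", "dn1", "dn1-decomm"};

    // A map keyed by block index drops the duplicate entry and (with TreeMap)
    // yields ascending iteration order for free.
    Map<Integer, String> blockIndexToNode = new TreeMap<>();
    for (int i = 0; i < blockIndices.length; i++) {
      blockIndexToNode.putIfAbsent(blockIndices[i], sourceNodes[i]);
    }

    // Prints indices 0..8 exactly once each; the first reported source wins.
    blockIndexToNode.forEach((index, node) ->
        System.out.println("block index " + index + " -> " + node));
  }
}
{code}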
> Erasure coding: recomputing block checksum on the fly by reconstructing the
> missed/corrupt block data
> -----------------------------------------------------------------------------------------------------
>
> Key: HDFS-9833
> URL: https://issues.apache.org/jira/browse/HDFS-9833
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Reporter: Kai Zheng
> Assignee: Rakesh R
> Labels: hdfs-ec-3.0-must-do
> Attachments: HDFS-9833-00-draft.patch, HDFS-9833-01.patch,
> HDFS-9833-02.patch, HDFS-9833-03.patch, HDFS-9833-04.patch, HDFS-9833-05.patch
>
>
> As discussed in HDFS-8430 and HDFS-9694, to compute the striped file checksum
> even when some of the striped blocks are missing, we need to consider
> recomputing the block checksum on the fly for the missed/corrupt blocks. To
> recompute the block checksum, the block data needs to be reconstructed by
> erasure decoding, and the main code needed for the block reconstruction could
> be borrowed from HDFS-9719, the refactoring of the existing
> {{ErasureCodingWorker}}. In the EC worker, reconstructed blocks need to be
> written out to target datanodes, but in this case the remote writing isn't
> necessary, as the reconstructed block data is only used to recompute the
> checksum.
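As a rough, self-contained sketch of the idea in the description above
(hypothetical names, not the HDFS code; XOR parity and CRC32 stand in for the
real RS decoder and block checksum): reconstruct the missing block data in
memory and feed it straight into the checksum computation, with no remote write.
{code}
import java.nio.ByteBuffer;
import java.util.zip.CRC32;

public class LocalChecksumRecomputeSketch {

  // Stand-in for the real erasure decoder (Reed-Solomon in HDFS); XOR parity
  // is used here only so the example runs self-contained.
  static ByteBuffer decodeMissing(ByteBuffer survivor, ByteBuffer parity) {
    ByteBuffer out = ByteBuffer.allocate(survivor.remaining());
    for (int i = 0; i < out.capacity(); i++) {
      out.put((byte) (survivor.get(i) ^ parity.get(i)));
    }
    out.flip();
    return out;
  }

  public static void main(String[] args) {
    ByteBuffer survivor = ByteBuffer.wrap(new byte[]{1, 2, 3, 4});
    ByteBuffer parity   = ByteBuffer.wrap(new byte[]{9, 9, 9, 9});

    // 1. Reconstruct the missing block data locally (no write to a target DN).
    ByteBuffer reconstructed = decodeMissing(survivor, parity);

    // 2. Feed the reconstructed bytes directly into the checksum computation;
    //    CRC32 stands in for the per-chunk checksums that HDFS combines into
    //    the block checksum.
    CRC32 crc = new CRC32();
    crc.update(reconstructed);
    System.out.printf("recomputed checksum: %08x%n", crc.getValue());
  }
}
{code}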