[jira] [Updated] (HDFS-7369) Erasure coding: distribute recovery work for striped blocks to DataNode

Zhe Zhang (JIRA) Wed, 18 Mar 2015 14:56:20 -0700

     [ https://issues.apache.org/jira/browse/HDFS-7369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Zhe Zhang updated HDFS-7369:
----------------------------
    Attachment: HDFS-7369-004.patch

Thanks Jing very much for the insightful review! The new patch addresses most 
of the comments.

bq. DN needs to know these information for recovery: 1) block ID and block pool 
ID, 2) new generation stamp, 3) DNs with healthy blocks and their corresponding 
index, 4) target DNs and storages, and 5) EC schema information. Looks like we 
currently do not have 2) and 5).
This is a great summary. I think we need to think more about the generation 
stamp (GS). Since the current plan is to use the same GS for the entire block 
group, I guess we also need to propagate the new GS to the healthy blocks in 
the group? ([~szetszwo] documented some ideas we discussed on page 12 of the 
[design doc | 
https://issues.apache.org/jira/secure/attachment/12697210/HDFSErasureCodingDesign-20150206.pdf].)
Should we handle that process, as well as the new GS logic, in a separate JIRA?

Right now we have an {{ECSchema}} class from HADOOP-11643; I guess we can add 
an {{ECSchema}} instance to the DN command after we finalize the discussion on 
how to store the EC schema and policy.
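
To make the five items from the review concrete, here is a minimal sketch of a value object that carries all of them. It is not the {{BlockECRecoveryCommand}} in the patch: the class name, every field name, the use of plain strings for DataNodes and storages, and the {{hasEnoughSources}} check are assumptions for illustration only. The check further assumes a Reed-Solomon style schema in which any {{numDataUnits}} healthy blocks are enough to reconstruct the rest.

```java
import java.util.Arrays;

/**
 * Hypothetical sketch, not the class from the patch. It only lists the five
 * pieces of information the review says a DataNode needs for recovery.
 */
final class EcRecoveryTaskSketch {
    final String blockPoolId;          // 1) block pool ID
    final long blockGroupId;           // 1) block ID (of the striped group)
    final long newGenerationStamp;     // 2) new generation stamp
    final String[] sourceNodes;        // 3) DNs holding healthy blocks
    final short[] sourceBlockIndices;  // 3) index in the group of each healthy block
    final String[] targetNodes;        // 4) target DNs
    final String[] targetStorageIds;   // 4) target storages, one per target DN
    final int numDataUnits;            // 5) EC schema: data blocks per group
    final int numParityUnits;          // 5) EC schema: parity blocks per group

    EcRecoveryTaskSketch(String blockPoolId, long blockGroupId,
                         long newGenerationStamp,
                         String[] sourceNodes, short[] sourceBlockIndices,
                         String[] targetNodes, String[] targetStorageIds,
                         int numDataUnits, int numParityUnits) {
        if (sourceNodes.length != sourceBlockIndices.length) {
            throw new IllegalArgumentException(
                "each source node needs exactly one block index");
        }
        if (targetNodes.length != targetStorageIds.length) {
            throw new IllegalArgumentException(
                "each target node needs exactly one storage ID");
        }
        this.blockPoolId = blockPoolId;
        this.blockGroupId = blockGroupId;
        this.newGenerationStamp = newGenerationStamp;
        // Defensive copies so the task cannot change after construction.
        this.sourceNodes = sourceNodes.clone();
        this.sourceBlockIndices = sourceBlockIndices.clone();
        this.targetNodes = targetNodes.clone();
        this.targetStorageIds = targetStorageIds.clone();
        this.numDataUnits = numDataUnits;
        this.numParityUnits = numParityUnits;
    }

    /** Assumes any numDataUnits healthy blocks suffice for reconstruction. */
    boolean hasEnoughSources() {
        return sourceNodes.length >= numDataUnits;
    }

    @Override
    public String toString() {
        return "EcRecoveryTaskSketch{bp=" + blockPoolId
            + ", group=" + blockGroupId
            + ", newGS=" + newGenerationStamp
            + ", sources=" + Arrays.toString(sourceNodes)
            + ", indices=" + Arrays.toString(sourceBlockIndices)
            + ", targets=" + Arrays.toString(targetNodes)
            + ", schema=" + numDataUnits + "+" + numParityUnits + "}";
    }
}
```

Items 2) and 5) appear here only as placeholders, since how the new GS is propagated and how the schema is stored are both still open.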

bq. In BlockECRecoveryCommand, can we use DatanodeStorageInfo directly instead 
of separating it into DatanodeInfo, StorageType, and storage ID.
Good idea. I referred to {{BlockCommand}} code when developing this class. Do 
you know why {{BlockCommand}} uses separate types instead of 
{{DatanodeStorageInfo}}, or even simpler, directly using {{BlockTargetPair}}?

bq. srcNodes and targets cannot be resolved
My IDE actually resolves them in the javadoc. What could be the issue? If 
necessary I can remove the {{@link}} tags.

Note that the new {{missingBlockIndices}} logic in {{chooseSourceDatanodes}} is 
not tested yet. Right now it's not easy to emulate a lost striped block. 
{{TestRecoverStripedBlocks}} is using a workaround (writing contiguous blocks 
first and then setting the EC policy).
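
As a rough illustration of what computing missing block indices amounts to, the sketch below derives them from the indices of the blocks that are still live. It is not the code in {{chooseSourceDatanodes}}: the class name, the method signature, and the {{groupSize}} parameter (data plus parity blocks in one group) are assumptions.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

/** Hypothetical helper, not taken from the patch. */
final class MissingIndicesSketch {

    /**
     * Returns, in ascending order, every index in [0, groupSize) that does
     * not appear in liveIndices.
     */
    static List<Short> missingBlockIndices(short[] liveIndices, int groupSize) {
        BitSet seen = new BitSet(groupSize);
        for (short idx : liveIndices) {
            if (idx < 0 || idx >= groupSize) {
                throw new IllegalArgumentException(
                    "block index out of range: " + idx);
            }
            seen.set(idx);
        }
        List<Short> missing = new ArrayList<>();
        // Walk the clear bits; each one is an index with no live block.
        for (int i = seen.nextClearBit(0); i < groupSize;
             i = seen.nextClearBit(i + 1)) {
            missing.add((short) i);
        }
        return missing;
    }
}
```

Because a helper of this shape takes only indices as input, it could be unit-tested without emulating a lost striped block on a running cluster.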

> Erasure coding: distribute recovery work for striped blocks to DataNode
> -----------------------------------------------------------------------
>
>                 Key: HDFS-7369
>                 URL: https://issues.apache.org/jira/browse/HDFS-7369
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Zhe Zhang
>            Assignee: Zhe Zhang
>         Attachments: HDFS-7369-000-part1.patch, HDFS-7369-000-part2.patch, 
> HDFS-7369-001.patch, HDFS-7369-002.patch, HDFS-7369-003.patch, 
> HDFS-7369-004.patch
>
>
> This JIRA updates NameNode to handle background / offline recovery of erasure 
> coded blocks. It includes 2 parts:
> # Extend {{UnderReplicatedBlocks}} to recognize EC blocks and insert them to 
> appropriate priority levels. 
> # Update {{ReplicationMonitor}} to distinguish block codec tasks and send a 
> new DataNode command.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
