[ 
https://issues.apache.org/jira/browse/HDFS-1362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13007288#comment-13007288
 ] 

Bharath Mundlapudi commented on HDFS-1362:
------------------------------------------

I have made some changes in the Hadoop 0.20 branch to make the datanode more 
reliable with respect to disk failures. Please refer to the umbrella Jira 
HADOOP-7123, and specifically HADOOP-7125 for the datanode. This particular 
patch supplements HADOOP-7125. I will be porting these changes to trunk soon. 

I have a couple of comments regarding this patch:
1. When we add a new volume, should we adjust the validVolsRequired 
member in FSDataSet accordingly? 
2. Typo: should revoverTransitionAdditionalRead be 
recoverTransitionAdditionalRead?
3. Is there a way to factor out the common code from recoverTransitionRead and 
recoverTransitionAdditionalRead? Most of the code seems to be shared between 
these two methods.
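
To make comment 1 concrete, here is a minimal sketch of the bookkeeping in 
question. It assumes validVolsRequired is derived as the number of configured 
volumes minus the number of tolerated volume failures; that formula, the class 
name, and every method name below are illustrative assumptions, not code taken 
from the patch or from FSDataSet.

```java
// Hypothetical sketch only: models how a "required valid volumes" count
// could be kept in sync when volumes are added or removed at runtime.
class VolumeRequirementSketch {
    private final int volFailuresTolerated; // assumed to come from configuration
    private int volsConfigured;             // volumes currently configured
    private int validVolsRequired;          // minimum healthy volumes needed

    VolumeRequirementSketch(int volsConfigured, int volFailuresTolerated) {
        if (volFailuresTolerated < 0 || volFailuresTolerated >= volsConfigured) {
            throw new IllegalArgumentException("invalid tolerated failure count");
        }
        this.volsConfigured = volsConfigured;
        this.volFailuresTolerated = volFailuresTolerated;
        recompute();
    }

    // Assumed relationship: required = configured - tolerated failures.
    private void recompute() {
        validVolsRequired = volsConfigured - volFailuresTolerated;
    }

    // Adding a volume online changes the configured count, so the
    // requirement is recomputed rather than left at its startup value.
    void addVolume() {
        volsConfigured++;
        recompute();
    }

    // Removing a volume must leave at least one required volume.
    void removeVolume() {
        if (volsConfigured - 1 <= volFailuresTolerated) {
            throw new IllegalStateException("too few volumes would remain");
        }
        volsConfigured--;
        recompute();
    }

    boolean hasEnoughValidVolumes(int validVols) {
        return validVols >= validVolsRequired;
    }

    int getValidVolsRequired() {
        return validVolsRequired;
    }
}
```

Under this assumption, a node started with 4 volumes and 1 tolerated failure 
requires 3 valid volumes; after one volume is added online it requires 4. If 
the requirement were left at the startup value instead, the node would tolerate 
more failures than configured.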


> Provide volume management functionality for DataNode
> ----------------------------------------------------
>
>                 Key: HDFS-1362
>                 URL: https://issues.apache.org/jira/browse/HDFS-1362
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: data-node
>    Affects Versions: 0.23.0
>            Reporter: Wang Xu
>            Assignee: Wang Xu
>             Fix For: 0.23.0
>
>         Attachments: DataNode Volume Refreshment in HDFS-1362.pdf, 
> HDFS-1362.4_w7001.txt, HDFS-1362.5.patch, HDFS-1362.6.patch, 
> HDFS-1362.7.patch, HDFS-1362.txt, Provide_volume_management_for_DN_v1.pdf
>
>
> The current management unit in Hadoop is a node, i.e. if a node fails, it 
> will be kicked out and all the data on the node will be replicated.
> As almost all SATA controllers support hotplug, we add a new command line 
> interface to the datanode so that it can list, add or remove a volume online, 
> which means we can change a disk without decommissioning the node. Moreover, 
> if the failed disk is still readable and the node has enough space, it can 
> migrate the data on that disk to other disks in the same node.
> A more detailed design document will be attached.
> The original version in our lab is implemented directly against the 0.20 
> datanode. Would it be better to implement it in contrib? Or are there any 
> other suggestions?

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
