[
https://issues.apache.org/jira/browse/HDFS-1362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13007288#comment-13007288
]
Bharath Mundlapudi commented on HDFS-1362:
------------------------------------------
I have made some changes in the Hadoop 0.20 version to make the datanode more
reliable with respect to disk failures. Please refer to the umbrella Jira
HADOOP-7123, and specifically HADOOP-7125 for the datanode. This particular
patch supplements HADOOP-7125. I will be porting these changes to trunk soon.
I have a couple of comments regarding this patch:
1. When we add a new volume, should we adjust the validVolsRequired member in
FSDataSet to account for it?
2. Typo: revoverTransitionAdditionalRead instead of
recoverTransitionAdditionalRead?
3. Is there a way to factor out the common code from recoverTransitionRead and
recoverTransitionAdditionalRead? Most of the code seems to be common to these
two methods.
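To illustrate comment 1, here is a minimal sketch of the bookkeeping in question. It assumes validVolsRequired is derived as the number of configured volumes minus the number of tolerated volume failures; that derivation is an assumption and the patch may compute it differently. The class VolumeRequirement and its methods are hypothetical names used only for illustration; they are not part of FSDataSet or the patch.

```java
// Hypothetical helper for illustration only; not part of the HDFS-1362 patch.
// Assumption: validVolsRequired = volsConfigured - volFailuresTolerated.
final class VolumeRequirement {
    private final int volFailuresTolerated;
    private int volsConfigured;
    private int validVolsRequired;

    VolumeRequirement(int volsConfigured, int volFailuresTolerated) {
        if (volFailuresTolerated < 0 || volFailuresTolerated >= volsConfigured) {
            throw new IllegalArgumentException(
                "tolerated failures must be in [0, volsConfigured)");
        }
        this.volsConfigured = volsConfigured;
        this.volFailuresTolerated = volFailuresTolerated;
        recompute();
    }

    // Called when a volume is added online: the requirement grows with it.
    void volumeAdded() {
        volsConfigured++;
        recompute();
    }

    // Called when a volume is removed online: the requirement shrinks,
    // but at least one valid volume must always remain required.
    void volumeRemoved() {
        if (volsConfigured - 1 <= volFailuresTolerated) {
            throw new IllegalStateException(
                "cannot remove volume: too few volumes would remain");
        }
        volsConfigured--;
        recompute();
    }

    // True if the given number of healthy volumes satisfies the requirement.
    boolean hasEnoughVolumes(int validVols) {
        return validVols >= validVolsRequired;
    }

    int validVolsRequired() {
        return validVolsRequired;
    }

    private void recompute() {
        validVolsRequired = volsConfigured - volFailuresTolerated;
    }
}
```

If the member were left untouched after an add, a datanode that started with 4 volumes and 1 tolerated failure would keep requiring only 3 valid volumes even after growing to 5, so it would silently tolerate 2 failures instead of 1.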
> Provide volume management functionality for DataNode
> ----------------------------------------------------
>
> Key: HDFS-1362
> URL: https://issues.apache.org/jira/browse/HDFS-1362
> Project: Hadoop HDFS
> Issue Type: New Feature
> Components: data-node
> Affects Versions: 0.23.0
> Reporter: Wang Xu
> Assignee: Wang Xu
> Fix For: 0.23.0
>
> Attachments: DataNode Volume Refreshment in HDFS-1362.pdf,
> HDFS-1362.4_w7001.txt, HDFS-1362.5.patch, HDFS-1362.6.patch,
> HDFS-1362.7.patch, HDFS-1362.txt, Provide_volume_management_for_DN_v1.pdf
>
>
> The current management unit in Hadoop is a node, i.e. if a node fails, it
> will be kicked out and all the data on the node will be replicated.
> As almost all SATA controllers support hotplug, we add a new command line
> interface to the datanode, so that it can list, add or remove a volume online,
> which means we can change a disk without decommissioning the node. Moreover,
> if the failed disk is still readable and the node has enough space, it can
> migrate data on the disk to other disks in the same node.
> A more detailed design document will be attached.
> The original version in our lab is implemented against the 0.20 datanode
> directly. Would it be better to implement it in contrib? Or is there any
> other suggestion?