[ https://issues.apache.org/jira/browse/HDFS-664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13025179#comment-13025179 ]
Steve Loughran commented on HDFS-664:
-------------------------------------
Tony, this could be a good start. Thanks!
# Has anyone who really understands HDFS looked at this?
# I'd replace the assert statements with checks that run at all times; asserts
are skipped unless the JVM is started with -ea
# In registerVolume() I'd pass the exception as its own parameter to the log
call, and nest it as the cause of the thrown exception, so that the stack trace
gets logged and retained.
# I don't see the DatanodeShell class in the patch
# Is there a GitHub repository where I could pull this branch in from? That way
I could try merging it into trunk
# I'm thinking about how to test this... I could imagine something with VMs
where we try swapping a volume
# Sanjay Rajiva has said that the current DN design can't handle disk failure
perfectly (something to do with threads), though I don't know the relevant HDFS
issue. We need to be sure that problem is handled first, then worry about
this addition.
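
To illustrate points 2 and 3, here is a minimal sketch. It is not the code from the patch: the class name, checkVolume() and the body of registerVolume() are hypothetical, and it uses java.util.logging only to stay self-contained, where the datanode would use its own Log instance.

```java
import java.io.File;
import java.io.IOException;
import java.util.logging.Level;
import java.util.logging.Logger;

public class VolumeRegistrationSketch {
    private static final Logger LOG =
        Logger.getLogger(VolumeRegistrationSketch.class.getName());

    /** Explicit checks: unlike assert, these run whether or not -ea is set. */
    static void checkVolume(File dir) throws IOException {
        if (dir == null) {
            throw new IllegalArgumentException("volume directory must not be null");
        }
        if (!dir.isDirectory()) {
            throw new IOException("Not a directory: " + dir);
        }
    }

    /** Hypothetical stand-in for the patch's registerVolume(). */
    static void registerVolume(File dir) throws IOException {
        try {
            checkVolume(dir);
            // The real registration work would go here (assumption).
        } catch (IOException e) {
            String msg = "Failed to register volume " + dir;
            // Exception passed as its own parameter, so the stack trace is logged.
            LOG.log(Level.WARNING, msg, e);
            // Original exception nested as the cause, so the trace is retained.
            throw new IOException(msg, e);
        }
    }
}
```

A caller that catches the rethrown IOException can still reach the original failure through getCause().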
> Add a way to efficiently replace a disk in a live datanode
> ----------------------------------------------------------
>
> Key: HDFS-664
> URL: https://issues.apache.org/jira/browse/HDFS-664
> Project: Hadoop HDFS
> Issue Type: New Feature
> Components: data-node
> Affects Versions: 0.22.0
> Reporter: Steve Loughran
> Attachments: HDFS-664.patch
>
>
> In clusters where the datanode disks are hot swappable, you need to be able
> to swap out a disk on a live datanode without taking down the datanode. You
> don't want to decommission the whole node as that is overkill. On a system
> with 4 1TB HDDs, giving 3 TB of datanode storage, a decommissioning and
> restart will consume up to 6 TB of bandwidth. If a single disk were swapped
> in then there would only be 1TB of data to recover over the network. More
> importantly, if that data could be moved to free space on the same machine,
> the recommissioning could take place at disk rates, not network speeds.
> # Maybe have a way of decommissioning a single disk on the DN; the files
> could be moved to space on the other disks or the other machines in the rack.
> # There may not be time to use that option, in which case the disk would be
> pulled out with no warning and a new disk inserted.
> # The DN needs to see that a disk has been replaced (or react to some ops
> request telling it this), and start using the new disk again, pushing data
> back and rebuilding the balance.
> To complicate the process, assume there is a live TT on the system, running
> jobs against the data. The TT would probably need to be paused while the work
> takes place, with any ongoing work handled somehow. Halting the TT and then
> restarting it after the replacement disk goes in is probably simplest.
> The more disks you add to a node, the more this scenario becomes a need.
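
The bandwidth figures in the description can be reproduced with a small sketch. It assumes one reading of the numbers: a full decommission and restart moves the node's usable storage off and then back again, while a single-disk swap moves at most one disk's worth. The class and method names are made up for illustration.

```java
public class SwapBandwidthSketch {
    /** Upper bound on network traffic for decommission plus restart: data out, then back. */
    static long fullDecommissionTb(long usableStorageTb) {
        return 2 * usableStorageTb;
    }

    /** Upper bound on data to recover when only one disk is replaced. */
    static long singleDiskSwapTb(long diskSizeTb) {
        return diskSizeTb;
    }

    public static void main(String[] args) {
        long usableStorageTb = 3; // figure taken from the issue description
        long diskSizeTb = 1;      // figure taken from the issue description
        System.out.println("Decommission and restart: up to "
            + fullDecommissionTb(usableStorageTb) + " TB");
        System.out.println("Single disk swap: up to "
            + singleDiskSwapTb(diskSizeTb) + " TB");
    }
}
```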