I think decommissioning the node and replacing the disk is the cleaner approach. That's what I'd recommend doing as well.
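If it helps, here's a rough sketch of scripting that flow (untested; it assumes dfs.hosts.exclude is already configured on the namenode, and the exclude-file path and hostname below are placeholders for your own):

```python
#!/usr/bin/env python
"""Rough sketch: decommission a datanode so its disks can be swapped.

Assumes dfs.hosts.exclude is already set in the namenode config and
points at EXCLUDE_FILE below (adjust the path for your cluster).
"""
import subprocess

EXCLUDE_FILE = "/etc/hadoop/conf/dfs.exclude"  # placeholder path


def decommission(hostname):
    # Add the node to the exclude file so the namenode starts
    # re-replicating its blocks elsewhere.
    with open(EXCLUDE_FILE, "a") as f:
        f.write(hostname + "\n")
    # Tell the namenode to re-read its include/exclude files.
    subprocess.check_call(["hadoop", "dfsadmin", "-refreshNodes"])
    # Watch "hadoop dfsadmin -report" (or the namenode web UI) until the
    # node shows as Decommissioned before letting ops pull the disk.


def recommission(hostname):
    # Once the disk is replaced and the datanode restarted, remove the
    # node from the exclude file and refresh again.
    with open(EXCLUDE_FILE) as f:
        hosts = [h for h in f.read().splitlines() if h and h != hostname]
    with open(EXCLUDE_FILE, "w") as f:
        f.write("\n".join(hosts) + ("\n" if hosts else ""))
    subprocess.check_call(["hadoop", "dfsadmin", "-refreshNodes"])


if __name__ == "__main__":
    decommission("dn03.example.com")  # hypothetical hostname
```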
On 9/10/09, Alex Loddengaard <[email protected]> wrote:
> Hi David,
>
> Unfortunately there's really no way to do what you're hoping to do in an
> automatic way. You can move the block files (including their .meta files)
> from one disk to another. Do this when the datanode daemon is stopped.
> Then, when you start the datanode daemon, it will scan dfs.data.dir and be
> totally happy if blocks have moved hard drives. I've never tried to do this
> myself, but others on the list have suggested this technique for "balancing
> disks."
>
> You could also change your process around a little. It's not too crazy to
> decommission an entire node, replace one of its disks, then bring it back
> into the cluster. Seems to me that this is a much saner approach: your ops
> team will tell you which disk needs replacing. You decommission the node,
> they replace the disk, you add the node back to the pool. Your call I
> guess, though.
>
> Hope this was helpful.
>
> Alex
>
> On Thu, Sep 10, 2009 at 6:30 PM, David B. Ritch <[email protected]> wrote:
>
>> What do you do with the data on a failing disk when you replace it?
>>
>> Our support person comes in occasionally, and often replaces several
>> disks when he does. These are disks that have not yet failed, but
>> firmware indicates that failure is imminent. We need to be able to
>> migrate our data off these disks before replacing them. If we were
>> replacing entire servers, we would decommission them - but we have 3
>> data disks per server. If we were replacing one disk at a time, we
>> wouldn't worry about it (because of redundancy). We can decommission
>> the servers, but moving all the data off of all their disks is a waste.
>>
>> What's the best way to handle this?
>>
>> Thanks!
>>
>> David

--
Amandeep Khurana
Computer Science Graduate Student
University of California, Santa Cruz
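And for the block-file move Alex describes above, a minimal sketch of what it might look like (untested; run it only with the datanode daemon stopped, and treat /data1 and /data2 as placeholders for your own dfs.data.dir mount points):

```python
#!/usr/bin/env python
"""Rough sketch: move block files (and their .meta files) from one
dfs.data.dir disk to another while the datanode is stopped.

/data1 and /data2 below are placeholders for your own mount points.
"""
import os
import shutil

SRC = "/data1/dfs/data/current"   # failing disk
DST = "/data2/dfs/data/current"   # healthy disk


def move_blocks(src, dst):
    for name in os.listdir(src):
        # Block files look like blk_<id>; metadata like blk_<id>_<genstamp>.meta.
        # Note: blocks may also live in subdir*/ directories under current;
        # this sketch only handles the top level.
        if not name.startswith("blk_"):
            continue
        shutil.move(os.path.join(src, name), os.path.join(dst, name))


if __name__ == "__main__":
    move_blocks(SRC, DST)
    # Start the datanode afterwards; it rescans dfs.data.dir and picks up
    # the blocks from their new location.
```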
