I think decommissioning the node and replacing the disk is a cleaner
approach. That's what I'd recommend doing as well.
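
For reference, the decommission flow on a 0.20-era cluster looks roughly
like this (untested sketch; the excludes path and hostname are examples,
so adjust for your setup):

  # add the host to the file named by dfs.hosts.exclude in your conf
  echo "dn12.example.com" >> /etc/hadoop/conf/excludes

  # tell the namenode to re-read its include/exclude lists; it starts
  # re-replicating the node's blocks elsewhere
  hadoop dfsadmin -refreshNodes

  # watch until the node shows "Decommission Status : Decommissioned"
  hadoop dfsadmin -report

  # after the disk swap: take the host back out of the excludes file,
  # run refreshNodes again, and restart the datanode on that machine
  hadoop dfsadmin -refreshNodes
  bin/hadoop-daemon.sh start datanode

Once the datanode re-registers you can run bin/start-balancer.sh to move
some blocks back onto it.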
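
If you'd rather try the block-move trick Alex describes below, the
mechanics are roughly this (again untested; /data1 and /data2 stand in
for two of the directories in your dfs.data.dir):

  # stop the datanode first so nothing writes while blocks move
  bin/hadoop-daemon.sh stop datanode

  # move block files and their .meta companions off the failing disk;
  # on busy nodes blocks also sit under current/subdir*, move those too
  mv /data1/dfs/current/blk_* /data2/dfs/current/

  bin/hadoop-daemon.sh start datanode

On startup the datanode rescans everything in dfs.data.dir, so it doesn't
care which disk a block is on as long as the block and its .meta file
moved together. Just make sure the target disk has the space.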

On 9/10/09, Alex Loddengaard <[email protected]> wrote:
> Hi David,
> Unfortunately there's no automatic way to do what you're hoping for.  You
> can move the block files (including their .meta files)
> from one disk to another.  Do this when the datanode daemon is stopped.
>  Then, when you start the datanode daemon, it will scan dfs.data.dir and be
> totally happy if blocks have moved hard drives.  I've never tried to do this
> myself, but others on the list have suggested this technique for "balancing
> disks."
>
> You could also change your process around a little.  It's not too crazy to
> decommission an entire node, replace one of its disks, then bring it back
> into the cluster.  Seems to me that this is a much saner approach: your ops
> team will tell you which disk needs replacing.  You decommission the node,
> they replace the disk, you add the node back to the pool.  Your call I
> guess, though.
>
> Hope this was helpful.
>
> Alex
>
> On Thu, Sep 10, 2009 at 6:30 PM, David B. Ritch
> <[email protected]>wrote:
>
>> What do you do with the data on a failing disk when you replace it?
>>
>> Our support person comes in occasionally, and often replaces several
>> disks when he does.  These are disks that have not yet failed, but
>> firmware indicates that failure is imminent.  We need to be able to
>> migrate our data off these disks before replacing them.  If we were
>> replacing entire servers, we would decommission them - but we have 3
>> data disks per server.  If we were replacing one disk at a time, we
>> wouldn't worry about it (because of redundancy).  We can decommission
>> the servers, but moving all the data off of all their disks is a waste.
>>
>> What's the best way to handle this?
>>
>> Thanks!
>>
>> David
>>
>


-- 
Amandeep Khurana
Computer Science Graduate Student
University of California, Santa Cruz
