What would happen if you did this without taking the node down?  For
example, if you have hot-swappable drives in the node(s)?  Will the
running datanode process notice that an entire partition has gone missing
and then reappeared empty a few minutes later?

Or would it be better to at least shut down the datanode process in this
scenario?

--Mike

Ted Dunning wrote:
> I would recommend taking the node down without decommissioning, replacing
> the disk, then bringing the node back up.  After 10-20 minutes the name node
> will figure things out and start replicating the missing blocks.
> Rebalancing would be a good idea to fill the new disk.  You could even do
> this with two nodes at a time, but I don't recommend that.
> 
> As soon as dfs shows no under replicated blocks, you can do the next disk.
> It could take some time for that to happen.
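
FWIW, here is roughly what that wait-for-replication-then-rebalance step
looks like scripted.  This is just an untested sketch: it assumes the
hadoop command is on the PATH and that "hadoop fsck /" prints the
"Under-replicated blocks:" summary line the way the 0.20-era releases do.

#!/usr/bin/env python
"""Wait for HDFS to finish re-replicating, then kick off the balancer."""
import re
import subprocess
import time

def under_replicated_blocks():
    # Parse the fsck summary line, e.g.
    #  Under-replicated blocks:       42 (0.1 %)
    out = subprocess.Popen(["hadoop", "fsck", "/"],
                           stdout=subprocess.PIPE).communicate()[0]
    match = re.search(r"Under-replicated blocks:\s+(\d+)",
                      out.decode("utf-8", "replace"))
    # -1 on a parse failure keeps us waiting instead of rebalancing blindly.
    return int(match.group(1)) if match else -1

# The namenode needs a while (Ted's 10-20 minutes) before it notices the
# missing blocks and finishes re-replicating them, so poll patiently.
while under_replicated_blocks() != 0:
    time.sleep(60)

# Nothing is under-replicated any more: spread blocks onto the new disk.
subprocess.call(["hadoop", "balancer"])

You could start that after the node comes back up and walk away until the
balancer exits, then move on to the next disk.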
> 
> On Thu, Sep 10, 2009 at 8:06 PM, David B. Ritch <[email protected]>wrote:
> 
>> Thank you both.  That's what we did today.  It seems fairly reasonable
>> when a node has a few disks, say 3-5.  However, at some sites, with
>> larger nodes, it seems more awkward.  When a node has a dozen or more
>> disks (as used in the larger terasort benchmarks), migrating the data
>> off all the disks is likely to be more of an issue.  I hope that there
>> is a better solution to this before my client moves to much larger
>> nodes!  ;-)
>>
>> dbr
>>
>> On 9/10/2009 10:07 PM, Amandeep Khurana wrote:
>>> I think decommissioning the node and replacing the disk is a cleaner
>>> approach. That's what I'd recommend doing as well.
>>>
>>> On 9/10/09, Alex Loddengaard <[email protected]> wrote:
>>>
>>>> Hi David,
>>>> Unfortunately there's really no way to do what you're hoping to do
>>>> in an automatic way.  You can move the block files (including their
>>>> .meta files) from one disk to another.  Do this when the datanode
>>>> daemon is stopped.  Then, when you start the datanode daemon, it
>>>> will scan dfs.data.dir and be totally happy if blocks have moved
>>>> hard drives.  I've never tried to do this myself, but others on the
>>>> list have suggested this technique for "balancing disks."
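
For what it's worth, the manual move Alex describes comes down to
something like the sketch below.  It is untested, /data/1 and /data/2 are
made-up example paths, and it only covers block files directly under
current/, not any subdir* directories the datanode may have created.
Stop the datanode first.

#!/usr/bin/env python
"""Move block files (and their .meta companions) to another data dir."""
import glob
import os
import shutil

# Example dfs.data.dir entries: the disk being emptied and the disk
# with room to spare.  Only run this while the datanode is stopped.
SRC = "/data/1/dfs/data/current"
DST = "/data/2/dfs/data/current"

for path in glob.glob(os.path.join(SRC, "blk_*")):
    # The glob matches both blk_<id> and blk_<id>_<genstamp>.meta,
    # so each block travels together with its checksum file.
    shutil.move(path, os.path.join(DST, os.path.basename(path)))
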
>>>>
>>>> You could also change your process around a little.  It's not too
>>>> crazy to decommission an entire node, replace one of its disks,
>>>> then bring it back into the cluster.  Seems to me that this is a
>>>> much saner approach: your ops team will tell you which disk needs
>>>> replacing.  You decommission the node, they replace the disk, you
>>>> add the node back to the pool.  Your call I guess, though.
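
A sketch of that decommission/re-add cycle, in case it helps.  Untested;
the exclude-file path and hostname are made-up examples, and it assumes
dfs.hosts.exclude already points at that file on the namenode, where
this would run.

#!/usr/bin/env python
"""Decommission a datanode, then re-add it after its disk is swapped."""
import subprocess
import time

EXCLUDE_FILE = "/etc/hadoop/conf/dfs.exclude"   # whatever dfs.hosts.exclude names
NODE = "dn12.example.com"                       # node whose disk is being swapped

def refresh_nodes():
    subprocess.call(["hadoop", "dfsadmin", "-refreshNodes"])

def decommission(node):
    # Add the node to the exclude file and tell the namenode, which
    # starts copying its blocks elsewhere.
    with open(EXCLUDE_FILE, "a") as f:
        f.write(node + "\n")
    refresh_nodes()
    # Crude completion check: wait until the report mentions a
    # decommissioned node.  A more careful script would parse this
    # node's own "Decommission Status" section.
    while True:
        report = subprocess.Popen(["hadoop", "dfsadmin", "-report"],
                                  stdout=subprocess.PIPE).communicate()[0]
        if b"Decommissioned" in report:
            return
        time.sleep(60)

def recommission(node):
    # Take the node back out of the exclude file and refresh again so
    # it rejoins the pool.
    with open(EXCLUDE_FILE) as f:
        keep = [line for line in f if line.strip() != node]
    with open(EXCLUDE_FILE, "w") as f:
        f.writelines(keep)
    refresh_nodes()

# Usage: decommission(NODE), let ops swap the disk, then recommission(NODE).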
>>>>
>>>> Hope this was helpful.
>>>>
>>>> Alex
>>>>
>>>> On Thu, Sep 10, 2009 at 6:30 PM, David B. Ritch
>>>> <[email protected]>wrote:
>>>>
>>>>
>>>>> What do you do with the data on a failing disk when you replace it?
>>>>>
>>>>> Our support person comes in occasionally, and often replaces several
>>>>> disks when he does.  These are disks that have not yet failed, but
>>>>> firmware indicates that failure is imminent.  We need to be able to
>>>>> migrate our data off these disks before replacing them.  If we were
>>>>> replacing entire servers, we would decommission them - but we have 3
>>>>> data disks per server.  If we were replacing one disk at a time, we
>>>>> wouldn't worry about it (because of redundancy).  We can decommission
>>>>> the servers, but moving all the data off of all their disks is a waste.
>>>>>
>>>>> What's the best way to handle this?
>>>>>
>>>>> Thanks!
>>>>>
>>>>> David
>>>>>
>>>>>
>>>
>>
> 
> 
