Re: Under replicated block doesn't get fixed until DFS restart

Ted Dunning Fri, 04 Jan 2008 13:24:20 -0800

It can take a long time for the namenode to decide that a node is down.  If
that down node has the last copy of a file, there is nothing left to copy
from, so it won't get replicated.

I run a balancing script every few hours.  It walks through the files and
temporarily raises the replication of each one.  This is important because
the initial allocation of blocks isn't done as well as the allocation that
happens when replication is increased.  It also makes the system respond
sooner to files with a low replication count: if a datanode is down, the
remaining nodes will respond to the increased replication count, and the
down node won't respond to requests to delete the block.  The result is a
desirable improvement in replication for those nearly orphaned blocks.
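
As a rough illustration of that kind of script (not the actual one), here
is a minimal Python sketch.  It assumes the shell's
"hadoop fs -setrep [-R] [-w] <rep> <path>" command as documented for
current Hadoop releases; check the flags against your version.  The
function names, the default replication of 3 and the boost of 1 are
made up for the example.

```python
import subprocess


def setrep_command(path, replication, wait=False, hadoop_bin="hadoop"):
    """Build the argv for `hadoop fs -setrep -R [-w] <rep> <path>`."""
    cmd = [hadoop_bin, "fs", "-setrep", "-R"]
    if wait:
        # -w asks the shell to block until the target replication is reached.
        cmd.append("-w")
    cmd += [str(replication), path]
    return cmd


def bump_plan(path, normal=3, boost=1):
    """Return two commands: raise replication (waiting), then restore it.

    `normal` and `boost` are illustrative defaults, not values from the thread.
    """
    if boost < 1:
        raise ValueError("boost must be at least 1")
    return [
        setrep_command(path, normal + boost, wait=True),
        setrep_command(path, normal),
    ]


def run_plan(plan, runner=subprocess.check_call):
    """Execute each command in order; `runner` can be swapped out for a dry run."""
    for cmd in plan:
        runner(cmd)
```

A dry run is run_plan(bump_plan("/data"), runner=print), which only prints
the two commands.  Dropping the runner argument executes them, and a cron
entry every few hours would give the schedule described above.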

On 1/4/08 1:02 PM, "Raghu Angadi" <[EMAIL PROTECTED]> wrote:

> This is of course not expected. More detailed info or a log message
> would help. Do you know if there is at least one good block? Sometimes,
> the remaining "good" block might actually be corrupted and thus cannot
> replicate itself. Restarting might just have brought up the datanodes
> that were down (for whatever reason) before the restart.
> 
> Raghu.
> 
> Chris Kline wrote:
>> fsck reports several under replicated blocks, but these do not get fixed
>> until I restart DFS.  fsck also reports a missing block at the same
>> time, but this should not affect the function of fixing under replicated
>> blocks.  Has anyone seen this before?
>> 
>> I'm running 0.15.0.
>> 
>> -Chris Kline
>
