We are using hadoop-0.15.
Let me explain the scenario:
We have around 6 TB of data in our cluster, spread over a couple of data directories (/mnt, /mnt2), with a replication factor of 1. When we increased the replication factor to 2 for the entire data set, we observed that /mnt reached 100% usage while /mnt2 remained under-utilized. To balance space usage across the two data directories, we changed the Hadoop code for the getNextVolume(..) API: the new algorithm checks which data directory has the most free space and returns that as the volume for the block to be written. We then used this modified version of Hadoop to set the replication factor to 2 for the entire DFS. However, once the re-replication completed, fsck reported more than 100 GB of data blocks missing, as well as some blocks still under-replicated. We also observed many zero-size blocks in the cluster, and we do not know how they were created.
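The volume-selection change we made amounts to something like the following sketch. The class and method names here (Volume, pickVolume, available) are illustrative only, not the actual FSDataset internals of Hadoop 0.15; the point is simply that we replaced round-robin selection with a "most free space wins" rule:

```java
import java.util.List;

public class VolumePicker {

    // Simplified stand-in for a data directory such as /mnt or /mnt2.
    static class Volume {
        final String path;
        final long available; // bytes of free space

        Volume(String path, long available) {
            this.path = path;
            this.available = available;
        }
    }

    // Instead of the stock round-robin, return the volume with the
    // most free space that can still hold the incoming block.
    static Volume pickVolume(List<Volume> volumes, long blockSize) {
        Volume best = null;
        for (Volume v : volumes) {
            if (v.available < blockSize) {
                continue; // skip volumes that cannot fit the block
            }
            if (best == null || v.available > best.available) {
                best = v;
            }
        }
        if (best == null) {
            throw new IllegalStateException("no volume has enough free space");
        }
        return best;
    }

    public static void main(String[] args) {
        List<Volume> vols = List.of(
                new Volume("/mnt", 10L << 30),   // ~10 GB free
                new Volume("/mnt2", 40L << 30)); // ~40 GB free
        // With /mnt2 having more free space, it is chosen for the next block.
        System.out.println(pickVolume(vols, 64L << 20).path);
    }
}
```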


----- Original Message -----
From: "Hairong Kuang" <[email protected]>
To: "hadoop-dev" <[email protected]>; "ilayaraja" <[email protected]>; "hadoop-user" <[email protected]>
Sent: Tuesday, March 31, 2009 3:30 AM
Subject: Re: Problem: Some blocks remain under replicated


Which version of Hadoop are you running? Your cluster might have hit
HADOOP-5465.

Hairong


On 3/29/09 10:24 PM, "ilayaraja" <[email protected]> wrote:

Hello !

I am trying to increase the replication factor of a directory in our hadoop
dfs from 1 to 2.
I observe that some of the blocks (12 out of 400) always remain under-
replicated, producing the following message when I run 'fsck':

Under replicated blk_9084408236031628003. Target Replicas
is 2 but found 1 replica(s).

I thought it could be a problem with a specific data node in the cluster;
however, I observe that the under-replicated blocks belong to different data
nodes.

Please give me your thoughts.

Thanks.
Ilay
