Thanks Alex.

From: Alex Loddengaard [mailto:a...@cloudera.com]
Sent: Thursday, July 08, 2010 11:39 AM
To: hdfs-user@hadoop.apache.org
Subject: Re: rebalancing replication help

Hi Arun,

Consider setting dfs.balance.bandwidthPerSec to something as high as 20971520 
(20 MB/s) for both the balancer and the setrep.  You can do this by supplying 
-D on the command line.
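
For example (a rough sketch; exact generic-option handling varies by Hadoop 
version, and the /user/data path below is just a placeholder):

  # cap balancer traffic at ~20 MB/s per datanode
  hadoop balancer -D dfs.balance.bandwidthPerSec=20971520

  # bump replication on existing files, with the same override
  hadoop fs -D dfs.balance.bandwidthPerSec=20971520 -setrep -R 3 /user/data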

Your strategy for getting data onto the 5 new nodes is correct: balance and 
setrep.  Just understand that these things take time.

Hope this helps.

Alex
On Wed, Jul 7, 2010 at 4:09 PM, Arun Ramakrishnan 
<aramakrish...@languageweaver.com> wrote:
Hi guys.
  I have more than one specific question, so I am going to lay out the steps I 
have taken. Please comment on what I could do better.

  I was trying to add 5 nodes to my existing 10-node cluster and also increase 
the replication factor from 2 to 3.
I thought I didn't have to run the balancer, because HDFS would most likely 
place the new replicas on the new nodes.
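
Concretely, the replication bump was along these lines (a sketch; the path is 
illustrative):

  # dfs.replication in hdfs-site.xml only affects files created after the
  # change, so existing files were bumped explicitly:
  hadoop fs -setrep -R 3 /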

There are about 500k blocks.
I wanted to get everything stabilized (replication and balancing) within 24 
hours. It's been more than 24 hours now and fsck reports 30% of blocks 
under-replicated. Is there a way to force HDFS to balance/replicate more 
aggressively?
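
For what it's worth, I am tracking progress with something like this (fsck's 
output format may differ across versions):

  # fsck prints an "Under-replicated blocks" summary line
  hadoop fsck / | grep -i 'under-replicated'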

It would be great if someone explained what happens to blocks, and when, in 
the context of:

1) Rebalancing
2) -setrep
3) Restarting the cluster with a higher/lower replication factor.

A few questions and a few issues here.

1) When you restart the cluster with a higher replication value than before, 
does it also apply to existing blocks, or only to new blocks being created?

2) Does the balancer take into account under-replication of blocks, or does it 
blindly start moving existing blocks to reach the threshold?
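
(By "threshold" I mean the balancer's utilization threshold, e.g. a sketch:)

  # stop once every datanode is within 5% of the cluster's mean utilization
  hadoop balancer -threshold 5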


A very specific problem: I am seeing a strange issue where -setrep hangs on 
one particular block for hours. Is this because the block is corrupt? But fsck 
said it's healthy.
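
One way I have been inspecting the block in question (the path here is 
illustrative):

  # list block IDs, datanode locations, and racks for the stuck file
  hadoop fsck /path/to/stuck/file -files -blocks -locations -racks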


Thanks
Arun
