Setting "dfs.balance.bandwidthPerSec" to 1GB/sec lets each datanode use up
to 1GB/sec for block balancing. That is too high: gigabit Ethernet tops out
at roughly 125MB/sec, so the setting is effectively no limit at all.
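
For a rough sense of scale, the arithmetic can be sketched as below. It
assumes decimal units (1 Gbit/sec = 10^9 bits/sec) and ignores Ethernet and
TCP overhead, so real throughput would be somewhat lower than the figure it
prints.

```python
# Rough comparison of the configured balancer limit with a gigabit link.
# Assumption: decimal units, no protocol overhead.

BITS_PER_BYTE = 8


def link_capacity_mb_per_sec(link_gbit_per_sec: float) -> float:
    """Return the theoretical ceiling of a link in MB/sec."""
    return link_gbit_per_sec * 1000 / BITS_PER_BYTE


gige_mb = link_capacity_mb_per_sec(1.0)  # gigabit Ethernet
configured_mb = 1000.0                   # 1 GB/sec expressed in MB/sec

print(f"GigE ceiling: {gige_mb:.0f} MB/sec")
print(f"Configured limit is {configured_mb / gige_mb:.0f}x the link ceiling")
```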

When you get timeouts, it probably means your network is saturated. Were
you running a big MapReduce job that required lots of data transfer among
nodes at the time?

Try setting it to 10-30MB/sec and see what happens.
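
The property takes its value in bytes per second, so a MB/sec target has to
be converted first. Here is a minimal sketch of that conversion; the helper
name bandwidth_property is made up for illustration, and it assumes
1 MB = 1024 * 1024 bytes, which matches the 1048576 default.

```python
# Convert a MB/sec target into the bytes-per-second value the property expects.
# Assumption: 1 MB = 1024 * 1024 bytes, in line with the 1048576 default.

BYTES_PER_MB = 1024 * 1024


def bandwidth_property(mb_per_sec: int) -> str:
    """Render a <property> element for dfs.balance.bandwidthPerSec."""
    value = mb_per_sec * BYTES_PER_MB
    return (
        "<property>\n"
        "  <name>dfs.balance.bandwidthPerSec</name>\n"
        f"  <value>{value}</value>\n"
        "</property>"
    )


# Print candidate settings across the suggested range.
for rate in (10, 20, 30):
    print(bandwidth_property(rate))
```

The printed element would go in your site configuration file (probably
hadoop-site.xml on your release, but that is an assumption about your setup).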

On Sat, Jul 19, 2008 at 1:56 AM, David J. O'Dell <[EMAIL PROTECTED]>
wrote:

> I'm trying to rebalance my cluster as I've added two more nodes.
> When I run balancer with the default threshold I am seeing timeouts in
> the logs:
>
> 2008-07-18 09:50:46,636 INFO org.apache.hadoop.dfs.Balancer: Decided to
> move block -8432927406854991437 with a length of 128 MB bytes from
> 10.11.6.234:50010 to 10.11.6.235:50010 using proxy source
> 10.11.6.234:50010
> 2008-07-18 09:50:46,636 INFO org.apache.hadoop.dfs.Balancer: Starting
> Block mover for -8432927406854991437 from 10.11.6.234:50010 to
> 10.11.6.235:50010
> 2008-07-18 09:52:46,826 WARN org.apache.hadoop.dfs.Balancer: Timeout
> moving block -8432927406854991437 from 10.11.6.234:50010 to
> 10.11.6.235:50010 through 10.11.6.234:50010
>
> I read in the balancer guide
> http://issues.apache.org/jira/secure/attachment/12370966/BalancerUserGuide2
> that the default transfer rate is 1MB/sec.
> I tried increasing this to 1GB/sec but I'm still seeing the timeouts.
> All of the nodes have gigE nics and are on the same switch.
>
>
> --
> David O'Dell
> Director, Operations
> e: [EMAIL PROTECTED]
> t:  (415) 738-5152
> 180 Townsend St., Third Floor
> San Francisco, CA 94107
>
>
