You are correct. The default of 1 MB/sec is too low, and 1 GB/sec is too high. I changed it to 10 MB/sec and it's humming along. Thanks.
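For anyone checking the numbers in this thread, here is a rough sketch of the arithmetic. It assumes that "dfs.balance.bandwidthPerSec" is given in bytes per second, and it treats the 2-minute gap between the "Starting Block mover" and "Timeout" log lines as the timeout, which is only inferred from the timestamps. The function names are made up for illustration.

```python
MIB = 1024 * 1024  # bytes in one MB (binary), as used for block sizes

def bandwidth_property_value(mb_per_sec):
    """Bytes-per-second value to put in dfs.balance.bandwidthPerSec (assumed unit)."""
    return int(mb_per_sec * MIB)

def block_move_seconds(block_mb, mb_per_sec):
    """Best-case time to move one block if the throttle is the only limit."""
    return block_mb / mb_per_sec

# Theoretical ceiling of a gigabit link: 10^9 bits/sec, 8 bits per byte.
GIGE_MB_PER_SEC = 1000 ** 3 / 8 / MIB

# Gap between "Starting Block mover" and "Timeout" in the logs (inferred, not documented here).
OBSERVED_TIMEOUT_SEC = 120

if __name__ == "__main__":
    print("gigE ceiling: about %.0f MB/sec" % GIGE_MB_PER_SEC)
    for rate in (1, 10, 30):
        secs = block_move_seconds(128, rate)
        verdict = "over" if secs > OBSERVED_TIMEOUT_SEC else "under"
        print("%2d MB/sec -> value %d, 128 MB block in %.1f s (%s %d s)"
              % (rate, bandwidth_property_value(rate), secs, verdict, OBSERVED_TIMEOUT_SEC))
```

Under these assumptions a 128 MB block cannot finish inside two minutes at 1 MB/sec, while a 1 GB/sec throttle is far above what a gigabit link can carry, so it effectively puts no limit on balancing traffic.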
Taeho Kang wrote:
> By setting "dfs.balance.bandwidthPerSec" to 1GB/sec, each datanode is able
> to utilize up to 1GB/sec for block balancing. It seems to be too high as
> even a gigabit ethernet can't handle that much data per sec.
>
> When you get timeouts, it probably means your network is saturated. Maybe
> you were running a big map reduce job which required lots of data transfer
> among nodes by then?
>
> Try setting it to be 10~30MB/sec and see what happens.
>
> On Sat, Jul 19, 2008 at 1:56 AM, David J. O'Dell <[EMAIL PROTECTED]>
> wrote:
>
>> I'm trying to re balance my cluster as I've added to more nodes.
>> When I run balancer with the default threshold I am seeing timeouts in
>> the logs:
>>
>> 2008-07-18 09:50:46,636 INFO org.apache.hadoop.dfs.Balancer: Decided to
>> move block -8432927406854991437 with a length of 128 MB bytes from
>> 10.11.6.234:50010 to 10.11.6.235:50010 using proxy source
>> 10.11.6.234:50010
>> 2008-07-18 09:50:46,636 INFO org.apache.hadoop.dfs.Balancer: Starting
>> Block mover for -8432927406854991437 from 10.11.6.234:50010 to
>> 10.11.6.235:50010
>> 2008-07-18 09:52:46,826 WARN org.apache.hadoop.dfs.Balancer: Timeout
>> moving block -8432927406854991437 from 10.11.6.234:50010 to
>> 10.11.6.235:50010 through 10.11.6.234:50010
>>
>> I read in the balancer guide->
>> http://issues.apache.org/jira/secure/attachment/12370966/BalancerUserGuide2
>> That the default transfer rate is 1mb/sec
>> I tried increasing this to 1gb/sec but I'm still seeing the timeouts.
>> All of the nodes have gigE nics and are on the same switch.
>>
>>
>> --
>> David O'Dell
>> Director, Operations
>> e: [EMAIL PROTECTED]
>> t: (415) 738-5152
>> 180 Townsend St., Third Floor
>> San Francisco, CA 94107
>>
>>

--
David O'Dell
Director, Operations
e: [EMAIL PROTECTED]
t: (415) 738-5152
180 Townsend St., Third Floor
San Francisco, CA 94107
