Hi all,

Since I got no replies to my previous message (see below), I went ahead and set tcp_tw_recycle to 1 (true). This worked like a charm: the number of sockets in TIME_WAIT went down from many thousands to a few tens. Apparently, once enabled, the recycling happens quite eagerly. Most importantly, the regionservers no longer shut down (which was the goal). I am sharing the info here in case it helps someone sometime.

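
For anyone who wants to watch the same numbers on their own machines, here is a minimal sketch (not the procedure used above, just one way to check). It counts sockets per TCP state by reading /proc/net/tcp and /proc/net/tcp6, and reports the current value of /proc/sys/net/ipv4/tcp_tw_recycle. The function names are made up for illustration, and it assumes a Linux host; the tcp_tw_recycle file may not exist on every kernel, in which case the sketch reports None.

```python
from collections import Counter
from pathlib import Path

# State codes as they appear in the "st" column of /proc/net/tcp.
TCP_STATES = {
    "01": "ESTABLISHED", "02": "SYN_SENT", "03": "SYN_RECV",
    "04": "FIN_WAIT1", "05": "FIN_WAIT2", "06": "TIME_WAIT",
    "07": "CLOSE", "08": "CLOSE_WAIT", "09": "LAST_ACK",
    "0A": "LISTEN", "0B": "CLOSING",
}


def count_tcp_states(lines):
    """Count sockets per state from lines in /proc/net/tcp format."""
    counts = Counter()
    for line in lines:
        fields = line.split()
        # Skip the header row ("sl local_address ...") and short lines.
        if len(fields) < 4 or fields[0] == "sl":
            continue
        code = fields[3].upper()
        counts[TCP_STATES.get(code, code)] += 1
    return counts


def read_socket_tables(paths=("/proc/net/tcp", "/proc/net/tcp6")):
    """Return all lines from the socket tables that exist on this host."""
    lines = []
    for name in paths:
        path = Path(name)
        if path.exists():
            lines.extend(path.read_text().splitlines())
    return lines


def read_tw_recycle(path="/proc/sys/net/ipv4/tcp_tw_recycle"):
    """Return the setting as a string, or None if the file is absent."""
    p = Path(path)
    return p.read_text().strip() if p.exists() else None


if __name__ == "__main__":
    counts = count_tcp_states(read_socket_tables())
    print("TIME_WAIT sockets:", counts["TIME_WAIT"])
    print("tcp_tw_recycle:", read_tw_recycle())
```

Running it before and after changing the setting should show whether the TIME_WAIT count actually drops on your systems.
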
Cheers,
Friso

On Jun 11, 2010, at 11:55 AM, Friso van Vollenhoven wrote:

> Hi all,
>
> We are experiencing a lot of "java.net.BindException: Cannot assign requested
> address", which is a case of
> https://issues.apache.org/jira/browse/hbase-2492. At some point, all grinds
> to a halt and regionservers start to shut down.
>
> I was wondering if anyone has found a way around this problem (other than
> adding more machines to spread the load or reduce the work load). Has anyone
> been able to successfully apply the patch in
> https://issues.apache.org/jira/browse/HDFS-941 to 0.20.2? Or does anyone have
> experience with setting the /proc/sys/net/ipv4/tcp_tw_recycle to 1 (true) at
> the OS level?
>
> We are running HBase 0.20.4-2524, r941433 and Hadoop 0.20.2.
>
> Any experiences that anyone can share are greatly appreciated.
>
>
> Best regards,
> Friso
