This problem has come up a few times. There are leaked connections in the TIF (TableInputFormat).

See:
https://issues.apache.org/jira/browse/HBASE-3792
https://issues.apache.org/jira/browse/HBASE-3777

A quick and (very) dirty solution is to call deleteAllConnections(bool) at the end of your MapReduce jobs, or periodically. If you have no other tables, pools, etc. open, then no problem. If you do, they'll start throwing IOExceptions, but you can re-instantiate them with a new config and then continue as usual. (You do have to change the config, or it'll simply grab the closed, cached connection from the HCM, i.e. HConnectionManager.)

Another way: the leak comes from inside TableInputFormat.setConf, where the Configuration gets cloned (so its hash in the HCM is lost):

    setHTable(new HTable(new Configuration(conf), tableName));

This is done to prevent changes to a config from affecting the job and vice versa. If you're 100% sure the config won't be modified, you could subclass TIF to not make this copy.

For me, I didn't have extra tables hanging around, so I just blast them with deleteAllConnections. :)

- Ruben

________________________________
From: Jeff Whiting <[email protected]>
To: [email protected]
Sent: Thu, July 28, 2011 12:10:16 PM
Subject: Re: HBase & MapReduce & Zookeeper

A maximum of 10 connections is too low. It has been recommended on the list to go up to as many as 2000 connections. This doesn't fix your problem but is something you should probably have in your configuration.

~Jeff

On 7/28/2011 10:00 AM, Stack wrote:
> Try getting the ZooKeeperWatcher from the connection on your way out
> and explicitly shutdown the zk connection (see TestZooKeeper unit test
> for example).
> St.Ack
>
> On Thu, Jul 28, 2011 at 6:01 AM, Andre Reiter<[email protected]> wrote:
>> this issue is still not resolved...
>>
>> unfortunately calling HConnectionManager.deleteConnection(conf, true);
>> after the MR job is finished, does not close the connection to the zookeeper
>> we have 3 zookeeper nodes
>> by default there is a limit of 10 connections allowed from a single client
>>
>> so after running 30 MR jobs scheduled by our application, we have 30
>> unclosed connections, trying to start a new MR job results in a failure, the
>> connection to the zookeeper ensemble is dropped...
>>
>> the workaround to restart the whole application after 30 MR jobs is not
>> very elegant... :-(
>>
>>

--
Jeff Whiting
Qualtrics Senior Software Engineer
[email protected]
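
Illustration of the leak described in Ruben's reply at the top of this thread: the toy model below is not HBase code. ToyConfig, ToyConnectionCache and LeakDemo are hypothetical stand-ins, and the sketch assumes the connection cache is keyed on the Configuration instance itself, as the remark about the hash being lost implies. It only shows why copying the config for every job yields one cached connection per job, while reusing the same instance yields one in total.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a Hadoop Configuration. It deliberately does not
// override equals()/hashCode(), so map lookups fall back to object identity.
final class ToyConfig {
    private final Map<String, String> props = new HashMap<>();

    ToyConfig() {}

    // Copy constructor, playing the role of new Configuration(conf).
    ToyConfig(ToyConfig other) {
        props.putAll(other.props);
    }

    void set(String key, String value) {
        props.put(key, value);
    }
}

// Hypothetical stand-in for the connection cache kept by the HCM.
final class ToyConnectionCache {
    private final Map<ToyConfig, Object> connections = new HashMap<>();

    // Returns the cached connection for this config, or opens a new one.
    Object getConnection(ToyConfig conf) {
        return connections.computeIfAbsent(conf, c -> new Object());
    }

    int openConnections() {
        return connections.size();
    }
}

final class LeakDemo {
    // Simulates `jobs` MapReduce jobs that each ask the cache for a connection.
    static int connectionsAfterJobs(int jobs, boolean cloneConfig) {
        ToyConnectionCache cache = new ToyConnectionCache();
        ToyConfig shared = new ToyConfig();
        shared.set("hbase.zookeeper.quorum", "zk1,zk2,zk3");
        for (int i = 0; i < jobs; i++) {
            // Cloning creates a new cache key every time, so nothing is reused.
            ToyConfig conf = cloneConfig ? new ToyConfig(shared) : shared;
            cache.getConnection(conf);
        }
        return cache.openConnections();
    }

    public static void main(String[] args) {
        System.out.println("cloned config: " + connectionsAfterJobs(30, true));  // prints 30
        System.out.println("shared config: " + connectionsAfterJobs(30, false)); // prints 1
    }
}
```

With 30 jobs, as in Andre's report, the cloned case ends up with 30 entries in the toy cache, which matches the 30 unclosed ZooKeeper connections he describes. Whether skipping the copy is safe in a real job depends on the config never being modified, as noted above.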
