I assume you passed true as the second parameter to deleteConnection().

On Wed, Mar 23, 2011 at 1:54 PM, Dmitriy Lyubimov <[email protected]> wrote:
> Hi,
>
> I am experiencing a severe connection leak in my MR client that uses
> HBase as input/output. Every job that uses TableInputFormat leaks 1
> ZooKeeper connection per run, as evidenced by netstat.
>
> I understand that the way HTable manages connections now is that it
> creates a new HBase (and also ZooKeeper) connection for each instance of
> Configuration it is initialized with. Looking at the code of the
> TableInputFormat class, I see that it creates an HTable in the front end
> during configuration (of course, it probably needs to use it to
> determine region splits).
>
> Since I have to configure each job individually, I must create a new
> instance of Configuration. Thus, I am not able to use shared HBase
> connections (which I would prefer, but there seems to be no way to do
> that now).
>
> So... after I run an instance of an MR job, the HBase connection seems
> to be leaked. It also leaks the ZooKeeper connection, which is a problem
> since ZooKeeper instances limit how many connections can be made
> from the same IP, and eventually the client is not able to create any
> new HTables since it can't establish any new ZooKeeper
> connections.
>
> I tried to do explicit cleanup by calling
> HConnectionManager.deleteConnection(Configuration), passing in the
> configuration that I used to create the MR job. It doesn't seem to work.
>
> So... is there a way to run an MR job with TableInputFormat without
> leaking a connection? I am pretty sure I am not creating any HTables
> on the client side. Or is it a bug? I have spent several days
> investigating this issue, but I am still not able to come up with a
> workaround for ZooKeeper connection leaks in HBase MR jobs.
>
> Thank you very much.
> -Dmitriy
