[jira] [Created] (HBASE-4283) HBaseAdmin never recovers from restarted cluster

Lars Hofhansl (JIRA) Mon, 29 Aug 2011 15:30:05 -0700

HBaseAdmin never recovers from restarted cluster
------------------------------------------------


                 Key: HBASE-4283
                 URL: https://issues.apache.org/jira/browse/HBASE-4283
             Project: HBase
          Issue Type: Bug
            Reporter: Lars Hofhansl
            Priority: Minor


While testing common scenarios that we might encounter I found that HBaseAdmin 
does not recover from a restarted cluster.

It turns out HBaseClient.Connection.stop() is sent into an endless loop here:
{code}
    // wait until all connections are closed
    while (!connections.isEmpty()) {
      try {
        Thread.sleep(100);
      } catch (InterruptedException ignored) {
      }
    }
{code}
The reason is that PoolMap.remove(k,v) does not remove empty pools, and hence 
connections.isEmpty() is never true if there ever was any connection in there.
My fix is to remove the pool from the PoolMap once it is empty. (Alternatively, 
one could change PoolMap.isEmpty() to also look inside all pools and check 
whether their size is 0.)
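
To illustrate the idea, here is a minimal sketch of a map of pools whose remove(k,v) drops a pool as soon as it becomes empty. SimplePoolMap and its methods are hypothetical stand-ins written for this example, not the actual PoolMap code, and the real class may differ in structure and thread-safety:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for PoolMap; not the actual HBase class.
class SimplePoolMap<K, V> {
  // One pool (list of values) per key.
  private final Map<K, List<V>> pools = new HashMap<>();

  public synchronized void put(K key, V value) {
    pools.computeIfAbsent(key, k -> new ArrayList<>()).add(value);
  }

  // Removes the value from the key's pool and drops the pool once it is
  // empty, so isEmpty() can become true again.
  public synchronized boolean remove(K key, V value) {
    List<V> pool = pools.get(key);
    if (pool == null) {
      return false;
    }
    boolean removed = pool.remove(value);
    if (pool.isEmpty()) {
      pools.remove(key);
    }
    return removed;
  }

  public synchronized boolean isEmpty() {
    return pools.isEmpty();
  }
}
```

With this behavior, a wait loop such as the one above can terminate once the last connection has been removed, because no empty pool is left behind in the map.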


After fixing that, I noticed that if the master isn't running when HBaseAdmin 
is created, it will not recover from that either.
Even creating a new HBaseAdmin from the same Configuration will still use the 
old stale HConnection.

In that case a MasterNotRunningException is thrown, which is not handled in 
HBaseAdmin's constructor.

The HConnection handling in HConnectionManager is funky. There should never be 
a closed connection in HBASE_INSTANCES.
I might look at that as well, but in a separate issue.
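
As a rough sketch of that invariant (not a proposed patch), a connection cache could refuse to hand out a closed connection and replace it on lookup. FakeConnection and ConnectionCache are hypothetical names made up for this example; they only stand in for HConnection and the HBASE_INSTANCES map and do not reflect the actual HConnectionManager code:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical connection type; stands in for HConnection.
class FakeConnection {
  private boolean closed = false;

  void close() {
    closed = true;
  }

  boolean isClosed() {
    return closed;
  }
}

// Hypothetical cache; stands in for the HBASE_INSTANCES map.
class ConnectionCache {
  private final Map<String, FakeConnection> instances = new HashMap<>();

  // Returns the cached connection for the key, replacing it first if it is
  // missing or already closed, so callers never get a stale instance.
  synchronized FakeConnection get(String configKey) {
    FakeConnection conn = instances.get(configKey);
    if (conn == null || conn.isClosed()) {
      conn = new FakeConnection();
      instances.put(configKey, conn);
    }
    return conn;
  }
}
```

Under this assumption, creating a new client object from the same configuration after a failure would pick up a fresh connection instead of the stale one.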

