Samir Ahmic created HBASE-10310:
-----------------------------------

             Summary: ZNodeCleaner.java 
KeeperException$SessionExpiredException: KeeperErrorCode = Session expired for 
/hbase/master
                 Key: HBASE-10310
                 URL: https://issues.apache.org/jira/browse/HBASE-10310
             Project: HBase
          Issue Type: Bug
          Components: master
    Affects Versions: 0.96.1.1
         Environment: x86_64 GNU/Linux
            Reporter: Samir Ahmic


I was testing the "hbase master clear" command while working on [HBASE-7386]. Here 
are the command and the resulting exception:
{code}
$ export HBASE_ZNODE_FILE=/tmp/hbase-hadoop-master.znode; ./hbase master clear

14/01/10 14:05:44 INFO zookeeper.ZooKeeper: Initiating client connection, 
connectString=zk1:2181 sessionTimeout=90000 watcher=clean znode for master, 
quorum=zk1:2181, baseZNode=/hbase
14/01/10 14:05:44 INFO zookeeper.RecoverableZooKeeper: Process identifier=clean 
znode for master connecting to ZooKeeper ensemble=zk1:2181
14/01/10 14:05:44 INFO zookeeper.ClientCnxn: Opening socket connection to 
server zk1/172.17.33.5:2181. Will not attempt to authenticate using SASL 
(Unable to locate a login configuration)
14/01/10 14:05:44 INFO zookeeper.ClientCnxn: Socket connection established to 
zk11/172.17.33.5:2181, initiating session
14/01/10 14:05:44 INFO zookeeper.ClientCnxn: Session establishment complete on 
server zk1/172.17.33.5:2181, sessionid = 0x1427a96bfea4a8a, negotiated timeout 
= 40000
14/01/10 14:05:44 INFO zookeeper.ZooKeeper: Session: 0x1427a96bfea4a8a closed
14/01/10 14:05:44 INFO zookeeper.ClientCnxn: EventThread shut down
14/01/10 14:05:44 WARN zookeeper.RecoverableZooKeeper: Possibly transient 
ZooKeeper, quorum=zk1:2181, 
exception=org.apache.zookeeper.KeeperException$SessionExpiredException: 
KeeperErrorCode = Session expired for /hbase/master
14/01/10 14:05:44 INFO util.RetryCounter: Sleeping 1000ms before retry #0...
14/01/10 14:05:45 WARN zookeeper.RecoverableZooKeeper: Possibly transient 
ZooKeeper, quorum=zk1:2181, 
exception=org.apache.zookeeper.KeeperException$SessionExpiredException: 
KeeperErrorCode = Session expired for /hbase/master
14/01/10 14:05:45 ERROR zookeeper.RecoverableZooKeeper: ZooKeeper getData 
failed after 1 attempts
14/01/10 14:05:45 WARN zookeeper.ZKUtil: clean znode for 
master-0x1427a96bfea4a8a, quorum=zk1:2181, baseZNode=/hbase Unable to get data 
of znode /hbase/master
org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = 
Session expired for /hbase/master
        at org.apache.zookeeper.KeeperException.create(KeeperException.java:127)
        at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
        at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1151)
        at 
org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.getData(RecoverableZooKeeper.java:337)
        at 
org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataNoWatch(ZKUtil.java:777)
        at 
org.apache.hadoop.hbase.zookeeper.MasterAddressTracker.deleteIfEquals(MasterAddressTracker.java:170)
        at org.apache.hadoop.hbase.ZNodeClearer.clear(ZNodeClearer.java:160)
        at 
org.apache.hadoop.hbase.master.HMasterCommandLine.run(HMasterCommandLine.java:138)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
        at 
org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:126)
        at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:2779)
14/01/10 14:05:45 ERROR zookeeper.ZooKeeperWatcher: clean znode for 
master-0x1427a96bfea4a8a, quorum=zk1:2181, baseZNode=/hbase Received unexpected 
KeeperException, re-throwing exception
org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = 
Session expired for /hbase/master
        at org.apache.zookeeper.KeeperException.create(KeeperException.java:127)
        at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
        at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1151)
        at 
org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.getData(RecoverableZooKeeper.java:337)
        at 
org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataNoWatch(ZKUtil.java:777)
        at 
org.apache.hadoop.hbase.zookeeper.MasterAddressTracker.deleteIfEquals(MasterAddressTracker.java:170)
        at org.apache.hadoop.hbase.ZNodeClearer.clear(ZNodeClearer.java:160)
        at 
org.apache.hadoop.hbase.master.HMasterCommandLine.run(HMasterCommandLine.java:138)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
        at 
org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:126)
        at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:2779)
14/01/10 14:05:45 WARN zookeeper.ZooKeeperNodeTracker: Can't get or delete the 
master znode
org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = 
Session expired for /hbase/master
        at org.apache.zookeeper.KeeperException.create(KeeperException.java:127)
        at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
        at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1151)
        at 
org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.getData(RecoverableZooKeeper.java:337)
        at 
org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataNoWatch(ZKUtil.java:777)
        at 
org.apache.hadoop.hbase.zookeeper.MasterAddressTracker.deleteIfEquals(MasterAddressTracker.java:170)
        at org.apache.hadoop.hbase.ZNodeClearer.clear(ZNodeClearer.java:160)
        at 
org.apache.hadoop.hbase.master.HMasterCommandLine.run(HMasterCommandLine.java:138)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
        at 
org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:126)
        at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:2779)
{code}

After checking ZNodeClearer.java, I noticed these lines:
{code}
    try {
      znodeFileContent = ZNodeClearer.readMyEphemeralNodeOnDisk();
    } catch (FileNotFoundException fnfe) {
      // If no file, just keep going -- return success.
      LOG.warn("Can't find the znode file; presume non-fatal", fnfe);
      return true;
    } catch (IOException e) {
      LOG.warn("Can't read the content of the znode file", e);
      return false;
    } finally {
      zkw.close();
    }

    return MasterAddressTracker.deleteIfEquals(zkw, znodeFileContent);
  }
{code}
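It looks like we are closing the ZooKeeper connection prematurely: the finally 
block calls zkw.close() before deleteIfEquals() runs, so getData is issued on an 
already closed session (the log shows the session closed just before the first 
SessionExpiredException). After moving
{code} return MasterAddressTracker.deleteIfEquals(zkw, znodeFileContent); 
{code} inside the try block, the issue was fixed. 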
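
The ordering problem can be reproduced in isolation. What follows is only a 
minimal sketch, not the actual HBase code: FakeConnection, FinallyOrderSketch, 
clearBuggy and clearFixed are hypothetical names invented for illustration, and 
the fake connection throws IllegalStateException where the real client reports 
a SessionExpiredException. It shows that a statement placed after a try/finally 
runs only once finally has closed the resource, while a return inside the try 
block is evaluated before finally runs.

```java
import java.io.Closeable;

/** Hypothetical stand-in for a ZooKeeper connection; not an HBase class. */
class FakeConnection implements Closeable {
    private boolean open = true;

    /** Fails once the connection is closed, like a call on a closed session. */
    boolean deleteIfEquals(String content) {
        if (!open) {
            throw new IllegalStateException("session closed");
        }
        return content != null;
    }

    @Override
    public void close() {
        open = false;
    }
}

class FinallyOrderSketch {
    /** Mirrors the reported ordering: finally closes the connection before it is used. */
    static boolean clearBuggy(FakeConnection conn) {
        String content;
        try {
            content = "master-znode-content"; // stands in for reading the znode file
        } finally {
            conn.close();
        }
        return conn.deleteIfEquals(content); // runs after close(), so it throws
    }

    /** Mirrors the reported fix: the delete runs inside try, before finally closes. */
    static boolean clearFixed(FakeConnection conn) {
        try {
            String content = "master-znode-content";
            return conn.deleteIfEquals(content); // evaluated while still open
        } finally {
            conn.close();
        }
    }

    public static void main(String[] args) {
        System.out.println("fixed: " + clearFixed(new FakeConnection()));
        try {
            clearBuggy(new FakeConnection());
        } catch (IllegalStateException e) {
            System.out.println("buggy: " + e.getMessage());
        }
    }
}
```

Running main prints "fixed: true" followed by "buggy: session closed", which 
matches the behaviour described above: only the variant that uses the 
connection inside the try block succeeds.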





--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
