Can you check the corresponding region server to see if it was operating correctly?
I went over some previous threads where a region server was using the wrong ZooKeeper quorum.

Cheers

On Thu, Jul 16, 2015 at 7:35 AM, dgoldenberg123 <[email protected]> wrote:

> Could someone elaborate on what this error means?
>
> We run into a periodic shutdown of HBase (0.98.9 built for Hadoop 2) while
> inserting records into it under load and the stack trace below appears to be
> reflective of the cause.
>
> Looking at HMaster.java, what does this error imply and are there ways to
> fix it or work around it?
>
>   private boolean abortNow(final String msg, final Throwable t) {
>     if (!this.isActiveMaster) {
>       return true;
>     }
>     if (t != null && t instanceof KeeperException.SessionExpiredException) {
>       try {
>         LOG.info("Primary Master trying to recover from ZooKeeper session " +
>           "expiry.");
>         return !tryRecoveringExpiredZKSession();
>       } catch (Throwable newT) {
>         LOG.error("Primary master encountered unexpected exception while " +
>           "trying to recover from ZooKeeper session" +
>           " expiry. Proceeding with server abort.", newT);
>       }
>     }
>     return true;
>   }
>
> Is https://issues.apache.org/jira/browse/HBASE-4479 related at all (marked
> fixed as of 0.92.0)?
>
> Any insight would be greatly appreciated.
>
> ERROR main-EventThread master.HMaster: Primary master encountered unexpected exception while trying to recover from ZooKeeper session expiry. Proceeding with server abort.
> java.util.concurrent.ExecutionException: java.io.IOException: error or interrupted while splitting logs in hdfs://acme-server.com:9000/tmp/hbase-root/hbase/WALs/acme-server,60088,1436822380393-splitting Task = installed = 1 done = 0 error = 1
>   at java.util.concurrent.FutureTask.report(FutureTask.java:122)
>   at java.util.concurrent.FutureTask.get(FutureTask.java:192)
>   at org.apache.hadoop.hbase.master.HMaster.tryRecoveringExpiredZKSession(HMaster.java:2498)
>   at org.apache.hadoop.hbase.master.HMaster.abortNow(HMaster.java:2526)
>   at org.apache.hadoop.hbase.master.HMaster.abort(HMaster.java:2431)
>   at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.connectionEvent(ZooKeeperWatcher.java:403)
>   at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:321)
>   at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:522)
>   at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:498)
> Caused by: java.io.IOException: error or interrupted while splitting logs in hdfs://acme-server.acme.com:9000/tmp/hbase-root/hbase/WALs/acme-server,60088,1436822380393-splitting Task = installed = 1 done = 0 error = 1
>   at org.apache.hadoop.hbase.master.SplitLogManager.splitLogDistributed(SplitLogManager.java:359)
>   at org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:416)
>   at org.apache.hadoop.hbase.master.MasterFileSystem.splitMetaLog(MasterFileSystem.java:308)
>   at org.apache.hadoop.hbase.master.MasterFileSystem.splitMetaLog(MasterFileSystem.java:299)
>   at org.apache.hadoop.hbase.master.HMaster.splitMetaLogBeforeAssignment(HMaster.java:1178)
>   at org.apache.hadoop.hbase.master.HMaster.assignMeta(HMaster.java:1113)
>   at org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:978)
>   at org.apache.hadoop.hbase.master.HMaster.access$300(HMaster.java:286)
>   at org.apache.hadoop.hbase.master.HMaster$3.call(HMaster.java:2482)
>   at org.apache.hadoop.hbase.master.HMaster$3.call(HMaster.java:2470)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> 2015-07-14 17:51:54,433 FATAL main-EventThread master.HMaster: master:57118-0x14e89499bbd0000-0x14e89499bbd0000-0x14e89499bbd0000-0x14e89499bbd0000, quorum=localhost:2181, baseZNode=/hbase master:57118-0x14e89499bbd0000-0x14e89499bbd0000-0x14e89499bbd0000-0x14e89499bbd0000 received expired from ZooKeeper, aborting
> org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired
>   at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.connectionEvent(ZooKeeperWatcher.java:403)
>   at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:321)
>   at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:522)
>   at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:498)
> 2015-07-14 17:51:54,433 INFO main-EventThread master.HMaster: Aborting
> 2015-07-14 17:51:54,433 INFO main-EventThread zookeeper.ClientCnxn: EventThread shut down
> 2015-07-14 17:51:54,434 INFO acme-server,57118,1436822379834-BalancerChore balancer.BalancerChore: acme-server,57118,1436822379834-BalancerChore exiting
> 2015-07-14 17:51:54,435 INFO acme-server,57118,1436822379834-ClusterStatusChore balancer.ClusterStatusChore: acme-server,57118,1436822379834-ClusterStatusChore exiting
>
> --
> View this message in context:
> http://apache-hbase.679495.n3.nabble.com/Error-Primary-master-encountered-unexpected-exception-while-trying-to-recover-from-ZooKeeper-session-tp4073279.html
> Sent from the HBase User mailing list archive at Nabble.com.
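
If it helps with comparing the quorum that the master and each region server report, here is a rough sketch that pulls the `quorum=` value out of a log line shaped like the FATAL line quoted above (which shows `quorum=localhost:2181`). The class name `QuorumCheck` and the method `extractQuorum` are made up for illustration and are not part of HBase. The handling of a comma-separated multi-host quorum is an assumption about how such a line would look, not something taken from the quoted log.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of HBase: extracts the "quorum=..." value
// from a log line so the values from different servers can be compared.
public class QuorumCheck {

    // Captures everything after "quorum=" up to the first ", " separator,
    // whitespace, or end of line. Assumes hosts in a multi-host quorum are
    // joined by commas without spaces.
    private static final Pattern QUORUM =
        Pattern.compile("quorum=(\\S+?)(?=,\\s|\\s|$)");

    public static Optional<String> extractQuorum(String logLine) {
        Matcher m = QUORUM.matcher(logLine);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        // Fragment of the FATAL line from the quoted master log.
        String masterLine = "master:57118-0x14e89499bbd0000, "
            + "quorum=localhost:2181, baseZNode=/hbase received expired "
            + "from ZooKeeper, aborting";
        System.out.println(extractQuorum(masterLine).orElse("(no quorum found)"));
    }
}
```

Running the same extraction over the region server log and comparing the result with the master's value would show quickly whether the two are pointed at different quorums.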
