Hi,
I have the following line in /etc/hosts on all servers. Should I keep it, comment it out, or change it to something else?

127.0.0.1 localhost

Please help.

Thanks

On 21 Nov 2012, at 7:16 PM, [email protected] wrote:

> Hi,
>
> Please help!!
>
> HBase version: 0.94
> ZooKeeper: 3.4.4
>
> One of the regional servers stopped very quickly after HBASE is started:
>
> ### Check JPS after HBASE cluster was started, could find the HRegionServer process (*** there is no any ZooKeeper instance running in this server ***)
> $ jps
> 24767 Jps
> 18418 TaskTracker
> 24678 HRegionServer
> 18156 DataNode
>
> ### Wait a while and checked JPS again, HRegionServer process gone
> $ jps
> 18418 TaskTracker
> 24784 Jps
> 18156 DataNode
>
> ### Here is the setting in hbase-site.xml ( enabled hbase.cluster.distributed, set up 3 ZooKeepers, timeout= 60000)
> <property>
>   <name>hbase.cluster.distributed</name>
>   <value>true</value>
> </property>
>
> <property>
>   <name>hbase.ZooKeeper.quorum</name>
>   <value>m146,m145,m143</value>
> </property>
>
> <property>
>   <name>zookeeper.session.timeout</name>
>   <value>60000</value>
> </property>
>
> ### hbase-env.sh also tells HBASE not to manage local instance of ZooKeeper
> export HBASE_MANAGES_ZK=false
>
> ###This server can connect to the 3 ZooKeepers,
> ./zkCli.sh -server m145,m146,m143 ==> [zk: m145,m146,m143(CONNECTED) 0]
>
> ### checked the hbase log file, found something odd, seemed that it tried to connect local ZooKeeper
> 2012-11-21 17:30:33,066 INFO org.apache.zookeeper.ZooKeeper: Initiating client connection, connectString=localhost:2181 sessionTimeout=60000 watcher=regionserver:60020
>
> 2012-11-21 17:31:33,254 WARN org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Possibly transient ZooKeeper exception: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase/master
>
> 2012-11-21 17:31:33,254 INFO org.apache.hadoop.hbase.util.RetryCounter: Sleeping 2000ms before retry #1...
> 2012-11-21 17:32:33,262 INFO org.apache.zookeeper.ClientCnxn: Client session timed out, have not heard from server in 60010ms for sessionid 0x0, closing socket connection and attempting reconnect
>
> 2012-11-21 17:32:33,362 WARN org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Possibly transient ZooKeeper exception: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase/master
>
> ......
>
> 2012-11-21 17:34:33,570 ERROR org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: ZooKeeper exists failed after 3 retries
> 2012-11-21 17:34:33,571 WARN org.apache.hadoop.hbase.zookeeper.ZKUtil: regionserver:60020 Unable to set watcher on znode /hbase/master
> 2012-11-21 17:34:33,573 ERROR org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher: regionserver:60020 Received unexpected KeeperException, re-throwing exception
> 2012-11-21 17:34:33,573 FATAL org.apache.hadoop.hbase.regionserver.HRegionServer: ABORTING region server
> ......
> 2012-11-21 17:34:33,576 FATAL org.apache.hadoop.hbase.regionserver.HRegionServer: RegionServer abort: loaded coprocessors are: []
>
> 2012-11-21 17:34:36,580 FATAL org.apache.hadoop.hbase.regionserver.HRegionServer: ABORTING region server m144,60020,1353490232962: Initialization of RS failed. Hence aborting RS.
> java.io.IOException: Received the shutdown message while waiting.
>     at org.apache.hadoop.hbase.regionserver.HRegionServer.blockAndCheckIfStopped(HRegionServer.java:623)
>     at org.apache.hadoop.hbase.regionserver.HRegionServer.initializeZooKeeper(HRegionServer.java:598)
>     at org.apache.hadoop.hbase.regionserver.HRegionServer.preRegistrationInitialization(HRegionServer.java:560)
>     at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:669)
>     at java.lang.Thread.run(Thread.java:662)
> 2012-11-21 17:34:36,581 FATAL org.apache.hadoop.hbase.regionserver.HRegionServer: RegionServer abort: loaded coprocessors are: []
>
> Please help!
> QUESTION: Is it a bug and I need to check something else?
>
> Thanks
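One thing that may be worth ruling out, given that the log shows connectString=localhost:2181 even though a quorum is configured: as quoted above, the quorum property name reads hbase.ZooKeeper.quorum, while the documented key is hbase.zookeeper.quorum. Hadoop/HBase configuration keys are case-sensitive, so if the file really contains the mixed-case spelling (and it is not just something the mail client did), the setting would be ignored and the default quorum, which I believe is localhost, would be used. Below is a minimal Python sketch for spotting such a mismatch. The function name find_miscased_keys, the KNOWN_KEYS list and the SAMPLE text are only illustrations; they are not part of HBase.

```python
import xml.etree.ElementTree as ET

# Keys from the quoted hbase-site.xml, in their documented lowercase spelling.
# This list is an assumption for illustration; extend it as needed.
KNOWN_KEYS = (
    "hbase.cluster.distributed",
    "hbase.zookeeper.quorum",
    "zookeeper.session.timeout",
)


def find_miscased_keys(xml_text, known_keys=KNOWN_KEYS):
    """Return (name_in_file, expected_name) pairs that differ only by case."""
    by_lower = {key.lower(): key for key in known_keys}
    root = ET.fromstring(xml_text)
    mismatches = []
    for prop in root.iter("property"):
        name = (prop.findtext("name") or "").strip()
        expected = by_lower.get(name.lower())
        if expected is not None and name != expected:
            mismatches.append((name, expected))
    return mismatches


# Hypothetical sample mirroring the properties quoted in this thread,
# wrapped in the usual <configuration> root element.
SAMPLE = """<configuration>
  <property><name>hbase.cluster.distributed</name><value>true</value></property>
  <property><name>hbase.ZooKeeper.quorum</name><value>m146,m145,m143</value></property>
  <property><name>zookeeper.session.timeout</name><value>60000</value></property>
</configuration>"""

if __name__ == "__main__":
    for found, expected in find_miscased_keys(SAMPLE):
        print(f"{found!r} should probably be {expected!r}")
```

To run it against the real file, read the contents of hbase-site.xml from the HBase conf directory on the failing region server and pass the text to find_miscased_keys instead of SAMPLE.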
