bq. 2014-07-15 20:27:21,471 INFO [main-SendThread(localhost:2181)]

The master tried to connect to ZooKeeper on localhost.
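The "likely server has closed socket" line right after the connection is established usually means the local ZooKeeper peer is running but not serving yet (for example, still stuck in leader election), so it drops the client. A quick sanity check from the master node, using ZooKeeper's standard four-letter-word commands against the hosts in your hbase.zookeeper.quorum (assuming nc is installed), would be something like:

    echo ruok | nc psyDebian 2181   # a running peer answers "imok"
    echo stat | nc centos1 2181     # "stat" shows whether it is serving and in which mode (leader/follower)
    echo stat | nc centos2 2181

If none of the three answers properly, the problem is on the ZooKeeper side rather than in HBase itself.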
Please take a look at http://hbase.apache.org/book.html#trouble.zookeeper

On Tue, Jul 15, 2014 at 6:13 AM, psy <[email protected]> wrote:

> Hi, everyone. I'm a student and a beginner with HBase. These days I've run
> into a problem when trying to run HBase on three machines. Hadoop runs well,
> but when I start HBase, the HMaster on the master node and the HRegionServers
> on the slave nodes quit after a few seconds. On the master node, jps looks
> like this:
>
> hadoop@psyDebian:/opt$ jps
> 5416 NameNode
> 5647 SecondaryNameNode
> 5505 DataNode
> 398 Jps
> 32745 HMaster
> 32670 HQuorumPeer
>
> and a short while later it looks like this:
>
> hadoop@psyDebian:/opt$ jps
> 5416 NameNode
> 5647 SecondaryNameNode
> 5505 DataNode
> 423 Jps
> 32670 HQuorumPeer
>
> The master log:
>
> hadoop@psyDebian:/opt$ tail -n 30 /opt/hbase/logs/hbase-hadoop-master-psyDebian.log
> 2014-07-15 20:27:21,470 INFO [main-SendThread(localhost:2181)] zookeeper.ClientCnxn: Opening socket connection to server localhost/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error)
> 2014-07-15 20:27:21,471 INFO [main-SendThread(localhost:2181)] zookeeper.ClientCnxn: Socket connection established to localhost/127.0.0.1:2181, initiating session
> 2014-07-15 20:27:21,471 INFO [main-SendThread(localhost:2181)] zookeeper.ClientCnxn: Unable to read additional data from server sessionid 0x0, likely server has closed socket, closing socket connection and attempting reconnect
> 2014-07-15 20:27:21,572 WARN [main] zookeeper.RecoverableZooKeeper: Possibly transient ZooKeeper, quorum=centos1:2181,psyDebian:2181,centos2:2181, exception=org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase
> 2014-07-15 20:27:21,572 ERROR [main] zookeeper.RecoverableZooKeeper: ZooKeeper create failed after 4 attempts
> 2014-07-15 20:27:21,572 ERROR [main] master.HMasterCommandLine: Master exiting
> java.lang.RuntimeException: Failed construction of Master: class org.apache.hadoop.hbase.master.HMaster
>     at org.apache.hadoop.hbase.master.HMaster.constructMaster(HMaster.java:2789)
>     at org.apache.hadoop.hbase.master.HMasterCommandLine.startMaster(HMasterCommandLine.java:186)
>     at org.apache.hadoop.hbase.master.HMasterCommandLine.run(HMasterCommandLine.java:135)
>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
>     at org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:126)
>     at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:2803)
> Caused by: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase
>     at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
>     at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
>     at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:783)
>     at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.createNonSequential(RecoverableZooKeeper.java:489)
>     at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.create(RecoverableZooKeeper.java:468)
>     at org.apache.hadoop.hbase.zookeeper.ZKUtil.createWithParents(ZKUtil.java:1241)
>     at org.apache.hadoop.hbase.zookeeper.ZKUtil.createWithParents(ZKUtil.java:1219)
>     at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.createBaseZNodes(ZooKeeperWatcher.java:174)
>     at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.<init>(ZooKeeperWatcher.java:167)
>     at org.apache.hadoop.hbase.master.HMaster.<init>(HMaster.java:481)
>     at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>     at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
>     at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>     at java.lang.reflect.Constructor.newInstance(Constructor.java:534)
>     at org.apache.hadoop.hbase.master.HMaster.constructMaster(HMaster.java:2784)
>
> The "out" log:
>
> hadoop@psyDebian:/opt$ tail /opt/hbase/logs/hbase-hadoop-master-psyDebian.out
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in [jar:file:/opt/hbase/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in [jar:file:/opt/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
>
> The ZooKeeper log:
>
> 2014-07-15 20:48:20,572 INFO [QuorumPeer[myid=0]/0:0:0:0:0:0:0:0:2181] quorum.FollowerZooKeeperServer: Shutting down
> 2014-07-15 20:48:20,573 INFO [QuorumPeer[myid=0]/0:0:0:0:0:0:0:0:2181] server.ZooKeeperServer: shutting down
> 2014-07-15 20:48:20,573 INFO [QuorumPeer[myid=0]/0:0:0:0:0:0:0:0:2181] quorum.QuorumPeer: LOOKING
> 2014-07-15 20:48:20,574 INFO [QuorumPeer[myid=0]/0:0:0:0:0:0:0:0:2181] quorum.QuorumPeer: acceptedEpoch not found! Creating with a reasonable default of 0. This should only happen when you are upgrading your installation
> 2014-07-15 20:48:20,625 INFO [QuorumPeer[myid=0]/0:0:0:0:0:0:0:0:2181] quorum.FastLeaderElection: New election. My id = 0, proposed zxid=0x0
> 2014-07-15 20:48:20,626 INFO [WorkerReceiver[myid=0]] quorum.FastLeaderElection: Notification: 0 (n.leader), 0x0 (n.zxid), 0x57 (n.round), LOOKING (n.state), 0 (n.sid), 0x0 (n.peerEPoch), LOOKING (my state)
> 2014-07-15 20:48:20,627 INFO [WorkerReceiver[myid=0]] quorum.FastLeaderElection: Notification: 2 (n.leader), 0x0 (n.zxid), 0x55 (n.round), LEADING (n.state), 2 (n.sid), 0x0 (n.peerEPoch), LOOKING (my state)
> 2014-07-15 20:48:20,627 INFO [WorkerReceiver[myid=0]] quorum.FastLeaderElection: Notification: 1 (n.leader), 0x0 (n.zxid), 0x56 (n.round), LEADING (n.state), 1 (n.sid), 0x0 (n.peerEPoch), LOOKING (my state)
> 2014-07-15 20:48:20,827 INFO [QuorumPeer[myid=0]/0:0:0:0:0:0:0:0:2181] quorum.FastLeaderElection: Notification time out: 400
> 2014-07-15 20:48:20,827 INFO [WorkerReceiver[myid=0]] quorum.FastLeaderElection: Notification: 0 (n.leader), 0x0 (n.zxid), 0x57 (n.round), LOOKING (n.state), 0 (n.sid), 0x0 (n.peerEPoch), LOOKING (my state)
> 2014-07-15 20:48:20,828 INFO [WorkerReceiver[myid=0]] quorum.FastLeaderElection: Notification: 2 (n.leader), 0x0 (n.zxid), 0x55 (n.round), LEADING (n.state), 2 (n.sid), 0x0 (n.peerEPoch), LOOKING (my state)
> 2014-07-15 20:48:20,828 INFO [WorkerReceiver[myid=0]] quorum.FastLeaderElection: Notification: 1 (n.leader), 0x0 (n.zxid), 0x56 (n.round), LEADING (n.state), 1 (n.sid), 0x0 (n.peerEPoch), LOOKING (my state)
> 2014-07-15 20:48:21,229 INFO [QuorumPeer[myid=0]/0:0:0:0:0:0:0:0:2181] quorum.FastLeaderElection: Notification time out: 800
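A side note on the quorum log above: peer 0 (the one on psyDebian) stays in LOOKING while it keeps receiving notifications from peers 1 and 2 that each claim to be LEADING, so the local peer never joins the ensemble and never starts serving clients. That is often a sign that the peers cannot reach each other on the peer/election ports (2888 and 3888 by default for the HBase-managed ZooKeeper, i.e. hbase.zookeeper.peerport and hbase.zookeeper.leaderport). A rough reachability check from psyDebian, assuming nc is installed, would be:

    nc -z centos1 2888 && nc -z centos1 3888 && echo "centos1 reachable"
    nc -z centos2 2888 && nc -z centos2 3888 && echo "centos2 reachable"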
>
> These are my configuration files:
>
> core-site.xml:
>
> <configuration>
>   <property>
>     <name>fs.default.name</name>
>     <value>hdfs://psyDebian:9000</value>
>   </property>
>
>   <property>
>     <name>hadoop.tmp.dir</name>
>     <value>/home/hadoop/hadoop_tmp</value>
>   </property>
> </configuration>
>
> hdfs-site.xml:
>
> <configuration>
>   <property>
>     <name>dfs.datanode.data.dir</name>
>     <value>/home/hadoop/hadoop_tmp/dfs/data</value>
>   </property>
>
>   <property>
>     <name>dfs.namenode.name.dir</name>
>     <value>/home/hadoop/hadoop_tmp/dfs/name</value>
>   </property>
>
>   <property>
>     <name>dfs.replication</name>
>     <value>3</value>
>   </property>
> </configuration>
>
> hbase-site.xml:
>
> <configuration>
>   <property>
>     <name>hbase.rootdir</name>
>     <value>hdfs://psyDebian:9000/hbase</value>
>   </property>
>
>   <property>
>     <name>hbase.cluster.distributed</name>
>     <value>true</value>
>   </property>
>
>   <property>
>     <name>hbase.master</name>
>     <value>psyDebian:60000</value>
>   </property>
>
>   <property>
>     <name>hbase.zookeeper.quorum</name>
>     <value>psyDebian,centos1,centos2</value>
>   </property>
>
>   <property>
>     <name>hbase.zookeeper.property.dataDir</name>
>     <value>/home/hadoop/zookeeper_tmp</value>
>   </property>
>
>   <property>
>     <name>zookeeper.session.timeout</name>
>     <value>90000</value>
>   </property>
>
>   <property>
>     <name>hbase.reginserver.restart.on.zk.expire</name>
>     <value>true</value>
>   </property>
> </configuration>
>
> The master node runs Debian 7.5 and the two slaves both run CentOS 6.5. Hadoop
> is 2.2.0 and HBase is 0.98.3. The clocks of the three machines are synchronized
> and the firewalls (iptables) are disabled. The Java version is
> java-1.6.0-openjdk. I'm not very familiar with HBase, so I can't make sense of
> the ERRORs in the logs, and I haven't found any useful information on the
> Internet these days. Could you help me, or tell me what I should do to find out
> the cause of this problem?
>
> Thank you so much.
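
One more thing worth ruling out, since the master ends up talking to 127.0.0.1: make sure none of the three hostnames resolves to a loopback address. Debian in particular often pins the machine's own hostname to 127.0.1.1 in /etc/hosts, which is a classic way to break an HBase/ZooKeeper quorum. Something like the following on every node should show each hostname mapped to its real LAN IP, not loopback:

    getent hosts psyDebian centos1 centos2
    grep -nE '127\.0\.[01]\.1' /etc/hosts

If psyDebian shows up against 127.0.0.1 or 127.0.1.1, move it onto the machine's real IP (keeping only "127.0.0.1 localhost"), restart the HQuorumPeers, and then start HBase again.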
