Hi,

First of all, I'm fairly new to HBase. As a starting point, I have set up a small deployment of Hadoop and HBase (0.20.4) on two servers in fully distributed mode. HBase works fine on one server (client operations work perfectly), but starting the second RegionServer throws exceptions that I couldn't resolve. The servers run in a virtualized environment, and I want to extend the deployment as soon as possible to do some benchmarking for my particular purposes.

I anonymized my server names a little, so you will encounter names like xyz or xxx. My first suspect was /etc/hosts, as I don't use a DNS server and Ubuntu adds a 127.0.1.1 entry by default (which I removed). I replicated the name resolution across the two servers:

***** /etc/hosts on the master and region1: *****
127.0.0.1 localhost
9.2.18.168 master.x.y.z master
9.2.18.163 region1.x.y.z region1

***** I get the following exceptions on my region1 server: *****
2010-07-06 04:44:17,896 INFO org.apache.zookeeper.ZooKeeper: Client environment:os.version=2.6.32-22-generic
2010-07-06 04:44:17,896 INFO org.apache.zookeeper.ZooKeeper: Client environment:user.name=xxx
2010-07-06 04:44:17,896 INFO org.apache.zookeeper.ZooKeeper: Client environment:user.home=/home/xxx
2010-07-06 04:44:17,896 INFO org.apache.zookeeper.ZooKeeper: Client environment:user.dir=/home/xxx/hbase-0.20.4
2010-07-06 04:44:17,897 INFO org.apache.zookeeper.ZooKeeper: Initiating client connection, connectString=localhost:2181 sessionTimeout=60000 watcher=org.apache.hadoop.hbase.regionserver.hregionser...@555c07d8
2010-07-06 04:44:17,899 INFO org.apache.zookeeper.ClientCnxn: zookeeper.disableAutoWatchReset is false
2010-07-06 04:44:20,650 INFO org.apache.zookeeper.ClientCnxn: Attempting connection to server localhost/127.0.0.1:2181
2010-07-06 04:44:20,656 WARN org.apache.zookeeper.ClientCnxn: Exception closing session 0x0 to sun.nio.ch.selectionkeyi...@24148662
java.net.ConnectException: Connection refused
        at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
        at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:592)
        at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:933)
2010-07-06 04:44:20,658 WARN org.apache.zookeeper.ClientCnxn: Ignoring exception during shutdown input
java.nio.channels.ClosedChannelException
        at sun.nio.ch.SocketChannelImpl.shutdownInput(SocketChannelImpl.java:656)
        at sun.nio.ch.SocketAdaptor.shutdownInput(SocketAdaptor.java:378)
        at org.apache.zookeeper.ClientCnxn$SendThread.cleanup(ClientCnxn.java:999)
        at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:970)
2010-07-06 04:44:20,658 WARN org.apache.zookeeper.ClientCnxn: Ignoring exception during shutdown output
java.nio.channels.ClosedChannelException
        at sun.nio.ch.SocketChannelImpl.shutdownOutput(SocketChannelImpl.java:667)
        at sun.nio.ch.SocketAdaptor.shutdownOutput(SocketAdaptor.java:386)
        at org.apache.zookeeper.ClientCnxn$SendThread.cleanup(ClientCnxn.java:1004)
        at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:970)
2010-07-06 04:44:20,777 WARN org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper: Failed to set watcher on ZNode /hbase/master
org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase/master
        at org.apache.zookeeper.KeeperException.create(KeeperException.java:90)
        at org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
        at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:780)
        at org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper.watchMasterAddress(ZooKeeperWrapper.java:366)
        at org.apache.hadoop.hbase.regionserver.HRegionServer.watchMasterAddress(HRegionServer.java:389)
        at org.apache.hadoop.hbase.regionserver.HRegionServer.reinitializeZooKeeper(HRegionServer.java:319)
        at org.apache.hadoop.hbase.regionserver.HRegionServer.reinitialize(HRegionServer.java:310)
        at org.apache.hadoop.hbase.regionserver.HRegionServer.<init>(HRegionServer.java:280)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:532)
        at org.apache.hadoop.hbase.regionserver.HRegionServer.doMain(HRegionServer.java:2443)
        at org.apache.hadoop.hbase.regionserver.HRegionServer.main(HRegionServer.java:2511)

***** My hbase-site.xml looks as follows: *****
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://master:54310</value>
  </property>
  <property>
    <name>mapred.job.tracker</name>
    <value>master:54311</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.datanode.max.xcievers</name>
    <value>2047</value>
  </property>
  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://master:54310/hbase</value>
  </property>
  <property>
    <name>hbase.cluster.distributed</name>
    <value>true</value>
  </property>
</configuration>

***** regionservers *****
master
region1

***** The ZooKeeper dump looks as follows: *****
hbase(main):003:0> zk_dump
HBase tree in ZooKeeper is rooted at /hbase
  Cluster up? true
  In safe mode? false
  Master address: 9.2.18.168:60000
  Region server holding ROOT: 9.2.18.168:60020
  Region servers:
    - 9.2.18.168:60020
  Quorum Server Statistics:
    - localhost:2181
        Zookeeper version: 3.2.2-888565, built on 12/08/2009 21:51 GMT
        Clients:
         /127.0.0.1:38162[1](queued=0,recved=10,sent=0)
         /127.0.0.1:55368[1](queued=0,recved=29,sent=0)
         /127.0.0.1:54149[1](queued=0,recved=255,sent=0)
         /127.0.0.1:38164[1](queued=0,recved=0,sent=0)
        Latency min/avg/max: 0/5/1192
        Received: 294
        Sent: 0
        Outstanding: 0
        Zxid: 0xb
        Mode: standalone
        Node count: 11

Thanks in advance!!
/Samuru
