I'm not that familiar with ZK, but I have it set to:
# Tell HBase whether it should manage it's own instance of Zookeeper or not.
export HBASE_MANAGES_ZK=true
and
<property>
<name>hbase.cluster.distributed</name>
<value>true</value>
<description>The mode the cluster will be in. Possible values are
false: standalone and pseudo-distributed setups with managed Zookeeper
true: fully-distributed with unmanaged Zookeeper Quorum (see
hbase-env.sh)
</description>
</property>
<property>
I'll look at the diagnostic tools. Thanks,
Mike
On Thu, Jan 20, 2011 at 5:09 PM, Ted Dunning <[email protected]> wrote:
> Are you running a standalone ZK cluster?
>
> If so, are you doing the standard ZK diagnostic things?
>
> ZK supports several simple checks which you can access by sending special
> four byte sequences to the standard client port.
>
> See
>
> http://hadoop.apache.org/zookeeper/docs/r3.2.0/zookeeperAdmin.html#sc_zkCommands
>
> <
> http://hadoop.apache.org/zookeeper/docs/r3.2.0/zookeeperAdmin.html#sc_zkCommands
> >(but
> watch out with the echo idiom ... some versions of ZK or nc may close the
> connection too quickly to get teh reply)
>
>
>
> On Thu, Jan 20, 2011 at 1:52 PM, mike anderson <[email protected]
> >wrote:
>
> > Alright I'm stumped here. It looks like Zookeeper isn't playing nice and
> > causing the master to not start. Any suggestions?
> >
> >
> > I'm seeing a lot of these: 2011-01-20 16:47:55,744 INFO
> > org.apache.zookeeper.ClientCnxn: Unable to read additional data from
> server
> > sessionid 0x0, likely server has closed socket, closing socket connection
> > and attempting reconnect
> >
> > Which looks bad, full log below.
> >
> > -mike
> >
> >
> >
> >
> > Thu Jan 20 16:47:47 EST 2011 Starting master on carr.host.com
> > ulimit -n 20480
> > 2011-01-20 16:47:47,571 INFO org.apache.hadoop.hbase.ipc.HBaseRpcMetrics:
> > Initializing RPC Metrics with hostName=HMaster, port=60000
> > 2011-01-20 16:47:47,628 INFO org.apache.hadoop.ipc.HBaseServer: IPC
> Server
> > Responder: starting
> > 2011-01-20 16:47:47,630 INFO org.apache.hadoop.ipc.HBaseServer: IPC
> Server
> > listener on 60000: starting
> > 2011-01-20 16:47:47,630 INFO org.apache.hadoop.ipc.HBaseServer: IPC
> Server
> > handler 0 on 60000: starting
> > 2011-01-20 16:47:47,631 INFO org.apache.hadoop.ipc.HBaseServer: IPC
> Server
> > handler 1 on 60000: starting
> > 2011-01-20 16:47:47,631 INFO org.apache.hadoop.ipc.HBaseServer: IPC
> Server
> > handler 2 on 60000: starting
> > 2011-01-20 16:47:47,631 INFO org.apache.hadoop.ipc.HBaseServer: IPC
> Server
> > handler 3 on 60000: starting
> > 2011-01-20 16:47:47,631 INFO org.apache.hadoop.ipc.HBaseServer: IPC
> Server
> > handler 4 on 60000: starting
> > 2011-01-20 16:47:47,631 INFO org.apache.hadoop.ipc.HBaseServer: IPC
> Server
> > handler 5 on 60000: starting
> > 2011-01-20 16:47:47,631 INFO org.apache.hadoop.ipc.HBaseServer: IPC
> Server
> > handler 6 on 60000: starting
> > 2011-01-20 16:47:47,631 INFO org.apache.hadoop.ipc.HBaseServer: IPC
> Server
> > handler 7 on 60000: starting
> > 2011-01-20 16:47:47,632 INFO org.apache.hadoop.ipc.HBaseServer: IPC
> Server
> > handler 8 on 60000: starting
> > 2011-01-20 16:47:47,633 INFO org.apache.hadoop.ipc.HBaseServer: IPC
> Server
> > handler 9 on 60000: starting
> > 2011-01-20 16:47:47,651 INFO org.apache.zookeeper.ZooKeeper: Client
> > environment:zookeeper.version=3.3.2-1031432, built on 11/05/2010 05:32
> GMT
> > 2011-01-20 16:47:47,651 INFO org.apache.zookeeper.ZooKeeper: Client
> > environment:host.name=carr.host.com
> > 2011-01-20 16:47:47,651 INFO org.apache.zookeeper.ZooKeeper: Client
> > environment:java.version=1.6.0_07
> > 2011-01-20 16:47:47,651 INFO org.apache.zookeeper.ZooKeeper: Client
> > environment:java.vendor=Sun Microsystems Inc.
> > 2011-01-20 16:47:47,651 INFO org.apache.zookeeper.ZooKeeper: Client
> > environment:java.home=/usr/java/jre1.6.0_07
> > 2011-01-20 16:47:47,651 INFO org.apache.zookeeper.ZooKeeper: Client
> >
> >
> environment:java.class.path=/usr/local/hbase/bin/../conf:/usr/java/jre1.6.0_07/lib/tools.jar:/usr/local/hbase/bin/..:/usr/local/hbase/bin/../hbase-0.90.0.jar:/usr/local/hbase/bin/../hbase-0.90.0-tests.jar:/usr/local/hbase/bin/../lib/activation-1.1.jar:/usr/local/hbase/bin/../lib/ant-1.6.5.jar:/usr/local/hbase/bin/../lib/ant-1.7.1.jar:/usr/local/hbase/bin/../lib/ant-launcher-1.7.1.jar:/usr/local/hbase/bin/../lib/asm-3.1.jar:/usr/local/hbase/bin/../lib/avro-1.3.3.jar:/usr/local/hbase/bin/../lib/commons-cli-1.2.jar:/usr/local/hbase/bin/../lib/commons-codec-1.4.jar:/usr/local/hbase/bin/../lib/commons-el-1.0.jar:/usr/local/hbase/bin/../lib/commons-httpclient-3.1.jar:/usr/local/hbase/bin/../lib/commons-lang-2.5.jar:/usr/local/hbase/bin/../lib/commons-logging-1.1.1.jar:/usr/local/hbase/bin/../lib/commons-net-1.4.1.jar:/usr/local/hbase/bin/../lib/core-3.1.1.jar:/usr/local/hbase/bin/../lib/guava-r06.jar:/usr/local/hbase/bin/../lib/hadoop-0.20.2-core.jar:/usr/local/hbase/bin/../lib/hbase-0.90.0.jar:/usr/local/hbase/bin/../lib/hsqldb-1.8.0.10.jar:/usr/local/hbase/bin/../lib/jackson-core-asl-1.5.5.jar:/usr/local/hbase/bin/../lib/jackson-jaxrs-1.5.5.jar:/usr/local/hbase/bin/../lib/jackson-mapper-asl-1.4.2.jar:/usr/local/hbase/bin/../lib/jackson-xc-1.5.5.jar:/usr/local/hbase/bin/../lib/jasper-compiler-5.5.23.jar:/usr/local/hbase/bin/../lib/jasper-runtime-5.5.23.jar:/usr/local/hbase/bin/../lib/jaxb-api-2.1.jar:/usr/local/hbase/bin/../lib/jaxb-impl-2.1.12.jar:/usr/local/hbase/bin/../lib/jersey-core-1.4.jar:/usr/local/hbase/bin/../lib/jersey-json-1.4.jar:/usr/local/hbase/bin/../lib/jersey-server-1.4.jar:/usr/local/hbase/bin/../lib/jets3t-0.7.1.jar:/usr/local/hbase/bin/../lib/jettison-1.1.jar:/usr/local/hbase/bin/../lib/jetty-6.1.26.jar:/usr/local/hbase/bin/../lib/jetty-util-6.1.26.jar:/usr/local/hbase/bin/../lib/jline-0.9.94.jar:/usr/local/hbase/bin/../lib/jruby-complete-1.0.3.jar:/usr/local/hbase/bin/../lib/jsp-2.1-6.1.14.jar:/usr/local/hbase/bin/../lib/jsp-api-2.1-6.1.14.jar:/usr/local/hbase/bin/../lib/jsr311-api-1.1.1.jar:/usr/local/hbase/bin/../lib/junit-3.8.1.jar:/usr/local/hbase/bin/../lib/kfs-0.3.jar:/usr/local/hbase/bin/../lib/log4j-1.2.16.jar:/usr/local/hbase/bin/../lib/oro-2.0.8.jar:/usr/local/hbase/bin/../lib/paranamer-2.2.jar:/usr/local/hbase/bin/../lib/paranamer-ant-2.2.jar:/usr/local/hbase/bin/../lib/paranamer-generator-2.2.jar:/usr/local/hbase/bin/../lib/protobuf-java-2.3.0.jar:/usr/local/hbase/bin/../lib/qdox-1.10.1.jar:/usr/local/hbase/bin/../lib/servlet-api-2.5-6.1.14.jar:/usr/local/hbase/bin/../lib/slf4j-api-1.5.8.jar:/usr/local/hbase/bin/../lib/slf4j-log4j12-1.5.8.jar:/usr/local/hbase/bin/../lib/stax-api-1.0.1.jar:/usr/local/hbase/bin/../lib/thrift-0.2.0.jar:/usr/local/hbase/bin/../lib/xmlenc-0.52.jar:/usr/local/hbase/bin/../lib/zookeeper-3.3.2.jar
> > 2011-01-20 16:47:47,651 INFO org.apache.zookeeper.ZooKeeper: Client
> >
> >
> environment:java.library.path=/usr/java/jre1.6.0_07/lib/amd64/server:/usr/java/jre1.6.0_07/lib/amd64:/usr/java/jre1.6.0_07/../lib/amd64:/usr/java/packages/lib/amd64:/lib:/usr/lib
> > 2011-01-20 16:47:47,651 INFO org.apache.zookeeper.ZooKeeper: Client
> > environment:java.io.tmpdir=/tmp
> > 2011-01-20 16:47:47,651 INFO org.apache.zookeeper.ZooKeeper: Client
> > environment:java.compiler=<NA>
> > 2011-01-20 16:47:47,651 INFO org.apache.zookeeper.ZooKeeper: Client
> > environment:os.name=Linux
> > 2011-01-20 16:47:47,651 INFO org.apache.zookeeper.ZooKeeper: Client
> > environment:os.arch=amd64
> > 2011-01-20 16:47:47,651 INFO org.apache.zookeeper.ZooKeeper: Client
> > environment:os.version=2.6.18-53.1.21.el5
> > 2011-01-20 16:47:47,651 INFO org.apache.zookeeper.ZooKeeper: Client
> > environment:user.name=pubget
> > 2011-01-20 16:47:47,651 INFO org.apache.zookeeper.ZooKeeper: Client
> > environment:user.home=/home/pubget
> > 2011-01-20 16:47:47,652 INFO org.apache.zookeeper.ZooKeeper: Client
> > environment:user.dir=/home/pubget/hbase-0.90.0
> > 2011-01-20 16:47:47,652 INFO org.apache.zookeeper.ZooKeeper: Initiating
> > client connection, connectString=ghetto163.host.com:2181,
> > ghetto162.host.com:2181,ghetto161.host.com:2181,ghetto160.host.com:2181,
> > ghetto169.host.com:2181,ghetto167.host.com:2181,ghetto165.host.com:2181,
> > ghetto164.host.com:2181 sessionTimeout=180000 watcher=master:60000
> > 2011-01-20 16:47:47,673 INFO org.apache.zookeeper.ClientCnxn: Opening
> > socket
> > connection to server ghetto167.host.com/10.0.16.167:2181
> > 2011-01-20 16:47:47,678 INFO org.apache.zookeeper.ClientCnxn: Socket
> > connection established to ghetto167.host.com/10.0.16.167:2181,
> initiating
> > session
> > 2011-01-20 16:47:47,683 INFO org.apache.zookeeper.ClientCnxn: Unable to
> > read
> > additional data from server sessionid 0x0, likely server has closed
> socket,
> > closing socket connection and attempting reconnect
> > 2011-01-20 16:47:47,957 INFO org.apache.zookeeper.ClientCnxn: Opening
> > socket
> > connection to server ghetto163.host.com/10.0.16.163:2181
> > 2011-01-20 16:47:47,957 INFO org.apache.zookeeper.ClientCnxn: Socket
> > connection established to ghetto163.host.com/10.0.16.163:2181,
> initiating
> > session
> > 2011-01-20 16:47:47,958 INFO org.apache.zookeeper.ClientCnxn: Unable to
> > read
> > additional data from server sessionid 0x0, likely server has closed
> socket,
> > closing socket connection and attempting reconnect
> > 2011-01-20 16:47:48,906 INFO org.apache.zookeeper.ClientCnxn: Opening
> > socket
> > connection to server ghetto169.host.com/10.0.16.169:2181
> > 2011-01-20 16:47:48,907 INFO org.apache.zookeeper.ClientCnxn: Socket
> > connection established to ghetto169.host.com/10.0.16.169:2181,
> initiating
> > session
> > 2011-01-20 16:47:48,907 INFO org.apache.zookeeper.ClientCnxn: Unable to
> > read
> > additional data from server sessionid 0x0, likely server has closed
> socket,
> > closing socket connection and attempting reconnect
> > 2011-01-20 16:47:49,784 INFO org.apache.zookeeper.ClientCnxn: Opening
> > socket
> > connection to server ghetto160.host.com/10.0.16.160:2181
> > 2011-01-20 16:47:49,785 INFO org.apache.zookeeper.ClientCnxn: Socket
> > connection established to ghetto160.host.com/10.0.16.160:2181,
> initiating
> > session
> > 2011-01-20 16:47:49,785 INFO org.apache.zookeeper.ClientCnxn: Unable to
> > read
> > additional data from server sessionid 0x0, likely server has closed
> socket,
> > closing socket connection and attempting reconnect
> > 2011-01-20 16:47:50,605 INFO org.apache.zookeeper.ClientCnxn: Opening
> > socket
> > connection to server ghetto162.host.com/10.0.16.162:2181
> > 2011-01-20 16:47:50,606 INFO org.apache.zookeeper.ClientCnxn: Socket
> > connection established to ghetto162.host.com/10.0.16.162:2181,
> initiating
> > session
> > 2011-01-20 16:47:50,607 INFO org.apache.zookeeper.ClientCnxn: Unable to
> > read
> > additional data from server sessionid 0x0, likely server has closed
> socket,
> > closing socket connection and attempting reconnect
> > 2011-01-20 16:47:50,885 INFO org.apache.zookeeper.ClientCnxn: Opening
> > socket
> > connection to server ghetto161.host.com/10.0.16.161:2181
> > 2011-01-20 16:47:50,886 INFO org.apache.zookeeper.ClientCnxn: Socket
> > connection established to ghetto161.host.com/10.0.16.161:2181,
> initiating
> > session
> > 2011-01-20 16:47:50,887 INFO org.apache.zookeeper.ClientCnxn: Unable to
> > read
> > additional data from server sessionid 0x0, likely server has closed
> socket,
> > closing socket connection and attempting reconnect
> > 2011-01-20 16:47:51,150 INFO org.apache.zookeeper.ClientCnxn: Opening
> > socket
> > connection to server ghetto164.host.com/10.0.16.164:2181
> > 2011-01-20 16:47:51,150 INFO org.apache.zookeeper.ClientCnxn: Socket
> > connection established to ghetto164.host.com/10.0.16.164:2181,
> initiating
> > session
> > 2011-01-20 16:47:51,151 INFO org.apache.zookeeper.ClientCnxn: Unable to
> > read
> > additional data from server sessionid 0x0, likely server has closed
> socket,
> > closing socket connection and attempting reconnect
> > 2011-01-20 16:47:52,042 INFO org.apache.zookeeper.ClientCnxn: Opening
> > socket
> > connection to server ghetto165.host.com/10.0.16.165:2181
> > 2011-01-20 16:47:52,042 INFO org.apache.zookeeper.ClientCnxn: Socket
> > connection established to ghetto165.host.com/10.0.16.165:2181,
> initiating
> > session
> > 2011-01-20 16:47:52,043 INFO org.apache.zookeeper.ClientCnxn: Unable to
> > read
> > additional data from server sessionid 0x0, likely server has closed
> socket,
> > closing socket connection and attempting reconnect
> > 2011-01-20 16:47:52,832 INFO org.apache.zookeeper.ClientCnxn: Opening
> > socket
> > connection to server ghetto167.host.com/10.0.16.167:2181
> > 2011-01-20 16:47:52,833 INFO org.apache.zookeeper.ClientCnxn: Socket
> > connection established to ghetto167.host.com/10.0.16.167:2181,
> initiating
> > session
> > 2011-01-20 16:47:52,833 INFO org.apache.zookeeper.ClientCnxn: Unable to
> > read
> > additional data from server sessionid 0x0, likely server has closed
> socket,
> > closing socket connection and attempting reconnect
> > 2011-01-20 16:47:53,749 INFO org.apache.zookeeper.ClientCnxn: Opening
> > socket
> > connection to server ghetto163.host.com/10.0.16.163:2181
> > 2011-01-20 16:47:53,750 INFO org.apache.zookeeper.ClientCnxn: Socket
> > connection established to ghetto163.host.com/10.0.16.163:2181,
> initiating
> > session
> > 2011-01-20 16:47:53,750 INFO org.apache.zookeeper.ClientCnxn: Unable to
> > read
> > additional data from server sessionid 0x0, likely server has closed
> socket,
> > closing socket connection and attempting reconnect
> > 2011-01-20 16:47:54,632 INFO org.apache.zookeeper.ClientCnxn: Opening
> > socket
> > connection to server ghetto169.host.com/10.0.16.169:2181
> > 2011-01-20 16:47:54,632 INFO org.apache.zookeeper.ClientCnxn: Socket
> > connection established to ghetto169.host.com/10.0.16.169:2181,
> initiating
> > session
> > 2011-01-20 16:47:54,633 INFO org.apache.zookeeper.ClientCnxn: Unable to
> > read
> > additional data from server sessionid 0x0, likely server has closed
> socket,
> > closing socket connection and attempting reconnect
> > 2011-01-20 16:47:55,254 INFO org.apache.zookeeper.ClientCnxn: Opening
> > socket
> > connection to server ghetto160.host.com/10.0.16.160:2181
> > 2011-01-20 16:47:55,254 INFO org.apache.zookeeper.ClientCnxn: Socket
> > connection established to ghetto160.host.com/10.0.16.160:2181,
> initiating
> > session
> > 2011-01-20 16:47:55,255 INFO org.apache.zookeeper.ClientCnxn: Unable to
> > read
> > additional data from server sessionid 0x0, likely server has closed
> socket,
> > closing socket connection and attempting reconnect
> > 2011-01-20 16:47:55,743 INFO org.apache.zookeeper.ClientCnxn: Opening
> > socket
> > connection to server ghetto162.host.com/10.0.16.162:2181
> > 2011-01-20 16:47:55,743 INFO org.apache.zookeeper.ClientCnxn: Socket
> > connection established to ghetto162.host.com/10.0.16.162:2181,
> initiating
> > session
> > 2011-01-20 16:47:55,744 INFO org.apache.zookeeper.ClientCnxn: Unable to
> > read
> > additional data from server sessionid 0x0, likely server has closed
> socket,
> > closing socket connection and attempting reconnect
> > 2011-01-20 16:47:56,598 INFO org.apache.zookeeper.ClientCnxn: Opening
> > socket
> > connection to server ghetto161.host.com/10.0.16.161:2181
> > 2011-01-20 16:47:56,598 INFO org.apache.zookeeper.ClientCnxn: Socket
> > connection established to ghetto161.host.com/10.0.16.161:2181,
> initiating
> > session
> > 2011-01-20 16:47:56,599 INFO org.apache.zookeeper.ClientCnxn: Unable to
> > read
> > additional data from server sessionid 0x0, likely server has closed
> socket,
> > closing socket connection and attempting reconnect
> > 2011-01-20 16:47:57,696 INFO org.apache.zookeeper.ClientCnxn: Opening
> > socket
> > connection to server ghetto164.host.com/10.0.16.164:2181
> > 2011-01-20 16:47:57,697 INFO org.apache.zookeeper.ClientCnxn: Socket
> > connection established to ghetto164.host.com/10.0.16.164:2181,
> initiating
> > session
> > 2011-01-20 16:47:57,697 INFO org.apache.zookeeper.ClientCnxn: Unable to
> > read
> > additional data from server sessionid 0x0, likely server has closed
> socket,
> > closing socket connection and attempting reconnect
> > 2011-01-20 16:47:57,802 ERROR
> > org.apache.hadoop.hbase.master.HMasterCommandLine: Failed to start master
> > java.lang.RuntimeException: Failed construction of Master: class
> > org.apache.hadoop.hbase.master.HMaster
> > at
> > org.apache.hadoop.hbase.master.HMaster.constructMaster(HMaster.java:1057)
> > at
> >
> >
> org.apache.hadoop.hbase.master.HMasterCommandLine.startMaster(HMasterCommandLine.java:142)
> > at
> >
> >
> org.apache.hadoop.hbase.master.HMasterCommandLine.run(HMasterCommandLine.java:102)
> > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
> > at
> >
> >
> org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:76)
> > at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:1071)
> > Caused by: org.apache.zookeeper.KeeperException$ConnectionLossException:
> > KeeperErrorCode = ConnectionLoss for /hbase
> > at org.apache.zookeeper.KeeperException.create(KeeperException.java:90)
> > at org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
> > at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:637)
> > at
> >
> >
> org.apache.hadoop.hbase.zookeeper.ZKUtil.createAndFailSilent(ZKUtil.java:902)
> > at
> >
> >
> org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.<init>(ZooKeeperWatcher.java:133)
> > at org.apache.hadoop.hbase.master.HMaster.<init>(HMaster.java:217)
> > at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
> > at sun.reflect.NativeConstructorAccessorImpl.newInstance(Unknown Source)
> > at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(Unknown
> > Source)
> > at java.lang.reflect.Constructor.newInstance(Unknown Source)
> > at
> > org.apache.hadoop.hbase.master.HMaster.constructMaster(HMaster.java:1052)
> > ... 5 more
> >
>