Hi,

> When I invoke zk_dump
>
> it shows:
>
> HBase tree in ZooKeeper is rooted at /hbase
> Cluster up? true
> In safe mode? true
> Master address: 10.148.224.13:60000
> Region server holding ROOT: null
> Region servers:

So, in your case, your region server(s) didn't report their addresses to ZooKeeper (ZK). Possible reasons are:

Case 1. The start-hbase.sh command couldn't ssh to the machine hosting the region server.
Case 2. start-hbase.sh ran fine, but the region server failed to start up.
Case 3. The region server did start up, but couldn't reach ZK.

Please check the region server log under the logs/ directory for error messages. The log may give some clues as to why the region server is not reporting its address to ZK.

Thanks,

--
Tatsuya Kawano (Mr.)
Tokyo, Japan


On Tue, Nov 10, 2009 at 3:40 PM, Jeff Zhang <[email protected]> wrote:
> Hi,
>
> I meet the same problem that I can not start the regionserver.
>
> When I invoke zk_dump
>
> it shows:
>
> HBase tree in ZooKeeper is rooted at /hbase
> Cluster up? true
> In safe mode? true
> Master address: 10.148.224.13:60000
> Region server holding ROOT: null
> Region servers:
>
>
> The following is my hbase-site.xml
>
> <configuration>
> <property>
> <name>hbase.cluster.distributed</name>
> <value>true</value>
> <description>The mode the cluster will be in. Possible values are
> false: standalone and pseudo-distributed setups with managed Zookeeper
> true: fully-distributed with unmanaged Zookeeper Quorum (see hbase-env.sh)
> </description>
> </property>
> <property>
> <name>hbase.rootdir</name>
> <value>hdfs://sha-cs-04:9000/hbase</value>
> <description>The directory shared by region servers.
> </description>
> </property>
> <property>
> <name>hbase.zookeeper.property.clientPort</name>
> <value>2222</value>
> <description>Property from ZooKeeper's config zoo.cfg.
> The port at which the clients will connect.
> </description>
> </property>
> <property>
> <name>hbase.zookeeper.quorum</name>
> <value>sha-cs-01,sha-cs-02,sha-cs-03,sha-cs-05,sha-cs-06</value>
> <description>Comma separated list of servers in the ZooKeeper Quorum.
> For example, "host1.mydomain.com,host2.mydomain.com,host3.mydomain.com".
> By default this is set to localhost for local and pseudo-distributed modes
> of operation. For a fully-distributed setup, this should be set to a full
> list of ZooKeeper quorum servers. If HBASE_MANAGES_ZK is set in hbase-env.sh
> this is the list of servers which we will start/stop ZooKeeper on.
> </description>
> </property>
>
> </configuration>
>
> What's wrong with my configuration ?
>
>
> Thank you in advance.
>
>
> Jeff Zhang
>
>
>
> On Tue, Nov 10, 2009 at 12:47 PM, Tatsuya Kawano
> <[email protected]>wrote:
>
>> Hello,
>>
>> It looks like the master and the region servers are cannot locate each
>> other. HBase 0.20.x uses ZooKeeper (zk) to locate other cluster
>> members, so maybe your zk has wrong information.
>>
>> Can you type zk_dump from hbase shell and let us the result?
>>
>> If the cluster is properly configured, you'll get something like this:
>> =====================================
>> hbase(main):007:0> zk_dump
>>
>> HBase tree in ZooKeeper is rooted at /hbase
>> Cluster up? true
>> In safe mode? false
>> Master address: 172.16.80.26:60000
>> Region server holding ROOT: 172.16.80.27:60020
>> Region servers:
>> - 172.16.80.27:60020
>> - 172.16.80.29:60020
>> - 172.16.80.28:60020
>> =====================================
>>
>>
>> > one of my co-workers apparently can log into his box and submit jobs, but
>> > me or anyone else is still unable to log in.
>>
>> Maybe you're a bit confused; your co-worker seems to be able to use
>> Hadoop Map/Reduce, not HBase.
>>
>>
>> > Does Hbase allow concurrent connections?
>>
>> Yes.
>>
>>
>> >> I think it also says the master is on port 60000
>> >> when the install directions say its supposed to be 60010?
>>
>> Port 60000 is correct. The master uses port 60000 to accept connection
>> from hbase shell and region servers. Port 60010 is for the web-based
>> HBase console.
>>
>>
>> > We tried applying this fix (to explicitly set the master):
>> > http://osdir.com/ml/hbase-user-hadoop-apache/2009-05/msg00321.html
>>
>> No, this is an old way to configure a cluster. You shouldn't use this
>> with HBase 0.20.x
>>
>>
>> Thanks,
>>
>> --
>> Tatsuya Kawano (Mr.)
>> Tokyo, Japan
>>
>>
>>
>> On Tue, Nov 10, 2009 at 1:10 PM, Chris Bates
>> <[email protected]> wrote:
>> > Another interesting data point. We tried applying this fix (to explicitly
>> > set the master):
>> > http://osdir.com/ml/hbase-user-hadoop-apache/2009-05/msg00321.html
>> >
>> > But when I log in to the master node, it takes really long to submit a query
>> > and I get this in response:
>> > hbase(main):001:0> list
>> > NativeException: org.apache.hadoop.hbase.client.RetriesExhaustedException:
>> > Trying to contact region server null for region , row '', but failed after 5
>> > attempts.
>> > Exceptions:
>> > org.apache.hadoop.hbase.client.NoServerForRegionException: Timed out trying to locate root region
>> > org.apache.hadoop.hbase.client.NoServerForRegionException: Timed out trying to locate root region
>> > org.apache.hadoop.hbase.client.NoServerForRegionException: Timed out trying to locate root region
>> > org.apache.hadoop.hbase.client.NoServerForRegionException: Timed out trying to locate root region
>> > org.apache.hadoop.hbase.client.NoServerForRegionException: Timed out trying to locate root region
>> >
>> > from org/apache/hadoop/hbase/client/HConnectionManager.java:1001:in `getRegionServerWithRetries'
>> > from org/apache/hadoop/hbase/client/MetaScanner.java:55:in `metaScan'
>> > from org/apache/hadoop/hbase/client/MetaScanner.java:28:in `metaScan'
>> > from org/apache/hadoop/hbase/client/HConnectionManager.java:432:in `listTables'
>> > from org/apache/hadoop/hbase/client/HBaseAdmin.java:127:in `listTables'
>> > from sun/reflect/NativeMethodAccessorImpl.java:-2:in `invoke0'
>> > from sun/reflect/NativeMethodAccessorImpl.java:39:in `invoke'
>> > from sun/reflect/DelegatingMethodAccessorImpl.java:25:in `invoke'
>> > from java/lang/reflect/Method.java:597:in `invoke'
>> > from org/jruby/javasupport/JavaMethod.java:298:in `invokeWithExceptionHandling'
>> > from org/jruby/javasupport/JavaMethod.java:259:in `invoke'
>> > from org/jruby/java/invokers/InstanceMethodInvoker.java:36:in `call'
>> > from org/jruby/runtime/callsite/CachingCallSite.java:253:in `cacheAndCall'
>> > from org/jruby/runtime/callsite/CachingCallSite.java:72:in `call'
>> > from org/jruby/ast/CallNoArgNode.java:61:in `interpret'
>> > from org/jruby/ast/ForNode.java:104:in `interpret'
>> > ... 116 levels...
>> > from opt/hadoop/hbase_minus_0_dot_20_dot_1/bin/$_dot_dot_/bin/hirb#start:-1:in `call'
>> > from org/jruby/internal/runtime/methods/DynamicMethod.java:226:in `call'
>> > from org/jruby/internal/runtime/methods/CompiledMethod.java:211:in `call'
>> > from org/jruby/internal/runtime/methods/CompiledMethod.java:71:in `call'
>> > from org/jruby/runtime/callsite/CachingCallSite.java:253:in `cacheAndCall'
>> > from org/jruby/runtime/callsite/CachingCallSite.java:72:in `call'
>> > from opt/hadoop/hbase_minus_0_dot_20_dot_1/bin/$_dot_dot_/bin/hirb.rb:497:in `__file__'
>> > from opt/hadoop/hbase_minus_0_dot_20_dot_1/bin/$_dot_dot_/bin/hirb.rb:-1:in `load'
>> > from org/jruby/Ruby.java:577:in `runScript'
>> > from org/jruby/Ruby.java:480:in `runNormally'
>> > from org/jruby/Ruby.java:354:in `runFromMain'
>> > from org/jruby/Main.java:229:in `run'
>> > from org/jruby/Main.java:110:in `run'
>> > from org/jruby/Main.java:94:in `main'
>> > from /opt/hadoop/hbase-0.20.1/bin/../bin/hirb.rb:338:in `list'
>> > from (hbase):2
>> > hbase(main):002:0>
>> >
>> >
>> > On Mon, Nov 9, 2009 at 10:52 PM, Chris Bates <[email protected]> wrote:
>> >
>> >> thanks for your response Sujee. These boxes are all on an internal DNS and
>> >> they all resolve.
>> >>
>> >> one of my co-workers apparently can log into his box and submit jobs, but
>> >> me or anyone else is still unable to log in. Does Hbase allow concurrent
>> >> connections? In Hive I remember having to configure the metastore to be in
>> >> server mode if multiple people were using it.
>> >>
>> >>
>> >> On Mon, Nov 9, 2009 at 10:13 PM, Sujee Maniyam <[email protected]> wrote:
>> >>
>> >>> > [had...@crunch hbase-0.20.1]$ bin/start-hbase.sh
>> >>> >
>> >>> > crunch2: Warning: Permanently added 'crunch2' (RSA) to the list of known
>> >>> > hosts.
>> >>>
>> >>>
>> >>> is your SSH setup correctly? From master, you need to be able to
>> >>> login to all slaves/regionservers without password
>> >>>
>> >>> And I see you are using short hostnames (crunch2, crunch3), do they
>> >>> all resolve correctly? or you need to update /etc/hosts to resolve
>> >>> these to an IP address on all machines.
>> >>>
>> >>> regards
>> >>> Sujee Maniyam
>> >>> --
>> >>> http://sujee.net
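
P.S. For Case 3 in the reply at the top of this thread (the region server started but couldn't reach ZK), one quick check is whether each ZooKeeper quorum member answers on the client port when contacted from the region server machine. Here is a minimal Python 3 sketch of that check. The helper names parse_quorum and zk_ruok are hypothetical, written only for this illustration, and are not part of HBase. It sends ZooKeeper's "ruok" four-letter command and treats an "imok" reply as healthy. Newer ZooKeeper releases may restrict four-letter commands, so a False result there does not necessarily mean the server is unreachable.

```python
import socket


def parse_quorum(quorum_value, client_port):
    """Turn an hbase.zookeeper.quorum value into a list of (host, port) pairs."""
    hosts = [h.strip() for h in quorum_value.split(",") if h.strip()]
    return [(h, client_port) for h in hosts]


def zk_ruok(host, port, timeout=3.0):
    """Send ZooKeeper's 'ruok' command; return True only if the reply is 'imok'.

    Any connection, DNS, or timeout error is reported as False.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.sendall(b"ruok")
            reply = b""
            # Read until we have the 4-byte reply or the server closes.
            while len(reply) < 4:
                chunk = sock.recv(4 - len(reply))
                if not chunk:
                    break
                reply += chunk
            return reply == b"imok"
    except OSError:
        return False
```

With the configuration quoted above, you would call parse_quorum("sha-cs-01,sha-cs-02,sha-cs-03,sha-cs-05,sha-cs-06", 2222) and then run zk_ruok(host, port) for each pair from the region server host. A False for some host suggests a network, name resolution, or port problem between that machine and ZK, which is worth ruling out before digging further into the region server log.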
