Re: HBase 0.20.1 Distributed Install Problems

Tatsuya Kawano Tue, 10 Nov 2009 02:23:35 -0800

Hi,

> When I invoke zk_dump
>
> it shows:
>
> HBase tree in ZooKeeper is rooted at /hbase
>  Cluster up? true
>  In safe mode? true
>  Master address: 10.148.224.13:60000
>  Region server holding ROOT: null
>  Region servers:


So, in your case, the region servers have not reported their addresses
to ZooKeeper (ZK). Possible reasons are:

Case 1. The start-hbase.sh command couldn't ssh to the machine that
hosts the region server.
Case 2. start-hbase.sh ran fine, but the region server failed to start up.
Case 3. The region server did start up, but couldn't reach ZK.
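
Case 3 can be checked from the region server machine itself. Below is a
minimal Python sketch, not an HBase tool: it only tests whether a TCP
connection to each ZK quorum member can be opened. The host names and
port 2222 are copied from the hbase-site.xml quoted further down; the
function names are made up for this illustration.

```python
import socket

# Hosts and client port copied from the hbase-site.xml quoted in this thread.
ZK_QUORUM = ["sha-cs-01", "sha-cs-02", "sha-cs-03", "sha-cs-05", "sha-cs-06"]
ZK_CLIENT_PORT = 2222


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


def check_quorum(hosts, port, timeout=3.0):
    """Map each host name to True/False depending on TCP reachability."""
    return {host: can_connect(host, port, timeout) for host in hosts}
```

Calling check_quorum(ZK_QUORUM, ZK_CLIENT_PORT) on the region server
machine and printing the result shows which quorum members it can reach.
A False entry points to a name resolution, firewall or port problem. A
True entry only means that the port is open, not that ZK is healthy.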

Please check the region server log under the logs/ directory for error
messages. It may give a clue as to why the region server is not
reporting its address to ZK.
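
If the log is long, a small filter helps to find the relevant lines. The
following is a rough Python sketch under one assumption: that problem
lines contain a log4j-style level token (ERROR or FATAL) or the word
"Exception". The function names are hypothetical, and the log file path
is whatever file you find under logs/ on your region server.

```python
import re

# Assumption: problem lines carry an ERROR/FATAL level token or mention
# an exception class name.
PROBLEM = re.compile(r"\b(ERROR|FATAL)\b|Exception")


def find_problem_lines(lines, limit=20):
    """Return up to `limit` (line_number, text) pairs that look like errors."""
    hits = []
    for number, line in enumerate(lines, start=1):
        if PROBLEM.search(line):
            hits.append((number, line.rstrip("\n")))
            if len(hits) >= limit:
                break
    return hits


def scan_file(path, limit=20):
    """Apply find_problem_lines to the log file at `path`."""
    with open(path, errors="replace") as handle:
        return find_problem_lines(handle, limit)
```

scan_file takes the path of the region server log and returns the first
matching lines together with their line numbers. The earliest hits are
usually the most informative ones.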

Thanks,

-- 
Tatsuya Kawano (Mr.)
Tokyo, Japan


On Tue, Nov 10, 2009 at 3:40 PM, Jeff Zhang <[email protected]> wrote:
> Hi,
>
> I have the same problem: I cannot start the region server.
>
> When I invoke zk_dump
>
> it shows:
>
> HBase tree in ZooKeeper is rooted at /hbase
>  Cluster up? true
>  In safe mode? true
>  Master address: 10.148.224.13:60000
>  Region server holding ROOT: null
>  Region servers:
>
>
> The following is my hbase-site.xml
>
> <configuration>
>  <property>
>    <name>hbase.cluster.distributed</name>
>    <value>true</value>
>    <description>The mode the cluster will be in. Possible values are
>      false: standalone and pseudo-distributed setups with managed Zookeeper
>      true: fully-distributed with unmanaged Zookeeper Quorum (see hbase-env.sh)
>    </description>
>  </property>
>  <property>
>    <name>hbase.rootdir</name>
>    <value>hdfs://sha-cs-04:9000/hbase</value>
>    <description>The directory shared by region servers.
>    </description>
>  </property>
>  <property>
>      <name>hbase.zookeeper.property.clientPort</name>
>      <value>2222</value>
>      <description>Property from ZooKeeper's config zoo.cfg.
>      The port at which the clients will connect.
>      </description>
>   </property>
>   <property>
>      <name>hbase.zookeeper.quorum</name>
>      <value>sha-cs-01,sha-cs-02,sha-cs-03,sha-cs-05,sha-cs-06</value>
>      <description>Comma separated list of servers in the ZooKeeper Quorum.
>      For example, "host1.mydomain.com,host2.mydomain.com,host3.mydomain.com".
>      By default this is set to localhost for local and pseudo-distributed modes
>      of operation. For a fully-distributed setup, this should be set to a full
>      list of ZooKeeper quorum servers. If HBASE_MANAGES_ZK is set in hbase-env.sh
>      this is the list of servers which we will start/stop ZooKeeper on.
>      </description>
>    </property>
>
> </configuration>
>
> What's wrong with my configuration ?
>
>
> Thank you in advance.
>
>
> Jeff Zhang
>
>
>
> On Tue, Nov 10, 2009 at 12:47 PM, Tatsuya Kawano
> <[email protected]>wrote:
>
>> Hello,
>>
>> It looks like the master and the region servers cannot locate each
>> other. HBase 0.20.x uses ZooKeeper (zk) to locate other cluster
>> members, so maybe your zk has wrong information.
>>
>> Can you type zk_dump in the hbase shell and let us know the result?
>>
>> If the cluster is properly configured, you'll get something like this:
>> =====================================
>> hbase(main):007:0> zk_dump
>>
>> HBase tree in ZooKeeper is rooted at /hbase
>>  Cluster up? true
>>  In safe mode? false
>>  Master address: 172.16.80.26:60000
>>  Region server holding ROOT: 172.16.80.27:60020
>>  Region servers:
>>   - 172.16.80.27:60020
>>   - 172.16.80.29:60020
>>   - 172.16.80.28:60020
>> =====================================
>>
>>
>> > one of my co-workers apparently can log into his box and submit jobs, but
>> > me or anyone else is still unable to log in.
>>
>> Maybe you're a bit confused; your co-worker seems to be able to use
>> Hadoop Map/Reduce, not HBase.
>>
>>
>> > Does Hbase allow concurrent connections?
>>
>> Yes.
>>
>>
>> >> I think it also says the master is on port 60000
>> >> when the install directions say its supposed to be 60010?
>>
>> Port 60000 is correct. The master uses port 60000 to accept connection
>> from hbase shell and region servers. Port 60010 is for the web-based
>> HBase console.
>>
>>
>> > We tried applying this fix (to explicitly set the master):
>> > http://osdir.com/ml/hbase-user-hadoop-apache/2009-05/msg00321.html
>>
>> No, this is an old way to configure a cluster. You shouldn't use this
>> with HBase 0.20.x
>>
>>
>> Thanks,
>>
>> --
>> Tatsuya Kawano (Mr.)
>> Tokyo, Japan
>>
>>
>>
>> On Tue, Nov 10, 2009 at 1:10 PM, Chris Bates
>> <[email protected]> wrote:
>> > Another interesting data point.  We tried applying this fix (to explicitly
>> > set the master):
>> > http://osdir.com/ml/hbase-user-hadoop-apache/2009-05/msg00321.html
>> >
>> > But when I log in to the master node, it takes really long to submit a query
>> > and I get this in response:
>> > hbase(main):001:0> list
>> > NativeException: org.apache.hadoop.hbase.client.RetriesExhaustedException:
>> > Trying to contact region server null for region , row '', but failed after 5
>> > attempts.
>> > Exceptions:
>> > org.apache.hadoop.hbase.client.NoServerForRegionException: Timed out trying to locate root region
>> > org.apache.hadoop.hbase.client.NoServerForRegionException: Timed out trying to locate root region
>> > org.apache.hadoop.hbase.client.NoServerForRegionException: Timed out trying to locate root region
>> > org.apache.hadoop.hbase.client.NoServerForRegionException: Timed out trying to locate root region
>> > org.apache.hadoop.hbase.client.NoServerForRegionException: Timed out trying to locate root region
>> >
>> > from org/apache/hadoop/hbase/client/HConnectionManager.java:1001:in `getRegionServerWithRetries'
>> > from org/apache/hadoop/hbase/client/MetaScanner.java:55:in `metaScan'
>> > from org/apache/hadoop/hbase/client/MetaScanner.java:28:in `metaScan'
>> > from org/apache/hadoop/hbase/client/HConnectionManager.java:432:in `listTables'
>> > from org/apache/hadoop/hbase/client/HBaseAdmin.java:127:in `listTables'
>> > from sun/reflect/NativeMethodAccessorImpl.java:-2:in `invoke0'
>> > from sun/reflect/NativeMethodAccessorImpl.java:39:in `invoke'
>> > from sun/reflect/DelegatingMethodAccessorImpl.java:25:in `invoke'
>> > from java/lang/reflect/Method.java:597:in `invoke'
>> > from org/jruby/javasupport/JavaMethod.java:298:in `invokeWithExceptionHandling'
>> > from org/jruby/javasupport/JavaMethod.java:259:in `invoke'
>> > from org/jruby/java/invokers/InstanceMethodInvoker.java:36:in `call'
>> > from org/jruby/runtime/callsite/CachingCallSite.java:253:in `cacheAndCall'
>> > from org/jruby/runtime/callsite/CachingCallSite.java:72:in `call'
>> > from org/jruby/ast/CallNoArgNode.java:61:in `interpret'
>> > from org/jruby/ast/ForNode.java:104:in `interpret'
>> > ... 116 levels...
>> > from opt/hadoop/hbase_minus_0_dot_20_dot_1/bin/$_dot_dot_/bin/hirb#start:-1:in `call'
>> > from org/jruby/internal/runtime/methods/DynamicMethod.java:226:in `call'
>> > from org/jruby/internal/runtime/methods/CompiledMethod.java:211:in `call'
>> > from org/jruby/internal/runtime/methods/CompiledMethod.java:71:in `call'
>> > from org/jruby/runtime/callsite/CachingCallSite.java:253:in `cacheAndCall'
>> > from org/jruby/runtime/callsite/CachingCallSite.java:72:in `call'
>> > from opt/hadoop/hbase_minus_0_dot_20_dot_1/bin/$_dot_dot_/bin/hirb.rb:497:in `__file__'
>> > from opt/hadoop/hbase_minus_0_dot_20_dot_1/bin/$_dot_dot_/bin/hirb.rb:-1:in `load'
>> > from org/jruby/Ruby.java:577:in `runScript'
>> > from org/jruby/Ruby.java:480:in `runNormally'
>> > from org/jruby/Ruby.java:354:in `runFromMain'
>> > from org/jruby/Main.java:229:in `run'
>> > from org/jruby/Main.java:110:in `run'
>> > from org/jruby/Main.java:94:in `main'
>> > from /opt/hadoop/hbase-0.20.1/bin/../bin/hirb.rb:338:in `list'
>> > from (hbase):2
>> > hbase(main):002:0>
>> >
>> >
>> > On Mon, Nov 9, 2009 at 10:52 PM, Chris Bates <
>> > [email protected]> wrote:
>> >
>> >> thanks for your response Sujee.  These boxes are all on an internal DNS and
>> >> they all resolve.
>> >>
>> >> one of my co-workers apparently can log into his box and submit jobs, but
>> >> me or anyone else is still unable to log in.  Does Hbase allow concurrent
>> >> connections?  In Hive I remember having to configure the metastore to be in
>> >> server mode if multiple people were using it.
>> >>
>> >>
>> >> On Mon, Nov 9, 2009 at 10:13 PM, Sujee Maniyam <[email protected]> wrote:
>> >>
>> >>> > [had...@crunch hbase-0.20.1]$ bin/start-hbase.sh
>> >>> >
>> >>> > crunch2: Warning: Permanently added 'crunch2' (RSA) to the list of known
>> >>> > hosts.
>> >>>
>> >>>
>> >>> is your SSH setup correctly?  From master, you need to be able to
>> >>> login to all slaves/regionservers without password
>> >>>
>> >>> And I see you are using short hostnames (crunch2, crunch3), do they
>> >>> all resolve correctly?  or you need to update /etc/hosts to resolve
>> >>> these to an IP address on all machines.
>> >>>
>> >>> regards
>> >>> Sujee Maniyam
>> >>> --
>> >>> http://sujee.net
