Hello,

It looks like the master and the region servers cannot locate each other. HBase 0.20.x uses ZooKeeper (zk) to locate other cluster members, so your zk may be holding wrong information.
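If you want to rule out plain network problems first, here is a rough Python sketch that only checks whether a host and port accept TCP connections. It is not an HBase tool. The host names in the usage note (crunch, crunch2, crunch3) are taken from your console output, and which host runs which daemon is my guess. Port 2181 is ZooKeeper's default client port, which I am assuming you have not changed.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host name and opens the socket;
        # DNS failures, refusals and timeouts all surface as OSError.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_endpoints(endpoints, timeout=3.0):
    """Map each (host, port) pair to True/False depending on reachability."""
    return {(host, port): can_connect(host, port, timeout)
            for host, port in endpoints}
```

For example, run this on each machine in turn (host-to-role mapping assumed, adjust to your cluster):

    check_endpoints([("crunch", 2181), ("crunch", 60000),
                     ("crunch2", 60020), ("crunch3", 60020)])

A False for a pair only tells you that the name did not resolve or the port did not answer from that machine. It says nothing about what ZooKeeper has recorded.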
Can you type zk_dump from the hbase shell and let us know the result? If the cluster is properly configured, you'll get something like this:

=====================================
hbase(main):007:0> zk_dump

HBase tree in ZooKeeper is rooted at /hbase
Cluster up? true
In safe mode? false
Master address: 172.16.80.26:60000
Region server holding ROOT: 172.16.80.27:60020
Region servers:
- 172.16.80.27:60020
- 172.16.80.29:60020
- 172.16.80.28:60020
=====================================

> one of my co-workers apparently can log into his box and submit jobs, but
> me or anyone else is still unable to log in.

I think two things are getting mixed up here: your co-worker seems to be able to use Hadoop Map/Reduce, not HBase.

> Does Hbase allow concurrent connections?

Yes.

>> I think it also says the master is on port 60000
>> when the install directions say its supposed to be 60010?

Port 60000 is correct. The master uses port 60000 to accept connections from the hbase shell and the region servers. Port 60010 is for the web-based HBase console.

> We tried applying this fix (to explicitly set the master):
> http://osdir.com/ml/hbase-user-hadoop-apache/2009-05/msg00321.html

No, this is an old way to configure a cluster. You shouldn't use it with HBase 0.20.x.

Thanks,

--
Tatsuya Kawano (Mr.)
Tokyo, Japan

On Tue, Nov 10, 2009 at 1:10 PM, Chris Bates <[email protected]> wrote:
> Another interesting data point. We tried applying this fix (to explicitly
> set the master):
> http://osdir.com/ml/hbase-user-hadoop-apache/2009-05/msg00321.html
>
> But when I log in to the master node, it takes really long to submit a query
> and I get this in response:
> hbase(main):001:0> list
> NativeException: org.apache.hadoop.hbase.client.RetriesExhaustedException:
> Trying to contact region server null for region , row '', but failed after 5
> attempts.
> Exceptions:
> org.apache.hadoop.hbase.client.NoServerForRegionException: Timed out trying
> to locate root region
> org.apache.hadoop.hbase.client.NoServerForRegionException: Timed out trying
> to locate root region
> org.apache.hadoop.hbase.client.NoServerForRegionException: Timed out trying
> to locate root region
> org.apache.hadoop.hbase.client.NoServerForRegionException: Timed out trying
> to locate root region
> org.apache.hadoop.hbase.client.NoServerForRegionException: Timed out trying
> to locate root region
>
> from org/apache/hadoop/hbase/client/HConnectionManager.java:1001:in
> `getRegionServerWithRetries'
> from org/apache/hadoop/hbase/client/MetaScanner.java:55:in `metaScan'
> from org/apache/hadoop/hbase/client/MetaScanner.java:28:in `metaScan'
> from org/apache/hadoop/hbase/client/HConnectionManager.java:432:in
> `listTables'
> from org/apache/hadoop/hbase/client/HBaseAdmin.java:127:in `listTables'
> from sun/reflect/NativeMethodAccessorImpl.java:-2:in `invoke0'
> from sun/reflect/NativeMethodAccessorImpl.java:39:in `invoke'
> from sun/reflect/DelegatingMethodAccessorImpl.java:25:in `invoke'
> from java/lang/reflect/Method.java:597:in `invoke'
> from org/jruby/javasupport/JavaMethod.java:298:in
> `invokeWithExceptionHandling'
> from org/jruby/javasupport/JavaMethod.java:259:in `invoke'
> from org/jruby/java/invokers/InstanceMethodInvoker.java:36:in `call'
> from org/jruby/runtime/callsite/CachingCallSite.java:253:in `cacheAndCall'
> from org/jruby/runtime/callsite/CachingCallSite.java:72:in `call'
> from org/jruby/ast/CallNoArgNode.java:61:in `interpret'
> from org/jruby/ast/ForNode.java:104:in `interpret'
> ... 116 levels...
> from
> opt/hadoop/hbase_minus_0_dot_20_dot_1/bin/$_dot_dot_/bin/hirb#start:-1:in
> `call'
> from org/jruby/internal/runtime/methods/DynamicMethod.java:226:in `call'
> from org/jruby/internal/runtime/methods/CompiledMethod.java:211:in `call'
> from org/jruby/internal/runtime/methods/CompiledMethod.java:71:in `call'
> from org/jruby/runtime/callsite/CachingCallSite.java:253:in `cacheAndCall'
> from org/jruby/runtime/callsite/CachingCallSite.java:72:in `call'
> from opt/hadoop/hbase_minus_0_dot_20_dot_1/bin/$_dot_dot_/bin/hirb.rb:497:in
> `__file__'
> from opt/hadoop/hbase_minus_0_dot_20_dot_1/bin/$_dot_dot_/bin/hirb.rb:-1:in
> `load'
> from org/jruby/Ruby.java:577:in `runScript'
> from org/jruby/Ruby.java:480:in `runNormally'
> from org/jruby/Ruby.java:354:in `runFromMain'
> from org/jruby/Main.java:229:in `run'
> from org/jruby/Main.java:110:in `run'
> from org/jruby/Main.java:94:in `main'
> from /opt/hadoop/hbase-0.20.1/bin/../bin/hirb.rb:338:in `list'
> from (hbase):2hbase(main):002:0>
>
>
> On Mon, Nov 9, 2009 at 10:52 PM, Chris Bates <
> [email protected]> wrote:
>
>> thanks for your response Sujee. These boxes are all on an internal DNS and
>> they all resolve.
>>
>> one of my co-workers apparently can log into his box and submit jobs, but
>> me or anyone else is still unable to log in. Does Hbase allow concurrent
>> connections? In Hive I remember having to configure the metastore to be in
>> server mode if multiple people were using it.
>>
>>
>> On Mon, Nov 9, 2009 at 10:13 PM, Sujee Maniyam <[email protected]> wrote:
>>
>>> > [had...@crunch hbase-0.20.1]$ bin/start-hbase.sh
>>> >
>>> > crunch2: Warning: Permanently added 'crunch2' (RSA) to the list of known
>>> > hosts.
>>>
>>>
>>> is your SSH setup correctly? From master, you need to be able to
>>> login to all slaves/regionservers without password
>>>
>>> And I see you are using short hostnames (crunch2, crunch3), do they
>>> all resolve correctly? or you need to update /etc/hosts to resolve
>>> these to an IP address on all machines.
>>>
>>> regards
>>> Sujee Maniyam
>>> --
>>> http://sujee.net
>>>
>>
>>
>
