Again, I really appreciate the help. I removed the master from the region server list and made sure the rest of the machines had an updated list. Still no region servers:

hbase(main):001:0> zk_dump

HBase tree in ZooKeeper is rooted at /hbase
  Cluster up? true
  In safe mode? true
  Master address: 172.16.1.46:60000
  Region server holding ROOT: 172.16.1.46:60020
  Region servers:

hbase(main):002:0> status 'simple'
0 live servers
0 dead servers

I checked the /etc/hosts file on all machines. They all have

127.0.0.1 localhost.localdomain localhost

and then their other mappings for other domains, with the box name mapping removed.

There are no regionserver logs. But the master log is this:

2009-11-11 03:02:34,798 INFO org.apache.hadoop.hbase.master.RegionManager: -ROOT- region unset (but not set to be reassigned)
2009-11-11 03:02:34,799 INFO org.apache.hadoop.hbase.master.RegionManager: ROOT inserted into regionsInTransition
2009-11-11 03:02:35,078 INFO org.apache.zookeeper.ClientCnxn: Attempting connection to server chanel2/172.16.1.46:2181
2009-11-11 03:02:35,078 INFO org.apache.zookeeper.ClientCnxn: Priming connection to java.nio.channels.SocketChannel[connected local=/172.16.1.46:53335 remote=chanel2/172.16.1.46:2181]
2009-11-11 03:02:35,078 INFO org.apache.zookeeper.ClientCnxn: Server connection successful
2009-11-11 03:02:35,179 INFO org.apache.hadoop.hbase.master.HMaster: HMaster initialized on 172.16.1.46:60000
2009-11-11 03:02:35,197 INFO org.apache.hadoop.metrics.jvm.JvmMetrics: Initializing JVM Metrics with processName=Master, sessionId=HMaster
2009-11-11 03:02:35,198 INFO org.apache.hadoop.hbase.master.metrics.MasterMetrics: Initialized
2009-11-11 03:02:35,373 INFO org.apache.hadoop.http.HttpServer: Port returned by webServer.getConnectors()[0].getLocalPort() before open() is -1.
Opening the listener on 60010
2009-11-11 03:02:35,374 INFO org.apache.hadoop.http.HttpServer: listener.getLocalPort() returned 60010 webServer.getConnectors()[0].getLocalPort() returned 60010
2009-11-11 03:02:35,374 INFO org.apache.hadoop.http.HttpServer: Jetty bound to port 60010
2009-11-11 03:02:52,692 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server Responder: starting
2009-11-11 03:02:52,693 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server listener on 60000: starting
2009-11-11 03:02:52,695 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 0 on 60000: starting
2009-11-11 03:02:52,695 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 1 on 60000: starting
2009-11-11 03:02:52,696 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 2 on 60000: starting
2009-11-11 03:02:52,696 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 3 on 60000: starting
2009-11-11 03:02:52,696 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 4 on 60000: starting
2009-11-11 03:02:52,697 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 5 on 60000: starting
2009-11-11 03:02:52,697 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 6 on 60000: starting
2009-11-11 03:02:52,697 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 7 on 60000: starting
2009-11-11 03:02:52,698 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 8 on 60000: starting
2009-11-11 03:02:52,698 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 9 on 60000: starting
2009-11-11 03:03:34,719 INFO org.apache.hadoop.hbase.master.ServerManager: 0 region servers, 0 dead, average load NaN
2009-11-11 03:03:35,200 INFO org.apache.hadoop.hbase.master.BaseScanner: All 0 .META. region(s) scanned

On Wed, Nov 11, 2009 at 2:39 AM, Jeff Zhang <[email protected]> wrote:

> Hi Jean,
>
> Thank you, after I remove the mapping from sha-cs-03 stuff to localhost it
> works.
>
> But I installed hadoop successfully on these machines before, is hbase
> different from hadoop about the ip mapping ?
>
>
> Jeff Zhang
>
>
>
> On Wed, Nov 11, 2009 at 1:29 PM, Jean-Daniel Cryans <[email protected]> wrote:
>
> > Check your OS networking configuration, make sure stuff doesn't resolve
> > to localhost or 127.0.0.1 or 127.0.1.1
> >
> > Also you said you can't run the list, what does it do then?
> >
> > J-D
> >
> > On Tue, Nov 10, 2009 at 9:23 PM, Jeff Zhang <[email protected]> wrote:
> > > *I configure the regionservers in the file regionservers as following:*
> > >
> > > sha-cs-01
> > > sha-cs-02
> > > sha-cs-03
> > > sha-cs-05
> > > sha-cs-06
> > >
> > > *And also I configure the zookeeper in file hbase-site.xml as following:*
> > >
> > > <configuration>
> > >   <property>
> > >     <name>hbase.cluster.distributed</name>
> > >     <value>true</value>
> > >     <description>The mode the cluster will be in. Possible values are
> > >       false: standalone and pseudo-distributed setups with managed Zookeeper
> > >       true: fully-distributed with unmanaged Zookeeper Quorum (see hbase-env.sh)
> > >     </description>
> > >   </property>
> > >   <property>
> > >     <name>hbase.zookeeper.property.clientPort</name>
> > >     <value>2222</value>
> > >     <description>Property from ZooKeeper's config zoo.cfg.
> > >     The port at which the clients will connect.
> > >     </description>
> > >   </property>
> > >   <property>
> > >     <name>hbase.zookeeper.quorum</name>
> > >     <value>*sha-cs-01,sha-cs-02,sha-cs-03,sha-cs-04,sha-cs-06*</value>
> > >     <description>Comma separated list of servers in the ZooKeeper Quorum.
> > >     For example, "host1.mydomain.com,host2.mydomain.com,host3.mydomain.com".
> > >     By default this is set to localhost for local and pseudo-distributed modes
> > >     of operation. For a fully-distributed setup, this should be set to a full
> > >     list of ZooKeeper quorum servers. If HBASE_MANAGES_ZK is set in hbase-env.sh
> > >     this is the list of servers which we will start/stop ZooKeeper on.
> > >     </description>
> > >   </property>
> > >   <property>
> > >     <name>hbase.rootdir</name>
> > >     <value>hdfs://sha-cs-04:9000/hbase</value>
> > >     <description>The directory shared by region servers.
> > >     </description>
> > >   </property>
> > >
> > > </configuration>
> > >
> > >
> > > I still do not understand what's wrong with my configuration ?
> > >
> > >
> > > Jeff Zhang
> > >
> > >
> > >
> > > On Wed, Nov 11, 2009 at 12:56 PM, Jean-Daniel Cryans <[email protected]> wrote:
> > >
> > >> Please read my answer to Chris (wrote about 10-15 minutes ago), you
> > >> also seem to confuse regionservers and zookeeper quorum members.
> > >>
> > >> In this case it also seems some region servers registered themselves
> > >> as localhost and then with their good address the master probably gave
> > >> them. Please check your OS network configurations and make sure the
> > >> hostname points at the right place.
> > >>
> > >> J-D
> > >>
> > >> On Tue, Nov 10, 2009 at 8:47 PM, Jeff Zhang <[email protected]> wrote:
> > >> > Hi Jean,
> > >> >
> > >> > I try the hbase 0.20.2, I look the logs, it seems the master the regions
> > >> > works.
> > >> >
> > >> > But I can not run list command on hbase shell. When I invoke command status
> > >> > 'simple' on hbase shell.
It shows the following message: > > >> > 09/11/11 12:42:55 DEBUG client.HConnectionManager$ClientZKWatcher: > Got > > >> > ZooKeeper event, state: SyncConnected, type: None, path: null > > >> > 09/11/11 12:42:55 DEBUG zookeeper.ZooKeeperWrapper: Read ZNode > > >> /hbase/master > > >> > got 10.148.224.13:60000 > > >> > 8 servers, 0 dead, 0.1250 average load > > >> > hbase(main):002:0> status 'simple' > > >> > 8 live servers > > >> > localhost:60020 1257914319445 > > >> > requests=0, regions=0, usedHeap=0, maxHeap=0 > > >> > sha-cs-03:60020 1257914321331 > > >> > requests=0, regions=0, usedHeap=33, maxHeap=991 > > >> > localhost:60020 1257914320265 > > >> > requests=0, regions=0, usedHeap=0, maxHeap=0 > > >> > sha-cs-01:60020 1257914320551 > > >> > requests=0, regions=1, usedHeap=34, maxHeap=991 > > >> > sha-cs-05:60020 1257914322656 > > >> > requests=0, regions=0, usedHeap=33, maxHeap=991 > > >> > sha-cs-06:60020 1257914321467 > > >> > requests=0, regions=0, usedHeap=34, maxHeap=991 > > >> > localhost:60020 1257914320202 > > >> > requests=0, regions=0, usedHeap=0, maxHeap=0 > > >> > localhost:60020 1257914321532 > > >> > requests=0, regions=0, usedHeap=0, maxHeap=0 > > >> > > > >> > > > >> > It's weired that why here I have 3 localhost zookeeper, actually I > set > > 5 > > >> > machines on hbase.zookeeper.quorum > > >> > > > >> > > > >> > > > >> > Jeff Zhang > > >> > > > >> > > > >> > > > >> > > > >> > On Wed, Nov 11, 2009 at 9:47 AM, Jean-Daniel Cryans < > > [email protected] > > >> >wrote: > > >> > > > >> >> This particular problem is fixed in the current 0.20 branch and we > > >> >> just released a candidate for 0.20.2, you can get it here > > >> >> http://people.apache.org/~jdcryans/hbase-0.20.2-candidate-1/< > http://people.apache.org/%7Ejdcryans/hbase-0.20.2-candidate-1/> > > <http://people.apache.org/%7Ejdcryans/hbase-0.20.2-candidate-1/> > > >> <http://people.apache.org/%7Ejdcryans/hbase-0.20.2-candidate-1/> > > >> >> > > >> >> J-D > > >> >> > > >> >> On 
Tue, Nov 10, 2009 at 5:43 PM, Jeff Zhang <[email protected]> > > wrote: > > >> >> > The following is the region server's log : > > >> >> > > > >> >> > > > >> >> > 2009-11-10 18:09:08,062 INFO org.apache.hadoop.ipc.HBaseServer: > IPC > > >> >> Server > > >> >> > handler 3 on 60020: starting > > >> >> > 2009-11-10 18:09:08,063 INFO org.apache.hadoop.ipc.HBaseServer: > IPC > > >> >> Server > > >> >> > handler 4 on 60020: starting > > >> >> > 2009-11-10 18:09:08,063 INFO org.apache.hadoop.ipc.HBaseServer: > IPC > > >> >> Server > > >> >> > handler 5 on 60020: starting > > >> >> > 2009-11-10 18:09:08,063 INFO org.apache.hadoop.ipc.HBaseServer: > IPC > > >> >> Server > > >> >> > handler 6 on 60020: starting > > >> >> > 2009-11-10 18:09:08,063 INFO org.apache.hadoop.ipc.HBaseServer: > IPC > > >> >> Server > > >> >> > handler 7 on 60020: starting > > >> >> > 2009-11-10 18:09:08,063 INFO org.apache.hadoop.ipc.HBaseServer: > IPC > > >> >> Server > > >> >> > handler 8 on 60020: starting > > >> >> > 2009-11-10 18:09:08,063 INFO > > >> >> > org.apache.hadoop.hbase.regionserver.HRegionServer: HRegionServer > > >> started > > >> >> > at: 10.148.224.11:60020 > > >> >> > 2009-11-10 18:09:08,064 INFO org.apache.hadoop.ipc.HBaseServer: > IPC > > >> >> Server > > >> >> > handler 9 on 60020: starting > > >> >> > 2009-11-10 18:09:08,070 INFO > > >> >> org.apache.hadoop.hbase.regionserver.StoreFile: > > >> >> > Allocating LruBlockCache with maximum size 198.3m > > >> >> > 2009-11-10 18:09:08,095 INFO > > >> >> > org.apache.hadoop.hbase.regionserver.HRegionServer: > > >> >> MSG_CALL_SERVER_STARTUP > > >> >> > 2009-11-10 18:09:08,229 INFO > > >> org.apache.hadoop.hbase.regionserver.HLog: > > >> >> HLog > > >> >> > configuration: blocksize=67108864, rollsize=63753420, > enabled=true, > > >> >> > flushlogentries=100, optionallogflushinternal=10000ms > > >> >> > 2009-11-10 18:09:08,253 INFO > > >> org.apache.hadoop.hbase.regionserver.HLog: > > >> >> New > > >> >> > hlog 
/hbase/.logs/10.148.224.11 > > >> >> ,60020,1257847748205/hlog.dat.1257847748229 > > >> >> > 2009-11-10 18:09:08,255 INFO > > >> >> > org.apache.hadoop.hbase.regionserver.HRegionServer: Telling > master > > at > > >> >> > 10.148.224.13:60000 that we are up > > >> >> > 2009-11-10 18:09:08,302 FATAL > > >> >> > org.apache.hadoop.hbase.regionserver.HRegionServer: Unhandled > > >> exception. > > >> >> > Aborting... > > >> >> > java.lang.NullPointerException > > >> >> > at > > >> >> > > > >> >> > > >> > > > org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:459) > > >> >> > at java.lang.Thread.run(Thread.java:619) > > >> >> > 2009-11-10 18:09:08,304 INFO > > >> >> > org.apache.hadoop.hbase.regionserver.HRegionServer: Dump of > > metrics: > > >> >> > request=0.0, regions=0, stores=0, storefiles=0, > > storefileIndexSize=0, > > >> >> > memstoreSize=0, usedHeap=31, maxHeap=99 > > >> >> > 1, blockCacheSize=1707288, blockCacheFree=206264664, > > >> blockCacheCount=0, > > >> >> > blockCacheHitRatio=0 > > >> >> > 2009-11-10 18:09:08,304 INFO org.apache.hadoop.ipc.HBaseServer: > > >> Stopping > > >> >> > server on 60020 > > >> >> > 2009-11-10 18:09:08,304 INFO org.apache.hadoop.ipc.HBaseServer: > IPC > > >> >> Server > > >> >> > handler 0 on 60020: exiting > > >> >> > 2009-11-10 18:09:08,304 INFO org.apache.hadoop.ipc.HBaseServer: > > >> Stopping > > >> >> IPC > > >> >> > Server listener on 60020 > > >> >> > 2009-11-10 18:09:08,304 INFO org.apache.hadoop.ipc.HBaseServer: > IPC > > >> >> Server > > >> >> > handler 1 on 60020: exiting > > >> >> > 2009-11-10 18:09:08,304 INFO org.apache.hadoop.ipc.HBaseServer: > IPC > > >> >> Server > > >> >> > handler 2 on 60020: exiting > > >> >> > 2009-11-10 18:09:08,305 INFO org.apache.hadoop.ipc.HBaseServer: > IPC > > >> >> Server > > >> >> > handler 3 on 60020: exiting > > >> >> > 2009-11-10 18:09:08,305 INFO org.apache.hadoop.ipc.HBaseServer: > IPC > > >> >> Server > > >> >> > handler 4 on 60020: exiting > > >> >> > 
2009-11-10 18:09:08,305 INFO org.apache.hadoop.ipc.HBaseServer: > IPC > > >> >> Server > > >> >> > handler 5 on 60020: exiting > > >> >> > 2009-11-10 18:09:08,305 INFO org.apache.hadoop.ipc.HBaseServer: > IPC > > >> >> Server > > >> >> > handler 6 on 60020: exiting > > >> >> > 2009-11-10 18:09:08,305 INFO org.apache.hadoop.ipc.HBaseServer: > IPC > > >> >> Server > > >> >> > handler 7 on 60020: exiting > > >> >> > 2009-11-10 18:09:08,305 INFO org.apache.hadoop.ipc.HBaseServer: > IPC > > >> >> Server > > >> >> > handler 8 on 60020: exiting > > >> >> > 2009-11-10 18:09:08,305 INFO org.apache.hadoop.ipc.HBaseServer: > IPC > > >> >> Server > > >> >> > handler 9 on 60020: exiting > > >> >> > 2009-11-10 18:09:08,306 INFO > > >> >> > org.apache.hadoop.hbase.regionserver.HRegionServer: Stopping > > >> infoServer > > >> >> > 2009-11-10 18:09:08,307 INFO org.apache.hadoop.ipc.HBaseServer: > > >> Stopping > > >> >> IPC > > >> >> > Server Responder > > >> >> > 2009-11-10 18:09:08,412 INFO > > >> >> > org.apache.hadoop.hbase.regionserver.MemStoreFlusher: > > >> >> > regionserver/127.0.0.1:60020.cacheFlusher exiting > > >> >> > 2009-11-10 18:09:08,412 INFO > > >> >> > org.apache.hadoop.hbase.regionserver.LogFlusher: > > >> >> > regionserver/127.0.0.1:60020.logFlusher exiting > > >> >> > 2009-11-10 18:09:08,412 INFO > > >> >> > org.apache.hadoop.hbase.regionserver.CompactSplitThread: > > >> >> > regionserver/127.0.0.1:60020.compactor exiting > > >> >> > 2009-11-10 18:09:08,412 INFO > > >> >> org.apache.hadoop.hbase.regionserver.LogRoller: > > >> >> > LogRoller exiting. 
> > >> >> > 2009-11-10 18:09:08,413 INFO > > >> >> > > > >> >> > > >> > > > org.apache.hadoop.hbase.regionserver.HRegionServer$MajorCompactionChecker: > > >> >> > regionserver/127.0.0.1:60020.majorCompactionChecker exiting > > >> >> > 2009-11-10 18:09:08,427 INFO > > >> >> > org.apache.hadoop.hbase.regionserver.HRegionServer: On abort, > > closed > > >> hlog > > >> >> > 2009-11-10 18:09:08,428 INFO > > >> >> > org.apache.hadoop.hbase.regionserver.HRegionServer: aborting > server > > >> at: > > >> >> > 10.148.224.11:60020 > > >> >> > 2009-11-10 18:09:17,489 INFO > > >> >> > org.apache.hadoop.hbase.regionserver.HRegionServer: worker thread > > >> exiting > > >> >> > 2009-11-10 18:09:17,489 INFO org.apache.zookeeper.ZooKeeper: > > Closing > > >> >> > session: 0x324dcceb05c0003 > > >> >> > 2009-11-10 18:09:17,490 INFO org.apache.zookeeper.ClientCnxn: > > Closing > > >> >> > ClientCnxn for session: 0x324dcceb05c0003 > > >> >> > 2009-11-10 18:09:17,495 INFO org.apache.hadoop.hbase.Leases: > > >> >> > regionserver/127.0.0.1:60020.leaseChecker closing leases > > >> >> > 2009-11-10 18:09:17,495 INFO org.apache.hadoop.hbase.Leases: > > >> >> > regionserver/127.0.0.1:60020.leaseChecker closed leases > > >> >> > 2009-11-10 18:09:17,500 INFO org.apache.zookeeper.ClientCnxn: > > >> Exception > > >> >> > while closing send thread for session 0x324dcceb05c0003 : Read > > error > > >> rc = > > >> >> -1 > > >> >> > java.nio.DirectByteBuffer[pos=0 lim=4 cap=4] > > >> >> > 2009-11-10 18:09:17,604 INFO org.apache.zookeeper.ClientCnxn: > > >> >> Disconnecting > > >> >> > ClientCnxn for session: 0x324dcceb05c0003 > > >> >> > 2009-11-10 18:09:17,604 INFO org.apache.zookeeper.ZooKeeper: > > Session: > > >> >> > 0x324dcceb05c0003 closed > > >> >> > 2009-11-10 18:09:17,605 INFO > > >> >> > org.apache.hadoop.hbase.regionserver.HRegionServer: regionserver/ > > >> >> > 127.0.0.1:60020 exiting > > >> >> > 2009-11-10 18:09:17,605 INFO org.apache.zookeeper.ClientCnxn: > > >> EventThread > > >> >> 
> shut down > > >> >> > 2009-11-10 18:09:17,606 INFO > > >> >> > org.apache.hadoop.hbase.regionserver.HRegionServer: Starting > > shutdown > > >> >> > thread. > > >> >> > 2009-11-10 18:09:17,606 INFO > > >> >> > org.apache.hadoop.hbase.regionserver.HRegionServer: Shutdown > thread > > >> >> complete > > >> >> > > > >> >> > On Tue, Nov 10, 2009 at 10:55 PM, Andrew Purtell < > > [email protected] > > >> >> >wrote: > > >> >> > > > >> >> >> When you try to start the region servers, what do you see in the > > log? > > >> >> >> > > >> >> >> If you don't change the client port > > >> >> (hbase.zookeeper.property.clientPort), > > >> >> >> does it work? > > >> >> >> > > >> >> >> - Andy > > >> >> >> > > >> >> >> > > >> >> >> > > >> >> >> > > >> >> >> > > >> >> >> ________________________________ > > >> >> >> From: Jeff Zhang <[email protected]> > > >> >> >> To: [email protected] > > >> >> >> Sent: Tue, November 10, 2009 2:40:28 PM > > >> >> >> Subject: Re: HBase 0.20.1 Distributed Install Problems > > >> >> >> > > >> >> >> Hi, > > >> >> >> > > >> >> >> I meet the same problem that I can not start the regionserver. > > >> >> >> > > >> >> >> When I invoke zk_dump > > >> >> >> > > >> >> >> it shows: > > >> >> >> > > >> >> >> HBase tree in ZooKeeper is rooted at /hbase > > >> >> >> Cluster up? true > > >> >> >> In safe mode? true > > >> >> >> Master address: 10.148.224.13:60000 > > >> >> >> Region server holding ROOT: null > > >> >> >> Region servers: > > >> >> >> > > >> >> >> > > >> >> >> The following is my hbase-site.xml > > >> >> >> > > >> >> >> <configuration> > > >> >> >> <property> > > >> >> >> <name>hbase.cluster.distributed</name> > > >> >> >> <value>true</value> > > >> >> >> <description>The mode the cluster will be in. 
Possible values > > are > > >> >> >> false: standalone and pseudo-distributed setups with > managed > > >> >> Zookeeper > > >> >> >> true: fully-distributed with unmanaged Zookeeper Quorum > (see > > >> >> >> hbase-env.sh) > > >> >> >> </description> > > >> >> >> </property> > > >> >> >> <property> > > >> >> >> <name>hbase.rootdir</name> > > >> >> >> <value>hdfs://sha-cs-04:9000/hbase</value> > > >> >> >> <description>The directory shared by region servers. > > >> >> >> </description> > > >> >> >> </property> > > >> >> >> <property> > > >> >> >> <name>hbase.zookeeper.property.clientPort</name> > > >> >> >> <value>2222</value> > > >> >> >> <description>Property from ZooKeeper's config zoo.cfg. > > >> >> >> The port at which the clients will connect. > > >> >> >> </description> > > >> >> >> </property> > > >> >> >> <property> > > >> >> >> <name>hbase.zookeeper.quorum</name> > > >> >> >> > > <value>sha-cs-01,sha-cs-02,sha-cs-03,sha-cs-05,sha-cs-06</value> > > >> >> >> <description>Comma separated list of servers in the > ZooKeeper > > >> >> Quorum. > > >> >> >> For example, "host1.mydomain.com,host2.mydomain.com, > > >> >> >> host3.mydomain.com > > >> >> >> ". > > >> >> >> By default this is set to localhost for local and > > >> >> pseudo-distributed > > >> >> >> modes > > >> >> >> of operation. For a fully-distributed setup, this should be > > set > > >> to > > >> >> a > > >> >> >> full > > >> >> >> list of ZooKeeper quorum servers. If HBASE_MANAGES_ZK is > set > > in > > >> >> >> hbase-env.sh > > >> >> >> this is the list of servers which we will start/stop > > ZooKeeper > > >> on. > > >> >> >> </description> > > >> >> >> </property> > > >> >> >> > > >> >> >> </configuration> > > >> >> >> > > >> >> >> What's wrong with my configuration ? > > >> >> >> > > >> >> >> > > >> >> >> Thank you in advance. 
> > >> >> >> > > >> >> >> > > >> >> >> Jeff Zhang > > >> >> >> > > >> >> >> > > >> >> >> > > >> >> >> On Tue, Nov 10, 2009 at 12:47 PM, Tatsuya Kawano > > >> >> >> <[email protected]>wrote: > > >> >> >> > > >> >> >> > Hello, > > >> >> >> > > > >> >> >> > It looks like the master and the region servers are cannot > > locate > > >> each > > >> >> >> > other. HBase 0.20.x uses ZooKeeper (zk) to locate other > cluster > > >> >> >> > members, so maybe your zk has wrong information. > > >> >> >> > > > >> >> >> > Can you type zk_dump from hbase shell and let us the result? > > >> >> >> > > > >> >> >> > If the cluster is properly configured, you'll get something > like > > >> this: > > >> >> >> > ===================================== > > >> >> >> > hbase(main):007:0> zk_dump > > >> >> >> > > > >> >> >> > HBase tree in ZooKeeper is rooted at /hbase > > >> >> >> > Cluster up? true > > >> >> >> > In safe mode? false > > >> >> >> > Master address: 172.16.80.26:60000 > > >> >> >> > Region server holding ROOT: 172.16.80.27:60020 > > >> >> >> > Region servers: > > >> >> >> > - 172.16.80.27:60020 > > >> >> >> > - 172.16.80.29:60020 > > >> >> >> > - 172.16.80.28:60020 > > >> >> >> > ===================================== > > >> >> >> > > > >> >> >> > > > >> >> >> > > one of my co-workers apparently can log into his box and > > submit > > >> >> jobs, > > >> >> >> but > > >> >> >> > > me or anyone else is still unable to log in. > > >> >> >> > > > >> >> >> > Maybe you're a bit confused; your co-worker seems to be able > to > > use > > >> >> >> > Hadoop Map/Reduce, not HBase. > > >> >> >> > > > >> >> >> > > > >> >> >> > > Does Hbase allow concurrent connections? > > >> >> >> > > > >> >> >> > Yes. > > >> >> >> > > > >> >> >> > > > >> >> >> > >> I think it also says the master is on port 60000 > > >> >> >> > >> when the install directions say its supposed to be 60010? > > >> >> >> > > > >> >> >> > Port 60000 is correct. 
The master uses port 60000 to accept > > >> connection > > >> >> >> > from hbase shell and region servers. Port 60010 is for the > > >> web-based > > >> >> >> > HBase console. > > >> >> >> > > > >> >> >> > > > >> >> >> > > We tried applying this fix (to explicitly set the master): > > >> >> >> > > > > >> http://osdir.com/ml/hbase-user-hadoop-apache/2009-05/msg00321.html > > >> >> >> > > > >> >> >> > No, this is an old way to configure a cluster. You shouldn't > use > > >> this > > >> >> >> > with HBase 0.20.x > > >> >> >> > > > >> >> >> > > > >> >> >> > Thanks, > > >> >> >> > > > >> >> >> > -- > > >> >> >> > Tatsuya Kawano (Mr.) > > >> >> >> > Tokyo, Japan > > >> >> >> > > > >> >> >> > > > >> >> >> > > > >> >> >> > On Tue, Nov 10, 2009 at 1:10 PM, Chris Bates > > >> >> >> > <[email protected]> wrote: > > >> >> >> > > Another interesting data point. We tried applying this fix > > (to > > >> >> >> > explicitly > > >> >> >> > > set the master): > > >> >> >> > > > > >> http://osdir.com/ml/hbase-user-hadoop-apache/2009-05/msg00321.html > > >> >> >> > > > > >> >> >> > > But when I log in to the master node, it takes really long > to > > >> submit > > >> >> a > > >> >> >> > query > > >> >> >> > > and I get this in response: > > >> >> >> > > hbase(main):001:0> list > > >> >> >> > > NativeException: > > >> >> >> > org.apache.hadoop.hbase.client.RetriesExhaustedException: > > >> >> >> > > Trying to contact region server null for region , row '', > but > > >> failed > > >> >> >> > after 5 > > >> >> >> > > attempts. 
> > >> >> >> > > Exceptions: > > >> >> >> > > org.apache.hadoop.hbase.client.NoServerForRegionException: > > Timed > > >> out > > >> >> >> > trying > > >> >> >> > > to locate root region > > >> >> >> > > org.apache.hadoop.hbase.client.NoServerForRegionException: > > Timed > > >> out > > >> >> >> > trying > > >> >> >> > > to locate root region > > >> >> >> > > org.apache.hadoop.hbase.client.NoServerForRegionException: > > Timed > > >> out > > >> >> >> > trying > > >> >> >> > > to locate root region > > >> >> >> > > org.apache.hadoop.hbase.client.NoServerForRegionException: > > Timed > > >> out > > >> >> >> > trying > > >> >> >> > > to locate root region > > >> >> >> > > org.apache.hadoop.hbase.client.NoServerForRegionException: > > Timed > > >> out > > >> >> >> > trying > > >> >> >> > > to locate root region > > >> >> >> > > > > >> >> >> > > from > > >> org/apache/hadoop/hbase/client/HConnectionManager.java:1001:in > > >> >> >> > > `getRegionServerWithRetries' > > >> >> >> > > from org/apache/hadoop/hbase/client/MetaScanner.java:55:in > > >> >> `metaScan' > > >> >> >> > > from org/apache/hadoop/hbase/client/MetaScanner.java:28:in > > >> >> `metaScan' > > >> >> >> > > from > > >> org/apache/hadoop/hbase/client/HConnectionManager.java:432:in > > >> >> >> > > `listTables' > > >> >> >> > > from org/apache/hadoop/hbase/client/HBaseAdmin.java:127:in > > >> >> `listTables' > > >> >> >> > > from sun/reflect/NativeMethodAccessorImpl.java:-2:in > > `invoke0' > > >> >> >> > > from sun/reflect/NativeMethodAccessorImpl.java:39:in > `invoke' > > >> >> >> > > from sun/reflect/DelegatingMethodAccessorImpl.java:25:in > > >> `invoke' > > >> >> >> > > from java/lang/reflect/Method.java:597:in `invoke' > > >> >> >> > > from org/jruby/javasupport/JavaMethod.java:298:in > > >> >> >> > > `invokeWithExceptionHandling' > > >> >> >> > > from org/jruby/javasupport/JavaMethod.java:259:in `invoke' > > >> >> >> > > from > org/jruby/java/invokers/InstanceMethodInvoker.java:36:in > > >> >> `call' 
> > >> >> >> > > from org/jruby/runtime/callsite/CachingCallSite.java:253:in > > >> >> >> > `cacheAndCall' > > >> >> >> > > from org/jruby/runtime/callsite/CachingCallSite.java:72:in > > >> `call' > > >> >> >> > > from org/jruby/ast/CallNoArgNode.java:61:in `interpret' > > >> >> >> > > from org/jruby/ast/ForNode.java:104:in `interpret' > > >> >> >> > > ... 116 levels... > > >> >> >> > > from > > >> >> >> > > > > >> >> >> > > >> >> > > >> > > opt/hadoop/hbase_minus_0_dot_20_dot_1/bin/$_dot_dot_/bin/hirb#start:-1:in > > >> >> >> > > `call' > > >> >> >> > > from > > >> org/jruby/internal/runtime/methods/DynamicMethod.java:226:in > > >> >> >> `call' > > >> >> >> > > from > > >> org/jruby/internal/runtime/methods/CompiledMethod.java:211:in > > >> >> >> `call' > > >> >> >> > > from > > >> org/jruby/internal/runtime/methods/CompiledMethod.java:71:in > > >> >> >> `call' > > >> >> >> > > from org/jruby/runtime/callsite/CachingCallSite.java:253:in > > >> >> >> > `cacheAndCall' > > >> >> >> > > from org/jruby/runtime/callsite/CachingCallSite.java:72:in > > >> `call' > > >> >> >> > > from > > >> >> >> > > > >> >> > > opt/hadoop/hbase_minus_0_dot_20_dot_1/bin/$_dot_dot_/bin/hirb.rb:497:in > > >> >> >> > > `__file__' > > >> >> >> > > from > > >> >> >> > > > >> opt/hadoop/hbase_minus_0_dot_20_dot_1/bin/$_dot_dot_/bin/hirb.rb:-1:in > > >> >> >> > > `load' > > >> >> >> > > from org/jruby/Ruby.java:577:in `runScript' > > >> >> >> > > from org/jruby/Ruby.java:480:in `runNormally' > > >> >> >> > > from org/jruby/Ruby.java:354:in `runFromMain' > > >> >> >> > > from org/jruby/Main.java:229:in `run' > > >> >> >> > > from org/jruby/Main.java:110:in `run' > > >> >> >> > > from org/jruby/Main.java:94:in `main' > > >> >> >> > > from /opt/hadoop/hbase-0.20.1/bin/../bin/hirb.rb:338:in > `list' > > >> >> >> > > from (hbase):2hbase(main):002:0> > > >> >> >> > > > > >> >> >> > > > > >> >> >> > > On Mon, Nov 9, 2009 at 10:52 PM, Chris Bates < > > >> >> >> > > [email protected]> wrote: > > >> >> >> 
> > > > >> >> >> > >> thanks for your response Sujee. These boxes are all on an > > >> internal > > >> >> >> DNS > > >> >> >> > and > > >> >> >> > >> they all resolve. > > >> >> >> > >> > > >> >> >> > >> one of my co-workers apparently can log into his box and > > submit > > >> >> jobs, > > >> >> >> > but > > >> >> >> > >> me or anyone else is still unable to log in. Does Hbase > > allow > > >> >> >> > concurrent > > >> >> >> > >> connections? In Hive I remember having to configure the > > >> metastore > > >> >> to > > >> >> >> be > > >> >> >> > in > > >> >> >> > >> server mode if multiple people were using it. > > >> >> >> > >> > > >> >> >> > >> > > >> >> >> > >> On Mon, Nov 9, 2009 at 10:13 PM, Sujee Maniyam < > > [email protected] > > >> > > > >> >> >> wrote: > > >> >> >> > >> > > >> >> >> > >>> > [had...@crunch hbase-0.20.1]$ bin/start-hbase.sh > > >> >> >> > >>> > > > >> >> >> > >>> > crunch2: Warning: Permanently added 'crunch2' (RSA) to > the > > >> list > > >> >> of > > >> >> >> > known > > >> >> >> > >>> > hosts. > > >> >> >> > >>> > > >> >> >> > >>> > > >> >> >> > >>> is your SSH setup correctly? From master, you need to be > > able > > >> to > > >> >> >> > >>> login to all slaves/regionservers without password > > >> >> >> > >>> > > >> >> >> > >>> And I see you are using short hostnames (crunch2, > crunch3), > > do > > >> >> they > > >> >> >> > >>> all resolve correctly? or you need to update /etc/hosts > to > > >> >> resolve > > >> >> >> > >>> these to an IP address on all machines. > > >> >> >> > >>> > > >> >> >> > >>> regards > > >> >> >> > >>> Sujee Maniyam > > >> >> >> > >>> -- > > >> >> >> > >>> http://sujee.net > > >> >> >> > >>> > > >> >> >> > >> > > >> >> >> > >> > > >> >> >> > > > > >> >> >> > > > >> >> >> > > >> >> >> > > >> >> >> > > >> >> >> > > >> >> >> > > >> >> > > > >> >> > > >> > > > >> > > > > > >
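
The fix that worked for Jeff earlier in this thread was removing the machine's own hostname from the loopback line in /etc/hosts, so that region servers stop registering as localhost. A minimal sketch of how that check could be scripted is below. The function name loopback_mappings and the sample hosts text are made up for illustration (the sample is not copied from any machine in this thread), and it only inspects hosts-file text, not DNS.

```python
import ipaddress


def loopback_mappings(hosts_text, hostname):
    """Return the loopback addresses that `hostname` is mapped to in hosts-file text.

    Hypothetical helper: it parses text in /etc/hosts format and does not
    consult DNS or any other resolver source.
    """
    hits = []
    for line in hosts_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        fields = line.split()
        addr, names = fields[0], fields[1:]
        try:
            ip = ipaddress.ip_address(addr.split("%", 1)[0])  # ignore any IPv6 zone id
        except ValueError:
            continue  # first field is not an IP address
        if ip.is_loopback and hostname in names:
            hits.append(addr)
    return hits


# Hypothetical hosts file contents, for illustration only.
sample = """\
127.0.0.1      localhost.localdomain localhost sha-cs-03
10.148.224.11  sha-cs-01
"""

print(loopback_mappings(sample, "sha-cs-03"))  # hostname sits on the loopback line
print(loopback_mappings(sample, "sha-cs-01"))  # hostname maps to a routable address
```

On a real machine one could pass open("/etc/hosts").read() and socket.gethostname() instead of the sample. A non-empty result would suggest the box name still resolves to 127.x.x.x, which is the situation J-D warned about.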
