Hi All,
I have a problem with my HBase cluster, which runs in fully distributed mode
on four nodes. My configuration uses two masters: one is the active master and
the other is the backup master.
i) If I stop HBase with the stop-hbase.sh command, the end of my master log
shows:
2011-08-08 16:05:04,897 INFO
org.apache.hadoop.hbase.master.AssignmentManager$TimeoutMonitor:
rohinis.zohocorpin.com:60000.timeoutMonitor exiting
2011-08-08 16:05:04,897 INFO
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation:
Closed zookeeper sessionid=0x231a8f181f60000
2011-08-08 16:05:04,907 INFO org.apache.zookeeper.ClientCnxn: EventThread shut
down
2011-08-08 16:05:04,907 INFO org.apache.zookeeper.ZooKeeper: Session:
0x231a8f181f60000 closed
2011-08-08 16:05:04,914 INFO org.apache.zookeeper.ClientCnxn: EventThread shut
down
2011-08-08 16:05:04,915 INFO org.apache.zookeeper.ZooKeeper: Session:
0x131a8f11a570000 closed
2011-08-08 16:05:04,915 INFO org.apache.hadoop.hbase.master.HMaster: HMaster
main thread exiting
-------------------------
ii) If I kill the master with kill 15914 or kill -9 15914, nothing is printed
in my master log.
-------------------------
iii) If I stop the master with the ./bin/hbase-daemon.sh stop master command,
the end of my master log shows:
2011-08-08 16:46:03,035 INFO org.apache.hadoop.hbase.master.AssignmentManager:
Bulk assigning done
2011-08-08 16:46:03,037 INFO org.apache.hadoop.hbase.master.HMaster: Master has
completed initialization
2011-08-08 16:46:03,045 DEBUG org.apache.hadoop.hbase.master.CatalogJanitor:
Scanned 1 catalog row(s) and gc'd 0 unreferenced parent region(s)
Mon Aug 8 16:49:54 IST 2011 Killing master
--------------------------
In case (i), the whole HBase cluster is stopped.
In case (ii), only the master is killed, but the region servers do not fail
over to the backup master, and the backup master keeps waiting for the ZNode to
be written.
In case (iii), likewise, only the master is killed, but the region servers do
not fail over to the backup master, and the backup master keeps waiting for the
ZNode to be written.
In cases (ii) and (iii), is the master properly killed?
If the master is properly killed, then why are the region servers unable to
connect to the backup master?
If the master is not properly killed, then how should I kill the master
process in order to test this setup?
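For reference, here is a minimal sketch of how I could confirm whether the
master process is really gone after the kill. It assumes the check is run as
the same user that started the master; is_running and report are hypothetical
helper names, and 15914 is the master PID from case (ii).

```shell
#!/bin/sh
# Hypothetical helper: succeeds if a process with the given PID exists.
# Signal 0 delivers nothing; kill only performs the existence/permission check.
# Assumption: run as the user that owns the process, otherwise a permission
# error would be misread as "not running".
is_running() {
    kill -0 "$1" 2>/dev/null
}

# Hypothetical helper: prints the state of the given PID.
report() {
    if is_running "$1"; then
        echo "process $1 is still running"
    else
        echo "process $1 is gone"
    fi
}
```

After the kill, calling report 15914 would show whether the master process
still exists.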
-----------------------------
My region server log during the kill -9 of the master process:
2011-08-08 16:48:20,987 INFO org.apache.hadoop.ipc.HBaseServer: PRI IPC Server
handler 9 on 60020: starting
2011-08-08 16:48:20,987 INFO org.apache.hadoop.hbase.regionserver.StoreFile:
Allocating LruBlockCache with maximum size 199.4m
2011-08-08 16:48:23,901 INFO org.apache.hadoop.hbase.zookeeper.MetaNodeTracker:
Detected completed assignment of META, notifying catalog tracker
2011-08-08 16:48:23,934 INFO
org.apache.hadoop.hbase.regionserver.HRegionServer: Received request to open 0
region(s)
2011-08-08 16:53:18,263 WARN
org.apache.hadoop.hbase.regionserver.HRegionServer: Unable to connect to
master. Retrying. Error was:
java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:592)
at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:404)
at org.apache.hadoop.hbase.ipc.HBaseClient$Connection.setupIOstreams(HBaseClient.java:328)
at org.apache.hadoop.hbase.ipc.HBaseClient.getConnection(HBaseClient.java:883)
at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:750)
at org.apache.hadoop.hbase.ipc.HBaseRPC$Invoker.invoke(HBaseRPC.java:257)
at $Proxy5.getProtocolVersion(Unknown Source)
at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:419)
at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:393)
at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:444)
at org.apache.hadoop.hbase.ipc.HBaseRPC.waitForProxy(HBaseRPC.java:349)
at org.apache.hadoop.hbase.regionserver.HRegionServer.getMaster(HRegionServer.java:1445)
at org.apache.hadoop.hbase.regionserver.HRegionServer.tryRegionServerReport(HRegionServer.java:737)
at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:586)
at java.lang.Thread.run(Thread.java:636)
2011-08-08 16:53:20,992 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache:
LRU Stats: total=957.86 KB, free=198.43 MB, max=199.36 MB, blocks=0,
accesses=0, hits=0, hitRatio=�%, cachingAccesses=0, cachingHits=0,
cachingHitsRatio=�%, evictions=0, evicted=0, evictedPerRun=NaN
2011-08-08 16:54:21,349 WARN
org.apache.hadoop.hbase.regionserver.HRegionServer: Unable to connect to
master. Retrying. Error was:
java.net.ConnectException: Connection refused
---------------------------
My backup master log, which looks the same the whole time:
2011-08-08 16:48:25,697 INFO org.apache.hadoop.hbase.metrics: MetricsString
added: url
2011-08-08 16:48:25,697 INFO org.apache.hadoop.hbase.metrics: MetricsString
added: version
2011-08-08 16:48:25,697 INFO org.apache.hadoop.hbase.metrics: new MBeanInfo
2011-08-08 16:48:25,697 INFO org.apache.hadoop.hbase.metrics: new MBeanInfo
2011-08-08 16:48:25,697 INFO
org.apache.hadoop.hbase.master.metrics.MasterMetrics: Initialized
2011-08-08 16:48:25,698 DEBUG org.apache.hadoop.hbase.master.HMaster: HMaster
started in backup mode. Stalling until master znode is written.
2011-08-08 16:48:25,698 DEBUG org.apache.hadoop.hbase.master.HMaster: Waiting
for master address ZNode to be written (Also watching cluster state node)
2011-08-08 16:51:25,698 DEBUG org.apache.hadoop.hbase.master.HMaster: Waiting
for master address ZNode to be written (Also watching cluster state node)
2011-08-08 16:54:25,698 DEBUG org.apache.hadoop.hbase.master.HMaster: Waiting
for master address ZNode to be written (Also watching cluster state node)
2011-08-08 16:57:25,698 DEBUG org.apache.hadoop.hbase.master.HMaster: Waiting
for master address ZNode to be written (Also watching cluster state node)
Thanks in advance for your valuable suggestions.
Regards,
Shanmuganathan