Re: HBase crashed: FATAL HMaster: Shutting down HBase cluster: file system not available

Andrew Purtell Wed, 07 Oct 2009 09:59:12 -0700

Looks like your DFS NameNode became unavailable about the same time that 
ZooKeeper timeouts started happening. Overloading? Anything relevant in the 
NameNode logs?
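
One way to check the timing is to pull the WARN/ERROR/FATAL entries out of the master log and do the same for the NameNode log, then compare the timestamps. Below is a rough sketch, not a definitive tool. The helper name `problem_entries` and the `ENTRY` pattern are made up for illustration, the sample lines are truncated entries from the master log quoted in this thread, and it assumes the NameNode log uses the same "date time,millis LEVEL ..." layout.

```python
import re
from datetime import datetime

# Start of a log entry as it appears in the master log in this thread:
# "2009-10-07 11:25:17,032 ERROR org.apache...HMaster: Master"
# (hypothetical pattern; adjust if your log4j layout differs)
ENTRY = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) ([A-Z]+)\b(.*)")


def problem_entries(lines, levels=("WARN", "ERROR", "FATAL")):
    """Yield (timestamp, level, text) for entries logged at the given levels."""
    for line in lines:
        m = ENTRY.match(line)
        if m and m.group(2) in levels:
            ts = datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S,%f")
            yield ts, m.group(2), m.group(3).strip()


# First physical line of a few entries from the master log (truncated).
SAMPLE = """\
2009-10-07 11:24:42,823 INFO org.apache.hadoop.hbase.master.BaseScanner: All
2009-10-07 11:25:06,311 WARN org.apache.zookeeper.ClientCnxn: Exception
2009-10-07 11:25:16,911 WARN org.apache.zookeeper.ClientCnxn: Exception
2009-10-07 11:25:17,032 ERROR org.apache.hadoop.hbase.master.HMaster: Master
2009-10-07 11:25:17,174 FATAL org.apache.hadoop.hbase.master.HMaster:
""".splitlines()

entries = list(problem_entries(SAMPLE))
for ts, level, text in entries:
    print(ts.time(), level, text)

# Seconds between the first and the last problem entry in the excerpt.
span = (entries[-1][0] - entries[0][0]).total_seconds()
print(f"{span:.3f}s between first and last problem entry")
```

Running the same function over the NameNode log (for example `problem_entries(open(path))`, where `path` is wherever that log lives on your machine) should show whether anything was logged there in the same window.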


   - Andy




________________________________
From: Lucas Nazário dos Santos <[email protected]>
To: [email protected]
Sent: Wed, October 7, 2009 9:43:49 AM
Subject: HBase crashed: FATAL HMaster: Shutting down HBase cluster: file system not available

Hello,

My HBase cluster crashed today after running for a couple of days, and the
logs show the exception below (at the end of this message).

Some log excerpts that caught my attention are:

2009-10-07 11:25:17,032 ERROR org.apache.hadoop.hbase.master.HMaster: Master lost its znode, killing itself now
2009-10-07 11:25:17,174 FATAL org.apache.hadoop.hbase.master.HMaster: Shutting down HBase cluster: file system not available

Any clue as to what happened? What can I do to prevent this from happening
again?

Thanks!
Lucas



2009-10-07 11:24:42,823 INFO org.apache.hadoop.hbase.master.BaseScanner: RegionManager.metaScanner scan of 9 row(s) of meta region {server: 192.168.1.3:60020, regionname: .META.,,1, startKey: <>} complete
2009-10-07 11:24:42,823 INFO org.apache.hadoop.hbase.master.BaseScanner: All 1 .META. region(s) scanned
2009-10-07 11:25:06,311 WARN org.apache.zookeeper.ClientCnxn: Exception closing session 0x1242b188e8a0001 to sun.nio.ch.selectionkeyi...@148c02f
java.io.IOException: TIMED OUT
        at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:858)
2009-10-07 11:25:06,702 INFO org.apache.zookeeper.ClientCnxn: Attempting connection to server server2/192.168.1.3:2181
2009-10-07 11:25:06,702 INFO org.apache.zookeeper.ClientCnxn: Priming connection to java.nio.channels.SocketChannel[connected local=/192.168.1.3:49602 remote=server2/192.168.1.3:2181]
2009-10-07 11:25:06,703 INFO org.apache.zookeeper.ClientCnxn: Server connection successful
2009-10-07 11:25:16,911 WARN org.apache.zookeeper.ClientCnxn: Exception closing session 0x242b1890c70000 to sun.nio.ch.selectionkeyi...@1060478
java.io.IOException: Read error rc = -1 java.nio.DirectByteBuffer[pos=0 lim=4 cap=4]
        at org.apache.zookeeper.ClientCnxn$SendThread.doIO(ClientCnxn.java:653)
        at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:897)
2009-10-07 11:25:16,911 INFO org.apache.hadoop.hbase.master.ServerManager: server2,60020,1254853514050 znode expired
2009-10-07 11:25:17,021 INFO org.apache.hadoop.hbase.master.RegionManager: META region removed from onlineMetaRegions
2009-10-07 11:25:17,032 ERROR org.apache.hadoop.hbase.master.HMaster: Master lost its znode, killing itself now
2009-10-07 11:25:17,032 INFO org.apache.hadoop.hbase.master.RegionServerOperation: process shutdown of server server2,60020,1254853514050: logSplit: false, rootRescanned: false, numberOfMetaRegions: 1, onlineMetaRegions.size(): 0
2009-10-07 11:25:17,174 FATAL org.apache.hadoop.hbase.master.HMaster: Shutting down HBase cluster: file system not available
java.io.IOException: File system is not available
        at org.apache.hadoop.hbase.util.FSUtils.checkFileSystemAvailable(FSUtils.java:125)
        at org.apache.hadoop.hbase.master.HMaster.checkFileSystem(HMaster.java:324)
        at org.apache.hadoop.hbase.master.HMaster.processToDoQueue(HMaster.java:525)
        at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:426)
Caused by: java.io.IOException: Filesystem closed
        at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:197)
        at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:585)
        at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:453)
        at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:643)
        at org.apache.hadoop.hbase.util.FSUtils.checkFileSystemAvailable(FSUtils.java:114)
        ... 3 more
2009-10-07 11:25:17,174 INFO org.apache.hadoop.hbase.master.HMaster: Stopping infoServer
