It can take hdfs a while before it leaves safe mode. Is this what happened? Usually hbase will wait on hdfs to leave safe mode. If you look in your logs, can you figure out what happened around hbase startup? Did it just not wait long enough?

St.Ack
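For what it's worth, here is a minimal sketch of how a client can ask the namenode whether it is still in safe mode before starting hbase. It assumes the 0.20-era DistributedFileSystem/FSConstants API; the polling loop, class name, and sleep interval are illustrative only, not hbase's own startup code.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.hdfs.DistributedFileSystem;
import org.apache.hadoop.hdfs.protocol.FSConstants;

public class WaitOnSafeMode {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);
    if (!(fs instanceof DistributedFileSystem)) {
      throw new IllegalStateException("fs.default.name does not point at hdfs: " + fs.getUri());
    }
    DistributedFileSystem dfs = (DistributedFileSystem) fs;
    // SAFEMODE_GET only queries the namenode's safe-mode flag; it does not change state.
    while (dfs.setSafeMode(FSConstants.SafeModeAction.SAFEMODE_GET)) {
      System.out.println("Namenode still in safe mode; sleeping 10s before rechecking...");
      Thread.sleep(10 * 1000);
    }
    System.out.println("Namenode has left safe mode; ok to start hbase.");
  }
}

From the shell, "hadoop dfsadmin -safemode get" reports the same flag, and "hadoop dfsadmin -safemode wait" blocks until it clears.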
On Mon, Mar 8, 2010 at 11:03 PM, Ted Yu <yuzhih...@gmail.com> wrote:
> This happened after we restarted our servers.
> The load on the servers was light.
> We have 3 data nodes.
>
> On Monday, March 8, 2010, Stack <st...@duboce.net> wrote:
>> Your namenode flipped your hdfs into safe mode -- i.e. read-only mode.
>> This happens on startup -- did you restart the hdfs under your hbase? -- or
>> it can happen if hdfs suffers extreme duress such as losing a good
>> proportion of all datanodes. Did something like the latter happen in your
>> case? You seem to have 3 hbase nodes. Do you have 3 datanodes only? What
>> kind of a loading were you running?
>>
>> St.Ack
>>
>> On Mon, Mar 8, 2010 at 10:02 PM, Ted Yu <yuzhih...@gmail.com> wrote:
>>> Hi,
>>> I saw this in master server log:
>>> 2010-03-08 21:13:47,428 INFO [Thread-14] master.ServerManager$ServerMonitor(130): 3 region servers, 0 dead, average load 0.0
>>> 2010-03-08 21:13:50,505 INFO [WrapperSimpleAppMain-EventThread] master.ServerManager$ServerExpirer(813): snv-it-lin-010.projectrialto.com,60020,1268109747635 znode expired
>>> 2010-03-08 21:13:52,854 DEBUG [HMaster] regionserver.HLog(912): Pushed=50725 entries from hdfs://snv-it-lin-006.projectrialto.com:9000/hbase/.logs/snv-it-lin-011.projectrialto.com,60020,1267695848509/hlog.dat.1268083819081
>>> 2010-03-08 21:13:52,856 DEBUG [HMaster] regionserver.HLog(885): Splitting hlog 5 of 21: hdfs://snv-it-lin-006.projectrialto.com:9000/hbase/.logs/snv-it-lin-011.projectrialto.com,60020,1267695848509/hlog.dat.1268083833046, length=58788942
>>> 2010-03-08 21:14:47,441 INFO [Thread-14] master.ServerManager$ServerMonitor(130): 2 region servers, 1 dead, average load 0.0[snv-it-lin-010.projectrialto.com,60020,1268109747635]
>>> ....
>>> 2010-03-08 22:01:10,078 DEBUG [HMaster] regionserver.HLog(1024): Waiting for hlog writers to terminate, iteration #143
>>> 2010-03-08 22:01:15,080 DEBUG [HMaster] regionserver.HLog(1024): Waiting for hlog writers to terminate, iteration #144
>>> 2010-03-08 22:01:20,082 DEBUG [HMaster] regionserver.HLog(1024): Waiting for hlog writers to terminate, iteration #145
>>> 2010-03-08 22:01:25,085 DEBUG [HMaster] regionserver.HLog(1024): Waiting for hlog writers to terminate, iteration #146
>>> 2010-03-08 22:01:30,087 DEBUG [HMaster] regionserver.HLog(1024): Waiting for hlog writers to terminate, iteration #147
>>>
>>> And this in region server log on snv-it-lin-011:
>>> org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.hdfs.server.namenode.SafeModeException:
>>> Cannot renew lease for DFSClient_-1882710079. Name node is in safe mode.
>>> The ratio of reported blocks 1.0000 has reached the threshold 0.9990.
>>> Safe mode will be turned off automatically in 1 seconds.
>>>     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.renewLease(FSNamesystem.java:1972)
>>>     at org.apache.hadoop.hdfs.server.namenode.NameNode.renewLease(NameNode.java:550)
>>>     at sun.reflect.GeneratedMethodAccessor4.invoke(Unknown Source)
>>>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>>     at java.lang.reflect.Method.invoke(Method.java:597)
>>>     at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:508)
>>>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:959)
>>>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:955)
>>>     at java.security.AccessController.doPrivileged(Native Method)
>>>     at javax.security.auth.Subject.doAs(Subject.java:396)
>>>     at org.apache.hadoop.ipc.Server$Handler.run(Server.java:953)
>>>
>>>     at org.apache.hadoop.ipc.Client.call(Client.java:739)
>>>     at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:220)
>>>     at $Proxy1.renewLease(Unknown Source)
>>>     at sun.reflect.GeneratedMethodAccessor24.invoke(Unknown Source)
>>>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>>     at java.lang.reflect.Method.invoke(Method.java:597)
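The 0.9990 in that SafeModeException comes from the namenode's safe-mode settings. Below is a small sketch that just reads the two hdfs-site.xml keys involved; the key names and defaults shown are the stock 0.20 values, and the snippet assumes hdfs-site.xml is on the classpath.

import org.apache.hadoop.conf.Configuration;

public class SafeModeSettings {
  public static void main(String[] args) {
    Configuration conf = new Configuration();
    // Pull in hdfs-site.xml explicitly; a plain Configuration may not load it.
    conf.addResource("hdfs-site.xml");

    // Fraction of blocks that must be reported before the namenode may leave safe mode.
    float threshold = conf.getFloat("dfs.safemode.threshold.pct", 0.999f);
    // Extra time (ms) the namenode stays in safe mode after the threshold is reached.
    long extension = conf.getLong("dfs.safemode.extension", 30000);

    System.out.println("dfs.safemode.threshold.pct = " + threshold);
    System.out.println("dfs.safemode.extension     = " + extension + " ms");
  }
}

The "will be turned off automatically in 1 seconds" part of the message is that extension window counting down after the block-report threshold was reached.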