Agreed. I've seen similar issues on startup where, for whatever reason, an hlog (often empty) can't be read, which hangs the startup process. Manually deleting it from HDFS clears the issue.
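To spot the empty hlogs before deleting anything, something like the sketch below may help. It only parses the text output of `hadoop fs -ls` (eight whitespace-separated columns per entry) and lists zero-length files. The `/hbase/.logs` location, the host name, and the file names in the sample are assumptions for illustration, so substitute whatever your hbase.rootdir actually contains. A file that is still open for writing can also report zero length, so only trust this with HBase stopped.

```python
from typing import List


def find_empty_files(listing: str) -> List[str]:
    """Return paths of zero-length files from `hadoop fs -ls` style output."""
    empty = []
    for line in listing.splitlines():
        # Entry rows have 8 columns:
        # perms, replication, owner, group, size, date, time, path.
        # maxsplit=7 keeps a path containing spaces in one piece.
        fields = line.split(None, 7)
        if len(fields) != 8:
            continue  # skips "Found N items" and blank lines
        perms, _repl, _owner, _group, size, _date, _time, path = fields
        if perms.startswith("d"):
            continue  # directories always show size 0
        if size == "0":
            empty.append(path)
    return empty


# Hypothetical listing; paths and names are made up for the example.
SAMPLE = """Found 3 items
drwxr-xr-x   - hbase supergroup          0 2011-04-12 09:30 /hbase/.logs/rs1.example.com,60020,1302600000000
-rw-r--r--   3 hbase supergroup          0 2011-04-12 09:31 /hbase/.logs/rs1.example.com,60020,1302600000000/hlog.empty
-rw-r--r--   3 hbase supergroup    1048576 2011-04-12 09:32 /hbase/.logs/rs1.example.com,60020,1302600000000/hlog.full
"""

if __name__ == "__main__":
    for path in find_empty_files(SAMPLE):
        print(path)
```

In practice you would feed it the real listing of the log directory and review the paths by hand before removing any of them.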
On Tue, Apr 12, 2011 at 10:01 AM, Jinsong Hu <[email protected]> wrote:
> You probably should stop all master/regionservers, then start one master,
> tail -f the log to confirm all the hlogs are handled,
>
> then start the first regionserver, and then other regionservers.
>
> I have encountered this issue before.
> hbase is not as good as what you want, but not as bad as you say either. The
> truth is in between.
>
> Jimmy
>
> --------------------------------------------------
> From: "Robert Gonzalez" <[email protected]>
> Sent: Tuesday, April 12, 2011 9:49 AM
> To: <[email protected]>
> Subject: HBase is not ready for Primetime
>
>> We've been using HBase for about a year, consistently running into
>> problems where we lost data. After reading forums and some back and
>> forth with other Hbase users, we changed our data methodology to save
>> less data per row. This last time, we upgraded to 0.90 at the
>> recommendation of the hbase community, cleared off all our data, and
>> started over. Seemed to be running ok for a couple of months, until
>> this morning. One of the regionservers stopped responding to data
>> requests and we tried to restart it to no avail. Then we shutdown our
>> processes so that nothing was using HBase and we shut down HBase and
>> brought it back up. We waited a little bit, until hbase status
>> indicated that all the servers were back up. We turned on our
>> processes and lo and behold, HBase is broken, getting
>> org.apache.hadoop.hbase.NotServingRegionException:
>> org.apache.hadoop.hbase.NotServingRegionException: Region is not
>> online: -ROOT-,,0
>> at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:2319)
>> at org.apache.hadoop.hbase.regionserver.HRegionServer.getClosestRowBefore(HRegionServer.java:1607)
>> at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
>> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>> at java.lang.reflect.Method.invoke(Method.java:597)
>> at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:570)
>> at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1036)
>>
>> And now we can't even shut it down.
>>
>> Seems that Hbase is just too flaky to depend on for a serious system,
>> we've not had this type of problem to this degree with conventional DB
>> systems. Now that we are not saving that much data (we are using large
>> hdfs files for that) in Hbase, we are probably going to move back to a
>> conventional SQL system for our control data. We just can't afford to
>> be constantly fighting problems like this.
>>
>>
>> --
>>
>> Robert Gonzalez
>>
>> Maxpoint Interactive
>>
>
