The Region Server logs also show the same "-ROOT- region not online" error.

On Mon, May 23, 2011 at 1:10 PM, Bill Graham <[email protected]> wrote:
> Is there anything meaningful in the RS logs? I've seen situations like this
> where a RS is failing to start due to issues reading the WAL. If this is the
> case, it would list which WAL is problematic, which is zero-length in my
> experience, so I delete it from HDFS and things start up.
>
>
> On Mon, May 23, 2011 at 9:16 AM, Himanish Kushary <[email protected]> wrote:
>
> > Both the Master and the hbck command print
> >
> > org.apache.hadoop.hbase.NotServingRegionException:
> > org.apache.hadoop.hbase.NotServingRegionException: Region is not online:
> > -ROOT-,,0
> >
> > After the master thread exits due to the Heap Space error, the hbck command
> > throws:
> >
> > org.apache.hadoop.hbase.MasterNotRunningException
> >
> > Is there any way to fix this kind of issue? We are keeping the datanodes up
> > to see whether the under-replicated blocks may be recovered. Does improper
> > shutdown of the hadoop/hbase services cause this kind of issue? What
> > happens in a disaster recovery situation, and how are those situations
> > handled?
> >
> > Thanks
> >
> >
> > On Mon, May 23, 2011 at 11:36 AM, Stack <[email protected]> wrote:
> >
> > > What does hbase hbck say? (http://hbase.apache.org/book.html#hbck).
> > >
> > > What does the master log have in it? Anything of interest?
> > >
> > > St.Ack
> > >
> > > On Mon, May 23, 2011 at 7:53 AM, Himanish Kushary <[email protected]>
> > > wrote:
> > > > Pressed the send button too soon...
> > > >
> > > > Also, here is the output from hadoop fsck:
> > > >
> > > > Status: HEALTHY
> > > >  Total size: 37678848280 B
> > > >  Total dirs: 941
> > > >  Total files: 902 (Files currently being written: 1)
> > > >  Total blocks (validated): 1141 (avg. block size 33022654 B) (Total open file blocks (not validated): 1)
> > > >  Minimally replicated blocks: 1141 (100.0 %)
> > > >  Over-replicated blocks: 0 (0.0 %)
> > > >  Under-replicated blocks: 906 (79.40403 %)
> > > >  Mis-replicated blocks: 0 (0.0 %)
> > > >  Default replication factor: 2
> > > >  Average block replication: 2.0
> > > >  Corrupt blocks: 0
> > > >  Missing replicas: 1886 (82.646805 %)
> > > >  Number of data-nodes: 2
> > > >  Number of racks: 1
> > > > FSCK ended at Mon May 23 10:51:13 EDT 2011 in 257 milliseconds
> > > >
> > > > The filesystem under path '/' is HEALTHY
> > > >
> > > >
> > > > Could anybody please help with how to recover from this scenario?
> > > >
> > > > Thanks
> > > >
> > > >
> > > > On Mon, May 23, 2011 at 10:50 AM, Himanish Kushary <[email protected]> wrote:
> > > >
> > > >> Hi,
> > > >>
> > > >> Our hbase/hadoop server machines were shut down without bringing the
> > > >> hadoop and hbase services down properly. Now when we try to bring up
> > > >> hbase, we get the following error in the master log:
> > > >>
> > > >> org.apache.hadoop.hbase.NotServingRegionException: Region is not online:
> > > >> -ROOT-,,0
> > > >>
> > > >> Hadoop services (namenode, jobtracker, datanode, etc.) have come up
> > > >> properly and we are able to see the files in HDFS. But the HBase Master
> > > >> keeps throwing this exception and then finally throws a Java Heap Space
> > > >> error.
> > > >>
> > > >> Note: We have two datanodes, replication set to 2, and around 900
> > > >> blocks are shown as under-replicated.
> > > >>
> > > >> ---------------------------------
> > > >> Thanks & Regards
> > > >> Himanish
> > > >>
> > > >
> > > >
> > > >
> > > > --
> > > > Thanks & Regards
> > > > Himanish
> > > >
> > >
> >
> >
> >
> > --
> > Thanks & Regards
> > Himanish
> >
>

--
Thanks & Regards
Himanish
