Try the 0.1.3 release candidate 5: http://people.apache.org/~stack/hbase-0.1.3-candidate-5/

It has fixes to deal with corrupted log files left over after a regionserver crash (HBASE-646, 648).

The 'corruption' most likely arose because the regionserver did not close its open log files in HDFS when it went down, so a few zero-length log files were left behind; the edits those Write-Ahead Logs were carrying are lost. Before this release candidate, we handled these empty files badly when we came across them. Until HDFS supports appends (HADOOP-1700; a subset that may be sufficient for our needs will be available in hadoop-0.18), data loss continues to be a fact of HBase life.
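
To make the zero-length log case concrete, here is a minimal sketch in Python of the kind of check involved. It runs against a local directory rather than HDFS and is not the code from HBASE-646 or HBASE-648; the function name `partition_log_files` and the idea of simply setting empty files aside are assumptions for illustration only.

```python
from pathlib import Path


def partition_log_files(log_dir):
    """Split the regular files in log_dir into (empty, non_empty) lists.

    Hypothetical helper: a zero-length file carries no edits, so a replay
    step could skip it instead of failing on an unexpected end of file.
    """
    empty, non_empty = [], []
    for path in sorted(Path(log_dir).iterdir()):
        if not path.is_file():
            continue  # ignore subdirectories
        if path.stat().st_size == 0:
            empty.append(path)
        else:
            non_empty.append(path)
    return empty, non_empty
```

Calling `partition_log_files("/path/to/logs")` returns the two lists sorted by file name, so the empty files can be reported or moved aside before the rest are read.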

Yours,
St.Ack


Preston Price wrote:
One of the servers that acts as a hadoop and hbase node in our cluster went down. After the machine was brought back up I restarted hbase but could not interact with it. After checking the logs on all 3 of our machines I found a ton of stack traces like the following:

2008-06-26 23:07:56,683 ERROR org.apache.hadoop.hbase.HRegionServer: error opening region -ROOT-,,0
java.io.EOFException
        at java.io.DataInputStream.readFully(DataInputStream.java:178)
        at java.io.DataInputStream.readFully(DataInputStream.java:152)
        at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1434)
        at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1411)
        at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1400)
        at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1395)
        at org.apache.hadoop.io.MapFile$Reader.<init>(MapFile.java:254)
        at org.apache.hadoop.io.MapFile$Reader.<init>(MapFile.java:242)
        at org.apache.hadoop.hbase.HStoreFile$HbaseMapFile$HbaseReader.<init>(HStoreFile.java:554)
        at org.apache.hadoop.hbase.HStoreFile$BloomFilterMapFile$Reader.<init>(HStoreFile.java:609)
        at org.apache.hadoop.hbase.HStoreFile.getReader(HStoreFile.java:382)
        at org.apache.hadoop.hbase.HStore.<init>(HStore.java:849)
        at org.apache.hadoop.hbase.HRegion.<init>(HRegion.java:431)
        at org.apache.hadoop.hbase.HRegionServer.openRegion(HRegionServer.java:1258)
        at org.apache.hadoop.hbase.HRegionServer$Worker.run(HRegionServer.java:1204)
        at java.lang.Thread.run(Thread.java:595)

The machine logging all these errors is not the machine that went down and I'm not sure what the recovery procedure is for this error.

I appreciate any assistance.

Thanks in advance

Preston Price
[EMAIL PROTECTED]



