St.Ack.  Thanks for your response.

I will enable DEBUG and rerun the MR processes to try and reproduce this.
Hadoop is reporting everything is healthy using fsck.
This is a test platform so the data is not critical but my confidence is shaken (I sound like a day trader).

Questions:
1. Is there anything specific I should be looking for when I enable DEBUG?

2. Does "bad" mean I cannot recover and I need to delete / hbase.rootdir and start over?

3. Does HBase depend on replication for normal operation? In other words, will it work without replication enabled?



On Sep 29, 2008, at 1:27 PM, stack wrote:

Dru Jensen wrote:
HBase was not responding to Thrift requests, so I tried to restart it, but it still looks frozen. I am seeing several error messages in the hmaster logs after I attempted to restart HBase:

2008-09-29 12:55:23,744 ERROR org.apache.hadoop.hbase.regionserver.HRegionServer: error opening region {table},{key},1222453917858
java.io.IOException: Premeture EOF from inputStream
   at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:100)
   at org.apache.hadoop.dfs.DFSClient$BlockReader.readChunk(DFSClient.java:967)
   at org.apache.hadoop.fs.FSInputChecker.readChecksumChunk(FSInputChecker.java:236)

Enable DEBUG and it might tell you what it was trying to open at time of the exception.


and:

2008-09-29 12:58:50,067 ERROR org.apache.hadoop.hbase.regionserver.HRegionServer: error opening region {table},{key},1222453917858
java.io.IOException: Could not obtain block: blk_-2905695662732817278

This is bad. What happens if you run './bin/hadoop fsck / hbase.rootdir'? Your replication is one, which means that on any HDFS hiccup, data is lost. You might try putting replication back to the default?



St.Ack
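
For anyone following the thread: the "Could not obtain block" errors quoted above name specific HDFS blocks, so one way to narrow things down might be to collect those block IDs from the logs and look for them in the fsck output. Below is a minimal sketch, assuming the log lines have the same shape as the excerpt in this message. The function name `missing_blocks` is hypothetical and not part of HBase or Hadoop.

```python
import re

# Matches the HDFS client error shown in the log excerpt, e.g.
# "java.io.IOException: Could not obtain block: blk_-2905695662732817278"
BLOCK_ERROR = re.compile(r"Could not obtain block: (blk_-?\d+)")


def missing_blocks(log_lines):
    """Return distinct block IDs from 'Could not obtain block' errors,
    in the order they first appear. (Hypothetical helper.)"""
    seen = []
    for line in log_lines:
        match = BLOCK_ERROR.search(line)
        if match and match.group(1) not in seen:
            seen.append(match.group(1))
    return seen
```

The resulting IDs could then be searched for in the output of fsck run with its `-files -blocks` options, which should show the file each block belongs to.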

