Dru Jensen wrote:
St.Ack. Thanks for your response.
I will enable DEBUG and rerun the MR processes to try and reproduce this.
Hadoop is reporting everything is healthy using fsck.
This is a test platform so the data is not critical but my confidence
is shaken (I sound like a day trader).
Understood.
Questions:
1. Is there anything specific I should be looking for when I enable
DEBUG?
That's a bit of a tough question; it requires some study to interpret. In
short, look at the lines before ERRORs and WARNINGs for anything that
might explain why the ERROR or WARNING happened
(NotServingRegionExceptions are part of 'normal' operation -- it's the
other exception types you are interested in).
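For reference, turning on DEBUG usually means editing the log4j config shipped in conf/. A rough sketch (logger names are the usual Hadoop/HBase package prefixes; exact file contents vary by release):

```properties
# conf/log4j.properties -- bump the HBase loggers from INFO to DEBUG
log4j.logger.org.apache.hadoop.hbase=DEBUG
# Optionally watch the DFS client too, since these errors involve HDFS reads
log4j.logger.org.apache.hadoop.dfs=DEBUG
```

Restart the daemons (or use the master/regionserver log-level UI if your release has one) for the change to take effect.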
2. Does "bad" mean I cannot recover and I need to delete
/hbase.rootdir and start over?
Not if fsck says all is ok. I said 'bad' because I thought you would
have to do the above.
Any events on your cluster that might have affected HDFS? A tsunami
hit? Or in your case, it wouldn't take much since replication was set
to one -- did a host crash?
3. Does HBase depend on replication for normal operation? In other
words, will it work without replication enabled?
It'll work fine without replication until you lose data. Thereafter,
it'll be hobbled by files with holes in them -- where the holes are
blocks that sat on the downed server.
Would suggest running with replication of 3 unless you have very good
reason -- and insurance against failure -- for doing otherwise.
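In case it helps, restoring the default replication looks roughly like this (the property lives in hadoop-site.xml on older Hadoop releases, hdfs-site.xml on newer ones):

```xml
<!-- hadoop-site.xml (or hdfs-site.xml on newer releases) -->
<property>
  <name>dfs.replication</name>
  <value>3</value>
  <description>Default block replication; 3 is the HDFS default.</description>
</property>
```

Note this only applies to files written afterward; to re-replicate existing files you would run something like `./bin/hadoop fs -setrep -R 3 /`.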
Go easy Dru,
St.Ack
On Sep 29, 2008, at 1:27 PM, stack wrote:
Dru Jensen wrote:
HBase was not responding to Thrift requests so I tried to restart
but it still looks frozen. I am seeing several error messages in
the hmaster logs after I attempted to restart hbase:
2008-09-29 12:55:23,744 ERROR
org.apache.hadoop.hbase.regionserver.HRegionServer: error opening
region {table},{key},1222453917858
java.io.IOException: Premeture EOF from inputStream
at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:100)
at
org.apache.hadoop.dfs.DFSClient$BlockReader.readChunk(DFSClient.java:967)
at
org.apache.hadoop.fs.FSInputChecker.readChecksumChunk(FSInputChecker.java:236)
Enable DEBUG and it might tell you what it was trying to open at time
of the exception.
and:
2008-09-29 12:58:50,067 ERROR
org.apache.hadoop.hbase.regionserver.HRegionServer: error opening
region {table},{key},1222453917858
java.io.IOException: Could not obtain block: blk_-2905695662732817278
This is bad. What happens if you run './bin/hadoop fsck
/hbase.rootdir'? Your replication is one. That means if there is any
HDFS hiccup, data is lost. You might try putting replication back to
the default?
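A sketch of that fsck check (the extra flags print per-file block detail and are standard Hadoop fsck options; the path is the example rootdir from this thread):

```shell
# Check the HBase root directory for missing or under-replicated blocks
./bin/hadoop fsck /hbase.rootdir -files -blocks -locations
```

A healthy filesystem ends the report with "The filesystem under path ... is HEALTHY"; missing blocks show up as CORRUPT entries.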
St.Ack