Barry:

From the log below, this looks like an issue in HDFS. If a regionserver has trouble talking to HDFS, it shuts itself down.

Tell us more. Are there other heavy-duty processes running on the same servers that host the datanodes and regionservers? Enable DEBUG logging on your cluster and make sure you've raised your ulimit for file descriptors from the default. See the FAQ on the wiki for how to do both.
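
For the file descriptor limit, a quick way to see what a process actually gets is a small script like the one below. It is only a sketch: the 1024 threshold is an assumption (a common Linux default, not a number HBase mandates), and the function names are made up for illustration. Run it as the user that starts the datanode and regionserver processes, since the limit is per user and per session.

```python
import resource

# Assumption: 1024 is a common Linux default for open files per process.
# It is used here only as a comparison point for illustration.
ASSUMED_DEFAULT_NOFILE = 1024


def current_nofile_limits():
    """Return the (soft, hard) open-file limits of this process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def is_raised_above_default(soft_limit, default=ASSUMED_DEFAULT_NOFILE):
    """Return True if the soft limit is unlimited or above the assumed default."""
    if soft_limit == resource.RLIM_INFINITY:
        return True
    return soft_limit > default


if __name__ == "__main__":
    soft, hard = current_nofile_limits()
    print("open files: soft=%s hard=%s" % (soft, hard))
    if not is_raised_above_default(soft):
        print("soft limit looks like the default; consider raising it")
```

If the soft limit still shows the default, the ulimit change has probably not reached the session that launches the daemons.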

Thanks,
St.Ack

Barry Haddow wrote:
Hi

I recently set up a small hbase cluster (v 0.18) running on top of hadoop v.0.18.1. However I'm observing that the region servers spontaneously shut themselves down, usually with an UnknownScannerException. For instance, this weekend, I discovered that all four had shut down, with messages like the following in the logs:

2008-09-29 05:50:17,203 INFO org.apache.hadoop.dfs.DFSClient: Exception in createBlockOutputStream java.io.IOException: Bad connect ack with firstBadLink 129.215.197.39:50010
2008-09-29 05:50:17,203 INFO org.apache.hadoop.dfs.DFSClient: Abandoning block blk_-5829206400135277905_3045
2008-09-29 07:29:16,552 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: MSG_CALL_SERVER_STARTUP
2008-09-29 07:46:35,796 INFO org.apache.hadoop.ipc.Server: IPC Server handler 4 on 60020, call next(-1347145425990165691) from 129.215.197.39:6999: error: org.apache.hadoop.hbase.UnknownScannerException: Name: -1347145425990165691


The underlying hdfs seems fine - fsck reports the hbase directory as healthy. After a restart hbase seems fine too, but surely the regionservers should stay up once they're started.

Any suggestions?

regards
Barry

