I just had an HDFS/HBase instance where all the slave/regionserver processes crashed, but the namenode stayed up. I then did a proper shutdown of the namenode.
After bringing Hadoop back up, the namenode is stuck in safe mode. Fsck shows 235 corrupt/missing blocks out of 117280 blocks. All the slaves are logging "DataBlockScanner: Verification succeeded". As far as I can tell there are no errors on the datanodes.

Can I expect it to self-heal, or do I need to do something to help it along? Is there any way to tell how long it will take to recover if I do just have to wait?

Other than the verification messages on the datanodes, nothing is moving: the namenode fsck numbers are not changing, and the namenode log continues to say:

    The ratio of reported blocks 0.9980 has not reached the threshold 0.9990. Safe mode will be turned off automatically.

The ratio has not changed for over an hour now.

If you happen to know the answer, please get back to me right away by email or on #hadoop IRC, as I'm trying to figure it out now...

Thanks!
__________________
Robert J Berger - CTO
Runa Inc.
+1 408-838-8896
http://blog.ibd.com
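
For reference, the numbers above are at least consistent with each other. The rough sketch below redoes the arithmetic in Python. It assumes that the 235 corrupt/missing blocks from fsck are exactly the blocks the namenode has not had reported, and that safe mode ends once reported/total reaches the threshold. The variable names are illustrative only, and the namenode's own rounding may differ by a block.

```python
import math

# Figures taken from the fsck output and the namenode log quoted above.
total_blocks = 117280      # total blocks according to fsck
missing_blocks = 235       # corrupt/missing blocks according to fsck
threshold = 0.9990         # threshold from the namenode log message

# Assumption: every block fsck does not flag has been reported to the namenode.
reported_blocks = total_blocks - missing_blocks
ratio = reported_blocks / total_blocks

# Assumption: safe mode ends once reported / total >= threshold.
needed_blocks = math.ceil(threshold * total_blocks)
shortfall = max(0, needed_blocks - reported_blocks)

print(f"reported ratio: {ratio:.4f}")
print(f"blocks needed to reach threshold: {needed_blocks}")
print(f"additional blocks that must be reported: {shortfall}")
```

Under those assumptions the ratio comes out at 0.9980, matching the log line, and about 118 of the 235 missing blocks would have to be reported before the threshold is met.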
