Hi.

After rebooting the NameNode server, I found that the NameNode no longer
starts.

The logs contained this error:
"FSNamesystem initialization failed"


I suspected filesystem corruption, so I tried to recover from the
SecondaryNameNode. Problem is, its directory was completely empty!

I had an issue that might have caused this - the root mount had run out of
space. But both the NameNode and the SecondaryNameNode directories were on
another mount point with plenty of free space, so it's very strange that
they were affected at all.

Perhaps the logs caused this? They were located on the root mount, so they
could not be written once it filled up.


To get HDFS running again, I had to format HDFS (including manually
erasing the files from the DataNodes). While this is acceptable in a test
environment, in production it would be very bad.

Any idea why it happened, and what can be done to prevent it in the future?
I'm using the stable 0.18.3 version of Hadoop.

Thanks in advance!
