Hi,

On Thu, 23 Dec 2010 09:30:17 +0800 li ping wrote:
> It seems the exception occurs while the NameNode loads the edit log.
> Make sure the edit log file exists, or debug the application to
> see what's wrong.
Last night I tried to fix the problem and made a big mistake. Instead of copying /var/lib/hadoop-0.20/cache/hadoop/dfs/name/current/edits and edits.new to a backup, I moved them, and later deleted the only version because I thought I had a copy.

The good thing: the NameNode starts again. The bad thing: my file system is now in an inconsistent state. Probably the only solution is to reformat the HDFS and start from scratch. Thankfully there wasn't much data stored in HDFS yet, but I definitely have to make sure this doesn't happen again:

1. I have set up a second dfs.name.dir which is stored on another computer (mounted via sshfs); a sketch is in the P.S. below.
2. I will install a backup script similar to:
   http://blog.milford.io/2010/10/simple-hadoop-namenode-backup-script

Do you think this should be enough to cope with such situations in the future? Any additional ideas on how to make it safer?

I'm still a little afraid when I think about the next time I'll have to reboot the server. Shouldn't a reboot safely stop and restart all Hadoop services? Is there anything I can do to make sure that the next reboot will not cause the same problems?

Thanks a lot!
Björn
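
P.S. In case it helps to be concrete, this is roughly what I mean by the second dfs.name.dir: a minimal sketch of the hdfs-site.xml property with comma-separated directories. The first path is my existing name dir; /mnt/backup-nn/dfs/name is just a placeholder for wherever the sshfs mount ends up.

  <property>
    <name>dfs.name.dir</name>
    <value>/var/lib/hadoop-0.20/cache/hadoop/dfs/name,/mnt/backup-nn/dfs/name</value>
  </property>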
