All this aside, you really shouldn't have to "safely" stop all the Hadoop services when you reboot any of your servers. Hadoop should be able to survive a crash of any of the daemons. Any circumstance in which Hadoop currently corrupts the edits log or fsimage is a serious bug, and should be reported via JIRA.
--
Aaron T. Myers
Software Engineer, Cloudera

On Thu, Dec 23, 2010 at 7:29 AM, rahul patodi <[email protected]> wrote:
> Hi,
> If you want to reboot the server:
> 1. stop mapred
> 2. stop dfs
> then reboot.
> When you want to restart Hadoop, first start dfs, then mapred.
>
> --
> *Regards*,
> Rahul Patodi
> Software Engineer,
> Impetus Infotech (India) Pvt Ltd,
> www.impetus.com
> Mob:09907074413
>
>
> On Thu, Dec 23, 2010 at 6:15 PM, li ping <[email protected]> wrote:
> >
> > As far as I know, setting up a backup namenode dir is enough.
> >
> > I haven't used Hadoop in a production environment, so I can't tell you
> > what the right way to reboot the server would be.
> >
> > On Thu, Dec 23, 2010 at 6:50 PM, Bjoern Schiessle <[email protected]> wrote:
> > >
> > > Hi,
> > >
> > > On Thu, 23 Dec 2010 09:30:17 +0800 li ping wrote:
> > > > It seems the exception occurs while the NameNode loads the edit log.
> > > > Make sure the edit log file exists, or debug the application to see
> > > > what's wrong.
> > >
> > > Last night I tried to fix the problem and made a big mistake. Instead of
> > > copying /var/lib/hadoop-0.20/cache/hadoop/dfs/name/current/edits and
> > > edits.new to a backup, I moved them and later deleted the only version,
> > > because I thought I had a copy.
> > >
> > > The good thing: the namenode starts again.
> > > The bad thing: my file system is now in an inconsistent state.
> > >
> > > Probably the only solution is to reformat the HDFS and start from
> > > scratch. Thankfully there wasn't much data stored in HDFS yet, but I
> > > definitely have to make sure this doesn't happen again:
> > >
> > > 1. I have set up a second dfs.name.dir which is stored on another
> > > computer (mounted via sshfs).
> > > 2. I will install a backup script similar to:
> > > http://blog.milford.io/2010/10/simple-hadoop-namenode-backup-script
> > >
> > > Do you think this should be enough to handle such situations in the
> > > future? Any additional ideas on how to make it safer?
> > >
> > > I'm still a little bit worried when I think about the next time I will
> > > have to reboot the server. Shouldn't a reboot safely stop and restart
> > > all Hadoop services? Is there anything I can do to make sure that the
> > > next reboot will not cause the same problems?
> > >
> > > Thanks a lot!
> > > Björn
> > >
> >
> > --
> > -----李平
> >
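
For reference, the stop/start ordering Rahul describes maps onto the scripts shipped in Hadoop 0.20's bin/ directory. A minimal sketch, assuming a tarball install with HADOOP_HOME set (package installs such as CDH use init/service scripts instead, so the exact commands may differ):

    # Stop MapReduce first (JobTracker/TaskTrackers), then HDFS (NameNode/DataNodes)
    $HADOOP_HOME/bin/stop-mapred.sh
    $HADOOP_HOME/bin/stop-dfs.sh

    # reboot the machine
    sudo reboot

    # After the reboot, bring HDFS back first, then MapReduce
    $HADOOP_HOME/bin/start-dfs.sh
    $HADOOP_HOME/bin/start-mapred.sh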
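On Björn's point 1: a second dfs.name.dir is configured as a comma-separated list in hdfs-site.xml, and the NameNode then writes fsimage and edits to every listed directory. A sketch of the relevant property, using the local path from the thread plus a hypothetical sshfs/NFS mount point (/mnt/namenode-backup is only a placeholder):

    <!-- hdfs-site.xml on the NameNode: metadata is written to every directory listed -->
    <property>
      <name>dfs.name.dir</name>
      <!-- second path is a hypothetical remote mount used purely for redundancy -->
      <value>/var/lib/hadoop-0.20/cache/hadoop/dfs/name,/mnt/namenode-backup/dfs/name</value>
    </property>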
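On point 2: the linked backup script essentially pulls the current fsimage and edits over the NameNode's HTTP interface, the same way the SecondaryNameNode fetches them. A minimal sketch along those lines, assuming the default web port 50070 and Hadoop 0.20's getimage servlet; the hostname is a placeholder and the URL parameters should be verified against your version before relying on this:

    #!/bin/sh
    # Copy NameNode metadata via its HTTP interface (Hadoop 0.20)
    NN_HOST=namenode.example.com                       # hypothetical hostname
    BACKUP_DIR=/backup/namenode/$(date +%Y%m%d-%H%M%S)
    mkdir -p "$BACKUP_DIR"
    # fsimage and edits as currently served by the NameNode
    curl -sf "http://$NN_HOST:50070/getimage?getimage=1" -o "$BACKUP_DIR/fsimage"
    curl -sf "http://$NN_HOST:50070/getimage?getedit=1"  -o "$BACKUP_DIR/edits"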
