Hi Ankur,

We have had this kind of failure reported by others earlier on this list. This might help you:
http://markmail.org/message/u6l6lwus33oeivcd

Thanks,
Lohit

----- Original Message ----
From: ANKUR GOEL <[EMAIL PROTECTED]>
To: [email protected]; [EMAIL PROTECTED]
Sent: Thursday, November 13, 2008 4:34:15 AM
Subject: Namenode Failure

Hi Folks,

We have been running the hadoop-0.17.2 release on a 50-machine cluster, and we recently experienced a namenode failure because the disk became full. The node is now unable to start up and throws the following exception:

2008-11-13 06:41:18,618 INFO org.apache.hadoop.ipc.Server: Stopping server on 9000
2008-11-13 06:41:18,619 ERROR org.apache.hadoop.dfs.NameNode: java.io.EOFException
        at java.io.DataInputStream.readFully(DataInputStream.java:180)
        at org.apache.hadoop.io.UTF8.readFields(UTF8.java:106)
        at org.apache.hadoop.io.ArrayWritable.readFields(ArrayWritable.java:90)
        at org.apache.hadoop.dfs.FSEditLog.loadFSEdits(FSEditLog.java:599)
        at org.apache.hadoop.dfs.FSImage.loadFSEdits(FSImage.java:766)
        at org.apache.hadoop.dfs.FSImage.loadFSImage(FSImage.java:640)
        at org.apache.hadoop.dfs.FSImage.doUpgrade(FSImage.java:250)
        at org.apache.hadoop.dfs.FSImage.recoverTransitionRead(FSImage.java:217)
        at org.apache.hadoop.dfs.FSDirectory.loadFSImage(FSDirectory.java:80)
        at org.apache.hadoop.dfs.FSNamesystem.initialize(FSNamesystem.java:274)
        at org.apache.hadoop.dfs.FSNamesystem.<init>(FSNamesystem.java:255)
        at org.apache.hadoop.dfs.NameNode.initialize(NameNode.java:133)

What is the best way to recover from this failure with minimal data loss? I could not find instructions on the wiki or anywhere else for release 0.17.2 on doing recovery using files from the secondary namenode.

Any help is greatly appreciated.

Thanks
-Ankur
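
For anyone reading the trace above: the top frame is DataInputStream.readFully, which throws EOFException when the stream ends before the requested number of bytes has been read. That would be consistent with a record that was only partly written when the disk filled up. Below is a minimal sketch of that behaviour, not Hadoop code. The record layout (a 2-byte length followed by the payload), the class name TruncatedRecordDemo, and the path "/tmp/example" are all assumptions made up for illustration; the real edit log format is not reproduced here.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class; it does not use or imitate any Hadoop API.
class TruncatedRecordDemo {

    // Assumed record layout for illustration: 2-byte length, then the payload bytes.
    static byte[] writeRecord(String value) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(buffer);
        byte[] payload = value.getBytes(StandardCharsets.UTF_8);
        out.writeShort(payload.length);
        out.write(payload);
        out.flush();
        return buffer.toByteArray();
    }

    // Reads one record; readFully throws EOFException if the payload is cut short.
    static String readRecord(byte[] data) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
        int length = in.readUnsignedShort();
        byte[] payload = new byte[length];
        in.readFully(payload);
        return new String(payload, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // "/tmp/example" is a made-up value standing in for a logged path.
        byte[] complete = writeRecord("/tmp/example");
        System.out.println("complete record: " + readRecord(complete));

        // Simulate a write that stopped early by dropping the last 5 bytes.
        byte[] truncated = Arrays.copyOf(complete, complete.length - 5);
        try {
            readRecord(truncated);
        } catch (EOFException e) {
            System.out.println("truncated record: EOFException");
        }
    }
}
```

The point of the sketch is only that a reader which trusts the length prefix has no way to continue once the bytes run out, so it fails at the first incomplete record rather than skipping it.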
