well - at least i know why this happened (still looking for a way to restore the VERSION file).
https://issues.apache.org/jira/browse/HADOOP-2549 is causing one of the disks to fill up (in spite of the du.reserved setting). looks like while starting up - the VERSION file could not be written and went missing. that would seem like another bug (writing a tmp file and renaming it to the VERSION file would have prevented this mishap):

2008-01-08 08:24:01,597 ERROR org.apache.hadoop.dfs.DataNode: java.io.IOException: No space left on device
    at java.io.FileOutputStream.writeBytes(Native Method)
    at java.io.FileOutputStream.write(FileOutputStream.java:260)
    at sun.nio.cs.StreamEncoder$CharsetSE.writeBytes(StreamEncoder.java:336)
    at sun.nio.cs.StreamEncoder$CharsetSE.implFlushBuffer(StreamEncoder.java:404)
    at sun.nio.cs.StreamEncoder$CharsetSE.implFlush(StreamEncoder.java:408)
    at sun.nio.cs.StreamEncoder.flush(StreamEncoder.java:152)
    at java.io.OutputStreamWriter.flush(OutputStreamWriter.java:213)
    at java.io.BufferedWriter.flush(BufferedWriter.java:236)
    at java.util.Properties.store(Properties.java:666)
    at org.apache.hadoop.dfs.Storage$StorageDirectory.write(Storage.java:176)
    at org.apache.hadoop.dfs.Storage$StorageDirectory.write(Storage.java:164)
    at org.apache.hadoop.dfs.Storage.writeAll(Storage.java:510)
    at org.apache.hadoop.dfs.DataStorage.recoverTransitionRead(DataStorage.java:146)
    at org.apache.hadoop.dfs.DataNode.startDataNode(DataNode.java:243)

-----Original Message-----
From: Joydeep Sen Sarma [mailto:[EMAIL PROTECTED]
Sent: Tue 1/8/2008 8:51 AM
To: hadoop-user@lucene.apache.org
Subject: missing VERSION files leading to failed datanodes

2008-01-08 08:36:20,045 ERROR org.apache.hadoop.dfs.DataNode: org.apache.hadoop.dfs.InconsistentFSStateException: Directory /var/hadoop/tmp/dfs/data is in an inconsistent state: file VERSION is invalid.

[EMAIL PROTECTED] data]# ssh hadoop003.sf2p cat /var/hadoop/tmp/dfs/data/current/VERSION
[EMAIL PROTECTED] data]#

any idea why the VERSION file is empty?
and how can i regenerate it - or ask the system to generate a new one without discarding all the blocks?

i had previously shut down and started dfs once (to debug a different bug where it's not honoring du.reserved; more on that later).

help appreciated,
Joydeep
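
for reference, a minimal sketch of the "write a tmp file and rename it to VERSION" idea from the reply at the top of this thread. this is not the actual Hadoop Storage code: the class name AtomicVersionWriter, the method writeAtomically, and the property values in main are made up for illustration, and it uses java.nio.file (Java 7+) rather than anything from the Hadoop tree. the point is only that a failed write (for example "No space left on device") hits the tmp file, so the existing VERSION file is left untouched.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Properties;

/** Hypothetical helper; not part of Hadoop. */
public class AtomicVersionWriter {

    /**
     * Stores props in a temporary file in the same directory as target,
     * then renames the temporary file over target. If the write fails,
     * target keeps its previous contents.
     */
    public static void writeAtomically(Path target, Properties props) throws IOException {
        Path dir = target.toAbsolutePath().getParent();
        // Same directory as the target, so the rename stays on one filesystem.
        Path tmp = Files.createTempFile(dir, target.getFileName().toString(), ".tmp");
        try {
            try (OutputStream out = Files.newOutputStream(tmp)) {
                props.store(out, null); // an IOException here leaves target untouched
            }
            // Whether ATOMIC_MOVE replaces an existing target is implementation
            // specific; on Linux it maps to rename(2), which does replace it.
            Files.move(tmp, target, StandardCopyOption.ATOMIC_MOVE);
        } finally {
            // No-op after a successful move; cleans up the tmp file on failure.
            Files.deleteIfExists(tmp);
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("storage-demo");
        Path version = dir.resolve("VERSION");

        Properties props = new Properties();
        props.setProperty("layoutVersion", "-11"); // illustrative value only
        props.setProperty("storageType", "DATA_NODE"); // illustrative value only

        writeAtomically(version, props);
        System.out.println("wrote " + version);
    }
}
```

a more careful version would probably also force the tmp file to disk before the rename, so that a crash right after the rename cannot leave an empty VERSION file behind.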