This has bitten me as well.  I used to have two possible partitions
depending on which kind of machine I was on.  Some machines had both
partitions available, but one was much smaller, and Hadoop had a nasty
tendency to fill up the smaller one.  Reordering the partitions in the
configuration helped, because it appears that the first partition listed is
always selected.  The free-space parameters do not appear to be honored in
any case.

The good news is that aggressive rebalancing seems to put things in the
right place.
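
For what it's worth, the relevant settings look roughly like the fragment
below (a hadoop-site.xml sketch; the paths are placeholders, and the property
names are the ones from this era of Hadoop):

```xml
<!-- hadoop-site.xml (sketch; paths are placeholders) -->
<property>
  <name>dfs.data.dir</name>
  <!-- List the larger partition first: in practice the first
       directory listed appears to be preferred for block placement. -->
  <value>/data1/dfs/data,/data2/dfs/data</value>
</property>
<property>
  <name>dfs.datanode.du.reserved</name>
  <!-- Bytes to keep free per volume (1 GB here); see HADOOP-2549
       for why this was not always honored at the time. -->
  <value>1073741824</value>
</property>
```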


On 1/8/08 9:34 AM, "Joydeep Sen Sarma" <[EMAIL PROTECTED]> wrote:

> Well - at least I know why this happened (I'm still looking for a way to
> restore the VERSION file).
> 
> https://issues.apache.org/jira/browse/HADOOP-2549 is causing disk-full on one
> of the disks (in spite of the du.reserved setting). It looks like, while
> starting up, the VERSION file could not be written and went missing. That
> seems like another bug - writing a tmp file and renaming it to VERSION would
> have prevented this mishap:
> 
> 2008-01-08 08:24:01,597 ERROR org.apache.hadoop.dfs.DataNode:
> java.io.IOException: No space left on device
>         at java.io.FileOutputStream.writeBytes(Native Method)
>         at java.io.FileOutputStream.write(FileOutputStream.java:260)
>         at sun.nio.cs.StreamEncoder$CharsetSE.writeBytes(StreamEncoder.java:336)
>         at sun.nio.cs.StreamEncoder$CharsetSE.implFlushBuffer(StreamEncoder.java:404)
>         at sun.nio.cs.StreamEncoder$CharsetSE.implFlush(StreamEncoder.java:408)
>         at sun.nio.cs.StreamEncoder.flush(StreamEncoder.java:152)
>         at java.io.OutputStreamWriter.flush(OutputStreamWriter.java:213)
>         at java.io.BufferedWriter.flush(BufferedWriter.java:236)
>         at java.util.Properties.store(Properties.java:666)
>         at org.apache.hadoop.dfs.Storage$StorageDirectory.write(Storage.java:176)
>         at org.apache.hadoop.dfs.Storage$StorageDirectory.write(Storage.java:164)
>         at org.apache.hadoop.dfs.Storage.writeAll(Storage.java:510)
>         at org.apache.hadoop.dfs.DataStorage.recoverTransitionRead(DataStorage.java:146)
>         at org.apache.hadoop.dfs.DataNode.startDataNode(DataNode.java:243)
> 
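The tmp-file-then-rename fix mentioned above can be sketched like this in
plain Java (illustrative code only, not the actual Hadoop Storage
implementation; the class and method names are made up). If the disk fills
mid-write, the failure hits the scratch file and the old VERSION survives
intact:

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.util.Properties;

public class AtomicVersionWrite {
    // Write the properties to a scratch file in the SAME directory,
    // flush it to disk, then rename it over the target. The rename is
    // atomic on POSIX filesystems, so readers see either the old file
    // or the complete new one - never a truncated/empty VERSION.
    static void writeAtomically(File target, Properties props) throws IOException {
        File tmp = new File(target.getParentFile(), target.getName() + ".tmp");
        FileOutputStream out = new FileOutputStream(tmp);
        try {
            props.store(out, null);
            out.getFD().sync();          // force bytes to disk before the rename
        } finally {
            out.close();
        }
        if (!tmp.renameTo(target)) {
            throw new IOException("rename " + tmp + " -> " + target + " failed");
        }
    }

    public static void main(String[] args) throws IOException {
        Properties p = new Properties();
        p.setProperty("storageType", "DATA_NODE");
        File target = File.createTempFile("VERSION", null);
        writeAtomically(target, p);

        // Read it back to show the write landed intact.
        Properties back = new Properties();
        FileInputStream in = new FileInputStream(target);
        back.load(in);
        in.close();
        System.out.println(back.getProperty("storageType"));
    }
}
```

Had Properties.store gone through a pattern like this, the ENOSPC in the
trace above would have left the previous VERSION file in place instead of an
empty one.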
> 
> -----Original Message-----
> From: Joydeep Sen Sarma [mailto:[EMAIL PROTECTED]
> Sent: Tue 1/8/2008 8:51 AM
> To: hadoop-user@lucene.apache.org
> Subject: missing VERSION files leading to failed datanodes
>  
> 
> 2008-01-08 08:36:20,045 ERROR org.apache.hadoop.dfs.DataNode:
> org.apache.hadoop.dfs.InconsistentFSStateException: Directory
> /var/hadoop/tmp/dfs/data is in an inconsistent state: file VERSION is invalid.
> 
> [EMAIL PROTECTED] data]# ssh hadoop003.sf2p cat
> /var/hadoop/tmp/dfs/data/current/VERSION
> [EMAIL PROTECTED] data]#
> 
> Any idea why the VERSION file is empty? And how can I regenerate it - or ask
> the system to generate a new one - without discarding all the blocks?
> 
> 
> I had previously shut down and started DFS once (to debug a different bug,
> where it's not honoring du.reserved - more on that later).
> 
> Help appreciated,
> 
> Joydeep
> 
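
(For anyone hitting this later: the datanode's current/VERSION is just a
java.util.Properties file, so a healthy one from this era looks roughly like
the sketch below. Every value here is a placeholder - in particular the
namespaceID must match the namenode's and the storageID should be unique per
datanode - so copy and adjust from a working node's file rather than
inventing values.)

```
#Tue Jan 08 08:24:01 PST 2008
namespaceID=123456789
storageID=DS-1234567890-10.0.0.3-50010-1199805841597
cTime=0
storageType=DATA_NODE
layoutVersion=-11
```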
