Re: missing VERSION files leading to failed datanodes

Ted Dunning Tue, 08 Jan 2008 10:50:50 -0800

Dhruba,

From the discussion, it looks like the file was overwritten in place.

Is that good practice?  The usual way to handle this sort of update is to
write a temp file, move the live file to a backup, then move the temp file
to the live location.  Both moves are atomic (as long as everything stays on
one filesystem), so the worst case after a crash is that you wind up with
either a temp and a live file (ignore the temp file, since it may be
incomplete) or a backup and a temp file (move temp to live, since it must be
complete).
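
In case it helps, here is a minimal sketch of that sequence in Java.  It is
only an illustration, not taken from Hadoop's Storage code: the class name
SafeFileUpdate and the .tmp/.bak suffixes are made up, it assumes
java.nio.file (Java 7 or later), and it assumes all three files sit in the
same directory so the moves can be atomic.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.nio.file.StandardOpenOption;

// Hypothetical helper; names and suffixes are assumptions for illustration.
public final class SafeFileUpdate {
    private SafeFileUpdate() {}

    static Path tmpOf(Path live) {
        return live.resolveSibling(live.getFileName() + ".tmp");
    }

    static Path bakOf(Path live) {
        return live.resolveSibling(live.getFileName() + ".bak");
    }

    // Write temp, move live to backup, move temp to live.
    public static void update(Path live, byte[] content) throws IOException {
        Path tmp = tmpOf(live);
        Path bak = bakOf(live);

        // Step 1: write the new content to the temp file and force it to disk.
        // A full disk fails here, before the live file has been touched.
        try (FileChannel ch = FileChannel.open(tmp,
                StandardOpenOption.CREATE,
                StandardOpenOption.TRUNCATE_EXISTING,
                StandardOpenOption.WRITE)) {
            ByteBuffer buf = ByteBuffer.wrap(content);
            while (buf.hasRemaining()) {
                ch.write(buf);
            }
            ch.force(true);
        }

        // Step 2: move the live file to the backup name (skipped on first write).
        if (Files.exists(live)) {
            Files.deleteIfExists(bak); // drop the backup left by an earlier update
            Files.move(live, bak, StandardCopyOption.ATOMIC_MOVE);
        }

        // Step 3: move the complete temp file into the live place.
        Files.move(tmp, live, StandardCopyOption.ATOMIC_MOVE);
    }

    // Run at startup to clean up after an interrupted update.
    public static void recover(Path live) throws IOException {
        Path tmp = tmpOf(live);
        Path bak = bakOf(live);
        if (Files.exists(live)) {
            // Temp and live: the temp file may be incomplete, so ignore it.
            Files.deleteIfExists(tmp);
        } else if (Files.exists(tmp) && Files.exists(bak)) {
            // Backup and temp: the temp file must be complete, so promote it.
            Files.move(tmp, live, StandardCopyOption.ATOMIC_MOVE);
        }
    }
}
```

The idea would be to call recover(path) once at startup, before reading the
file, and update(path, bytes) for each rewrite.  The force(true) call is an
extra precaution of mine so the temp file's contents reach the disk before it
is renamed.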


On 1/8/08 10:06 AM, "dhruba Borthakur" <[EMAIL PROTECTED]> wrote:

> Hi Joydeep,
> 
> Which version of hadoop are you running? We had earlier fixed a bug
> https://issues.apache.org/jira/browse/HADOOP-2073
> in version 0.15.
> 
> Thanks,
> dhruba
> 
> -----Original Message-----
> From: Joydeep Sen Sarma [mailto:[EMAIL PROTECTED]
> Sent: Tuesday, January 08, 2008 9:34 AM
> To: [email protected]; [email protected]
> Subject: RE: missing VERSION files leading to failed datanodes
> 
> well - at least i know why this happened. (still looking for a way to
> restore the version file).
> 
> https://issues.apache.org/jira/browse/HADOOP-2549 is causing disk full
> on one of the disks (in spite of du.reserved setting). looks like while
> starting up - the VERSION file could not be written and went missing.
> that would seem like another bug (writing a tmp file and renaming it to
> VERSION file would have prevented this mishap):
> 
> 2008-01-08 08:24:01,597 ERROR org.apache.hadoop.dfs.DataNode: java.io.IOException: No space left on device
>         at java.io.FileOutputStream.writeBytes(Native Method)
>         at java.io.FileOutputStream.write(FileOutputStream.java:260)
>         at sun.nio.cs.StreamEncoder$CharsetSE.writeBytes(StreamEncoder.java:336)
>         at sun.nio.cs.StreamEncoder$CharsetSE.implFlushBuffer(StreamEncoder.java:404)
>         at sun.nio.cs.StreamEncoder$CharsetSE.implFlush(StreamEncoder.java:408)
>         at sun.nio.cs.StreamEncoder.flush(StreamEncoder.java:152)
>         at java.io.OutputStreamWriter.flush(OutputStreamWriter.java:213)
>         at java.io.BufferedWriter.flush(BufferedWriter.java:236)
>         at java.util.Properties.store(Properties.java:666)
>         at org.apache.hadoop.dfs.Storage$StorageDirectory.write(Storage.java:176)
>         at org.apache.hadoop.dfs.Storage$StorageDirectory.write(Storage.java:164)
>         at org.apache.hadoop.dfs.Storage.writeAll(Storage.java:510)
>         at org.apache.hadoop.dfs.DataStorage.recoverTransitionRead(DataStorage.java:146)
>         at org.apache.hadoop.dfs.DataNode.startDataNode(DataNode.java:243)
> 
> 
> -----Original Message-----
> From: Joydeep Sen Sarma [mailto:[EMAIL PROTECTED]
> Sent: Tue 1/8/2008 8:51 AM
> To: [email protected]
> Subject: missing VERSION files leading to failed datanodes
>  
> 
> 2008-01-08 08:36:20,045 ERROR org.apache.hadoop.dfs.DataNode:
> org.apache.hadoop.dfs.InconsistentFSStateException: Directory
> /var/hadoop/tmp/dfs/data is in an inconsistent state: file VERSION is
> invalid.
> 
> [EMAIL PROTECTED] data]# ssh hadoop003.sf2p cat /var/hadoop/tmp/dfs/data/current/VERSION
> [EMAIL PROTECTED] data]#
> 
> any idea why the VERSION file is empty? and how can i regenerate it - or
> ask the system to generate a new one without discarding all the blocks?
> 
> 
> i had previously shutdown and started dfs once (to debug a different bug
> where it's not honoring du.reserved. more on that later).
> 
> help appreciated,
> 
> Joydeep
>
