[ 
https://issues.apache.org/jira/browse/HDFS-1887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13028811#comment-13028811
 ] 

sravankorumilli commented on HDFS-1887:
---------------------------------------

As far as I know, this problem can occur while the DataNode is being formatted, or when the storage file is corrupted manually or by some program. Can anyone comment on my proposed solutions?
1. Create a new storage file when an EOFException is thrown in the 
DataStorage.isConversionNeeded method while reading LAYOUT_VERSION from the 
storage file. OR
2. Report the storage state as NOT_FORMATTED when this problem occurs. 
However, if the file was corrupted manually or by some program, this solution 
would not be appropriate, since it would format the data dir.

> If the DataNode gets killed after 'data.dir' is created but before LAYOUTVERSION 
> is written to the storage file, an EOFException will be thrown while reading the 
> storage file on further restarts of the DataNode. 
> -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-1887
>                 URL: https://issues.apache.org/jira/browse/HDFS-1887
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: data-node
>    Affects Versions: 0.20.1, 0.21.0, 0.23.0
>         Environment: Linux
>            Reporter: sravankorumilli
>            Priority: Minor
>
> Assume the DataNode gets killed after 'data.dir' is created but before 
> LAYOUTVERSION is written to the storage file. On further restarts of the 
> DataNode, an EOFException is thrown while reading the storage file. The 
> DataNode cannot be restarted successfully until 'data.dir' is deleted.
> These are the corresponding logs:
> 2011-05-02 19:12:19,389 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: java.io.EOFException
> at java.io.RandomAccessFile.readInt(RandomAccessFile.java:725)
> at org.apache.hadoop.hdfs.server.datanode.DataStorage.isConversionNeeded(DataStorage.java:203)
> at org.apache.hadoop.hdfs.server.common.Storage.checkConversionNeeded(Storage.java:697)
> at org.apache.hadoop.hdfs.server.common.Storage.access$000(Storage.java:62)
> at org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.analyzeStorage(Storage.java:476)
> at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:116)
> at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:336)
> at org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:260)
> at org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:237)
> at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1440)
> at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1393)
> at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1407)
> at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:1552)
> Our Hadoop cluster is managed by cluster-management software that tries to 
> eliminate manual intervention in setting up and managing the cluster. But in 
> the above-mentioned scenario, manual intervention is required to recover the 
> DataNode.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
