[
https://issues.apache.org/jira/browse/HDFS-1887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13028811#comment-13028811
]
sravankorumilli commented on HDFS-1887:
---------------------------------------
As far as I know, this problem can occur while the DataNode is being formatted, or when the storage file is corrupted manually or by some program. Can anyone comment on either of my proposed solutions?
1. Create a new storage file when an EOFException is thrown in
DataStorage.isConversionNeeded while reading LAYOUT_VERSION from the
storage file. OR
2. Report the storage state as NOT_FORMATTED when this problem occurs. However,
if the file was corrupted manually or by some program, this solution would not
be appropriate, since it would format the data directory.
> If the DataNode gets killed after 'data.dir' is created but before LAYOUTVERSION
> is written to the storage file, an EOFException will be thrown while reading the
> storage file on further restarts of the DataNode.
> -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
> Key: HDFS-1887
> URL: https://issues.apache.org/jira/browse/HDFS-1887
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: data-node
> Affects Versions: 0.20.1, 0.21.0, 0.23.0
> Environment: Linux
> Reporter: sravankorumilli
> Priority: Minor
>
> Assume the DataNode gets killed after 'data.dir' is created, but before
> LAYOUTVERSION is written to the storage file. On further restarts of the
> DataNode, an EOFException will be thrown while reading the storage file. The
> DataNode cannot be restarted successfully until 'data.dir' is deleted.
> These are the corresponding logs:
> 2011-05-02 19:12:19,389 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: java.io.EOFException
> at java.io.RandomAccessFile.readInt(RandomAccessFile.java:725)
> at org.apache.hadoop.hdfs.server.datanode.DataStorage.isConversionNeeded(DataStorage.java:203)
> at org.apache.hadoop.hdfs.server.common.Storage.checkConversionNeeded(Storage.java:697)
> at org.apache.hadoop.hdfs.server.common.Storage.access$000(Storage.java:62)
> at org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.analyzeStorage(Storage.java:476)
> at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:116)
> at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:336)
> at org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:260)
> at org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:237)
> at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1440)
> at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1393)
> at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1407)
> at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:1552)
> Our Hadoop cluster is managed by a cluster management software which tries to
> eliminate any manual intervention in setting up & managing the cluster. But
> in the above mentioned scenario, it requires manual intervention to recover
> the DataNode.
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira