[ 
https://issues.apache.org/jira/browse/HADOOP-5342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13035405#comment-13035405
 ] 

ramkrishna.s.vasudevan commented on HADOOP-5342:
------------------------------------------------

I would like to suggest the following. Please correct me if I am wrong.

The namespace ID is currently updated as soon as one of the dfs.data.dir 
disks has been updated. Instead, update the namespace ID only after all of 
the dfs.data.dir storage directories have been parsed.
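
To illustrate the suggestion, here is a minimal sketch of a two-pass load. It 
is not the actual DataStorage code: the class DeferredNamespaceUpdate, its 
fields, and the load method are hypothetical names, the map stands in for the 
version files on disk, and skipping the stale directories rather than 
failing is an assumption based on the issue description.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; none of these names come from the Hadoop code base.
class DeferredNamespaceUpdate {
    int namespaceId = 0;                               // 0 means "not set yet" in this sketch
    final List<String> usableDirs = new ArrayList<>();
    final List<String> skippedDirs = new ArrayList<>();

    /**
     * versionByDir maps each storage directory to the namespace ID read from
     * its version file, or to null when the version file is missing.
     */
    void load(Map<String, Integer> versionByDir, int expectedId) {
        List<String> usable = new ArrayList<>();
        List<String> skipped = new ArrayList<>();

        // Pass 1: parse every storage directory without changing any state.
        for (Map.Entry<String, Integer> entry : versionByDir.entrySet()) {
            Integer id = entry.getValue();
            if (id != null && id == expectedId) {
                usable.add(entry.getKey());
            } else {
                skipped.add(entry.getKey());           // missing or outdated version
            }
        }

        // Fail only when no directory at all is consistent.
        if (usable.isEmpty()) {
            throw new IllegalStateException("no storage directory has a consistent version");
        }

        // Pass 2: update the namespace ID only after all directories were parsed.
        namespaceId = expectedId;
        usableDirs.addAll(usable);
        skippedDirs.addAll(skipped);
    }

    public static void main(String[] args) {
        Map<String, Integer> dirs = new LinkedHashMap<>();
        dirs.put("/disk1", 42);    // consistent
        dirs.put("/disk2", null);  // version file missing
        dirs.put("/disk3", 7);     // outdated namespace ID

        DeferredNamespaceUpdate storage = new DeferredNamespaceUpdate();
        storage.load(dirs, 42);
        System.out.println("namespaceId=" + storage.namespaceId
                + " usable=" + storage.usableDirs
                + " skipped=" + storage.skippedDirs);
    }
}
```

With this ordering, one bad directory cannot leave the namespace ID half 
updated, because nothing is changed until every directory has been looked at.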

> DataNodes do not start up because InconsistentFSStateException on just part 
> of the disks in use
> -----------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-5342
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5342
>             Project: Hadoop Common
>          Issue Type: Bug
>    Affects Versions: 0.18.2
>            Reporter: Christian Kunz
>            Assignee: Hairong Kuang
>            Priority: Critical
>
> After restarting a cluster (including rebooting) the dfs got corrupted 
> because many DataNodes did not start up, running into the following exception:
> 2009-02-26 22:33:53,774 ERROR org.apache.hadoop.dfs.DataNode: 
> org.apache.hadoop.dfs.InconsistentFSStateException: Directory xxx  is in an 
> inconsistent state: version file in current directory is missing.
>       at org.apache.hadoop.dfs.Storage$StorageDirectory.analyzeStorage(Storage.java:326)
>       at org.apache.hadoop.dfs.DataStorage.recoverTransitionRead(DataStorage.java:105)
>       at org.apache.hadoop.dfs.DataNode.startDataNode(DataNode.java:306)
>       at org.apache.hadoop.dfs.DataNode.<init>(DataNode.java:223)
>       at org.apache.hadoop.dfs.DataNode.makeInstance(DataNode.java:3030)
>       at org.apache.hadoop.dfs.DataNode.instantiateDataNode(DataNode.java:2985)
>       at org.apache.hadoop.dfs.DataNode.createDataNode(DataNode.java:2993)
>       at org.apache.hadoop.dfs.DataNode.main(DataNode.java:3115)
> This happens when using multiple disks with at least one previously marked as 
> read-only, such that the storage version became out-dated, but after reboot 
> it was mounted read-write, resulting in the DataNode not starting because of 
> out-dated version.
> This is a big headache. If a DataNode has multiple disks of which at least 
> one has the correct storage version then out-dated versions should not bring 
> down the DataNode.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
