[jira] Commented: (HDFS-107) Data-nodes should be formatted when the name-node is formatted.

Konstantin Shvachko (JIRA) Sun, 23 Jan 2011 12:17:23 -0800

    [ 
https://issues.apache.org/jira/browse/HDFS-107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12985402#action_12985402
 ]


Konstantin Shvachko commented on HDFS-107:
------------------------------------------

In the current scenario data-nodes cannot mistakenly connect to another 
name-node as they will have different namespaceIds, and therefore blocks cannot 
be invalidated. Approach (2) will break this, but only in case of cTime=0.

> Data-nodes should be formatted when the name-node is formatted.
> ---------------------------------------------------------------
>
>                 Key: HDFS-107
>                 URL: https://issues.apache.org/jira/browse/HDFS-107
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: Konstantin Shvachko
>
> The upgrade feature HADOOP-702 requires data-nodes to store persistently the 
> namespaceID 
> in their version files and verify during startup that it matches the one 
> stored on the name-node.
> When the name-node reformats it generates a new namespaceID.
> Now if the cluster starts with the reformatted name-node, and not reformatted 
> data-nodes
> the data-nodes will fail with
> java.io.IOException: Incompatible namespaceIDs ...
> Data-nodes should be reformatted whenever the name-node is. I see 2 
> approaches here:
> 1) In order to reformat the cluster we call "start-dfs -format" or make a 
> special script "format-dfs".
> This would format the cluster components all together. The question is 
> whether it should start
> the cluster after formatting?
> 2) Format the name-node only. When data-nodes connect to the name-node it 
> will tell them to
> format their storage directories if it sees that the namespace is empty and 
> its cTime=0.
> The drawback of this approach is that we can loose blocks of a data-node from 
> another cluster
> if it connects by mistake to the empty name-node.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Commented: (HDFS-107) Data-nodes should be formatted when the name-node is formatted.

Reply via email to