[ 
https://issues.apache.org/jira/browse/HDFS-1365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12906894#action_12906894
 ] 

Tanping Wang commented on HDFS-1365:
------------------------------------

ClusterID is the birthmark of a cluster.  It is a globally unique ID generated
when the cluster is created.  An HDFS cluster is initialized when its very
first namespace volume is created.  Formatting a NN with the "-newCluster"
option generates a unique ClusterID and a unique BlockPoolID, both of which are
persisted on the namenode.  Every subsequent NN must be given the same
ClusterID when it is formatted in order to be in the same cluster.  Each DN
discovers the ClusterID when it registers and from then on "sticks" to that
cluster.  If at any point a NN or a DN tries to join another cluster, the DNs
or the NNs in that cluster will reject the registration.

Why do we need ClusterID?  Globally unique BlockPoolIDs or NamespaceIDs alone
cannot solve this.  In a federation setup, a DN talks to multiple NNs.  If a DN
is accidentally moved to another cluster, it keeps its old blocks and creates
new block pools for the NNs in the new cluster.  The NamespaceID check that
previously prevented such moves does not work in this case.  The above content
can also be found in the design doc, high-level-design.pdf of HDFS-1052.
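
The registration rule described above can be sketched as a small model.  This
is an illustrative Python sketch only, not HDFS code: the class names, the
join() method, the exception type, and the use of random UUIDs for the IDs are
all assumptions made for the example (the real ID format is what this issue
proposes).

```python
import uuid


class ClusterMismatchError(Exception):
    """Raised when a datanode is asked to join a different cluster."""


class NameNode:
    """Hypothetical model of a formatted namenode."""

    def __init__(self, cluster_id=None):
        # cluster_id=None models formatting with "-newCluster": a fresh
        # ClusterID is generated.  Otherwise the caller supplies the
        # ClusterID of the cluster this namenode should belong to.
        self.cluster_id = cluster_id or str(uuid.uuid4())
        # Each namenode gets its own unique BlockPoolID (placeholder format).
        self.block_pool_id = str(uuid.uuid4())


class DataNode:
    """Hypothetical model of a datanode that sticks to one cluster."""

    def __init__(self):
        self.cluster_id = None    # unknown until the first registration
        self.block_pools = set()  # one block pool per namenode served

    def join(self, namenode):
        if self.cluster_id is None:
            # First registration: learn the ClusterID and keep it.
            self.cluster_id = namenode.cluster_id
        elif self.cluster_id != namenode.cluster_id:
            # Registration across clusters is rejected, so no block pool
            # is created for a namenode of a foreign cluster.
            raise ClusterMismatchError(
                "datanode belongs to cluster %s, namenode to %s"
                % (self.cluster_id, namenode.cluster_id))
        self.block_pools.add(namenode.block_pool_id)
```

In this model a DN can serve several NNs that share one ClusterID, while a
join() against a NN formatted as a new cluster fails before any new block pool
is created, which is the accidental-move case a NamespaceID check alone would
miss.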

> HDFS federation: propose ClusterID and BlockPoolID format
> ---------------------------------------------------------
>
>                 Key: HDFS-1365
>                 URL: https://issues.apache.org/jira/browse/HDFS-1365
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Tanping Wang
>            Assignee: Tanping Wang
>             Fix For: Federation Branch
>
>


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
