[ https://issues.apache.org/jira/browse/HDFS-1365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12906894#action_12906894 ]
Tanping Wang commented on HDFS-1365:
------------------------------------
ClusterID is the birthmark of a cluster. It is a globally unique ID created when
a cluster is created. An HDFS cluster is initialized when the very first
namespace volume of the cluster is created. Formatting a NN with the
"-newCluster" option generates a unique ClusterID and a unique BlockPoolID,
which are persisted on the namenode. Each subsequent NN must be given the same
ClusterID during its format to be in the same cluster. Each DN discovers the
ClusterID when it registers and from then on "sticks" to this cluster. If at any
point a NN or a DN tries to join another cluster, the DNs or the NNs in that
cluster will reject the registration.

Why do we need ClusterID? We cannot prevent such accidental moves merely with
global BlockPoolIDs or NamespaceIDs. In a federation setup, a DN talks to
multiple NNs. If a DN is accidentally moved to another cluster, the DN continues
to keep its old blocks and creates new block pools for the NNs in the new
cluster. The NamespaceID that previously prevented such moves will not work in
this case.

The above content can also be found in the design doc, high-level-design.pdf,
of HDFS-1052.
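
The registration rule described above can be sketched roughly as follows. This
is only an illustrative Python model, not the HDFS implementation: the class
names, the "CID-"/"BP-" prefixes, the use of UUIDs, and the assumption that
every formatted NN gets its own BlockPoolID are all made up for the example.

```python
import uuid


class ClusterMismatchError(Exception):
    """Raised when a DN tries to register with a NN from another cluster."""


class NameNode:
    """Hypothetical stand-in for a formatted namenode."""

    def __init__(self, cluster_id, block_pool_id):
        self.cluster_id = cluster_id
        self.block_pool_id = block_pool_id

    @classmethod
    def format(cls, cluster_id=None):
        # cluster_id=None models formatting with "-newCluster": a fresh
        # ClusterID is generated. Otherwise the caller passes the ClusterID
        # of the cluster this NN should belong to.
        if cluster_id is None:
            cluster_id = "CID-" + uuid.uuid4().hex  # ID format is an assumption
        # Assumption: each formatted NN gets its own unique BlockPoolID.
        return cls(cluster_id, "BP-" + uuid.uuid4().hex)


class DataNode:
    """Hypothetical stand-in for a datanode that 'sticks' to one cluster."""

    def __init__(self):
        self.cluster_id = None    # unknown until the first registration
        self.block_pools = set()  # one block pool per NN it serves

    def register(self, nn):
        if self.cluster_id is None:
            # First registration: learn the ClusterID and stick to it.
            self.cluster_id = nn.cluster_id
        elif self.cluster_id != nn.cluster_id:
            # Check the ClusterID before creating any block pool, so a
            # misplaced DN never mixes pools from two clusters.
            raise ClusterMismatchError(
                "DN belongs to %s, NN belongs to %s"
                % (self.cluster_id, nn.cluster_id)
            )
        self.block_pools.add(nn.block_pool_id)
```

In this sketch, a DN that has registered with one cluster raises
ClusterMismatchError when pointed at a NN formatted for a different cluster,
instead of silently creating a new block pool next to its old blocks.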
> HDFS federation: propose ClusterID and BlockPoolID format
> ---------------------------------------------------------
>
> Key: HDFS-1365
> URL: https://issues.apache.org/jira/browse/HDFS-1365
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Reporter: Tanping Wang
> Assignee: Tanping Wang
> Fix For: Federation Branch
>
>