[
https://issues.apache.org/jira/browse/HDFS-6376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14108155#comment-14108155
]
Dave Marion commented on HDFS-6376:
-----------------------------------
bq. I think it might make more sense to explicitly specify the name service
that the DNs should report to. Since the changes are trivial, I'll provide
another patch.
I agree. I think there is a deficiency in the configuration properties. With
federation, an HDFS cluster is a set of nameservices (ns1, ns2, ns3, ns4).
However, I don't think you can define an alias for the overall cluster such
that cluster1 contains ns1 and ns2, and cluster2 contains ns3 and ns4. In
hdfs-site.xml, all of the nameservices are listed and the DN tries to connect
to all of them, with the downside that the first one that responds to the DN
assigns the cluster ID. My patch took the non-invasive approach, as I am not
familiar with the code base and all of the affected components. I think a
better long-term implementation would be to specify the clusters (cluster1,
cluster2) and their associated nameservices, and to specify which cluster is
"this" cluster.
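To make the longer-term idea concrete, here is a minimal sketch in Python (not
Hadoop code) of clusters mapped to their nameservices, plus a marker for "this"
cluster. The names CLUSTERS, all_nameservices and nameservices_to_report_to are
hypothetical and exist only for illustration; they are not Hadoop properties or
APIs. The sketch only shows the intended split: a client can resolve every
nameservice, while a DN reports only to the nameservices of its own cluster.

```python
# Hypothetical model of the proposal; none of these names are Hadoop
# configuration properties or APIs.
CLUSTERS = {
    "cluster1": ["ns1", "ns2"],
    "cluster2": ["ns3", "ns4"],
}


def all_nameservices(clusters):
    """Every nameservice a client may need to resolve, e.g. for distcp."""
    return [ns for members in clusters.values() for ns in members]


def nameservices_to_report_to(clusters, this_cluster):
    """Nameservices that a DN belonging to `this_cluster` should register with."""
    if this_cluster not in clusters:
        raise KeyError(f"unknown cluster: {this_cluster}")
    # Return a copy so callers cannot mutate the shared configuration.
    return list(clusters[this_cluster])


if __name__ == "__main__":
    # Client view: all four nameservices are visible.
    print(all_nameservices(CLUSTERS))
    # DN view on cluster1: only that cluster's nameservices.
    print(nameservices_to_report_to(CLUSTERS, "cluster1"))
```

With this separation, a DN in cluster1 would never contact ns3 or ns4, so it
could not be assigned the cluster ID of cluster2.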
> Distcp data between two HA clusters requires another configuration
> ------------------------------------------------------------------
>
> Key: HDFS-6376
> URL: https://issues.apache.org/jira/browse/HDFS-6376
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: datanode, federation, hdfs-client
> Affects Versions: 2.2.0, 2.3.0, 2.4.0
> Environment: Hadoop 2.3.0
> Reporter: Dave Marion
> Assignee: Dave Marion
> Fix For: 3.0.0
>
> Attachments: HDFS-6376-2.patch, HDFS-6376-3-branch-2.4.patch,
> HDFS-6376-4-branch-2.4.patch, HDFS-6376-5-trunk.patch,
> HDFS-6376-6-trunk.patch, HDFS-6376-7-trunk.patch, HDFS-6376-branch-2.4.patch,
> HDFS-6376-patch-1.patch, HDFS-6376.000.patch, HDFS-6376.008.patch
>
>
> A user has to create a third set of configuration files for distcp when
> transferring data between two HA clusters.
> Consider the scenario in [1]. You cannot put all of the required properties
> in core-site.xml and hdfs-site.xml for the client to resolve the location of
> both active namenodes. If you do, then the datanodes from cluster A may join
> cluster B. I cannot find a configuration option that tells the datanodes to
> federate blocks for only one of the clusters in the configuration.
> [1]
> http://mail-archives.apache.org/mod_mbox/hadoop-user/201404.mbox/%3CBAY172-W2133964E0C283968C161DD1520%40phx.gbl%3E