[
https://issues.apache.org/jira/browse/HDFS-6376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14109793#comment-14109793
]
Haohui Mai commented on HDFS-6376:
----------------------------------
bq. My patch took the non-invasive approach as I am not familiar with the code
base and all of the affected components. I think a better long term
implementation would be to specify the clusters (cluster1, cluster2) and their
associated nameservices and specify which cluster is "this" cluster.
There is no fundamental difference between having an exclude list and an
include list for the clusters. It is somewhat easier to predict which NNs the
DNs are going to report to based on the configuration alone. I agree that
having the ability to specify what a cluster is will simplify the configuration.
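To illustrate the include/exclude point, here is a minimal Python sketch. It is
not HDFS code: the function names and the nameservice IDs (ns-a, ns-b, ns-c)
are hypothetical and only model which nameservices a DN would report to under
each kind of list.

```python
def report_targets_include(all_ns, include):
    """Nameservices a DN reports to when given an include list (hypothetical model)."""
    wanted = set(include)
    return [ns for ns in all_ns if ns in wanted]


def report_targets_exclude(all_ns, exclude):
    """Nameservices a DN reports to when given an exclude list (hypothetical model)."""
    unwanted = set(exclude)
    return [ns for ns in all_ns if ns not in unwanted]


# Two HA clusters visible in one configuration; this DN belongs to ns-a.
all_ns = ["ns-a", "ns-b"]
by_include = report_targets_include(all_ns, include=["ns-a"])
by_exclude = report_targets_exclude(all_ns, exclude=["ns-b"])

# For a fixed set of nameservices, both lists select the same targets.
assert by_include == by_exclude == ["ns-a"]

# If a third nameservice is later added to the configuration, the include
# list still yields the same answer, while the exclude list must be updated.
grown_ns = all_ns + ["ns-c"]
by_include_grown = report_targets_include(grown_ns, include=["ns-a"])
by_exclude_grown = report_targets_exclude(grown_ns, exclude=["ns-b"])
```

Under this model the two approaches are equivalent for a fixed configuration.
They differ only in what happens by default when another nameservice is added,
which is one way to read the predictability remark above.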
bq. I tried a change in DFSUtil in my patch6 (see below). I had to back it out
as it caused problems. I have had to use -ns in the admin commands and am used
to using it now. My point here is that if you have a complex configuration,
then you may need to be more specific in the commands that you execute. I think
it's fair to force the user to specify the -ns argument.
Agreed. I think it is fair to require users to specify the nameservice in
such complex settings.
> Distcp data between two HA clusters requires another configuration
> ------------------------------------------------------------------
>
> Key: HDFS-6376
> URL: https://issues.apache.org/jira/browse/HDFS-6376
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: datanode, federation, hdfs-client
> Affects Versions: 2.2.0, 2.3.0, 2.4.0
> Environment: Hadoop 2.3.0
> Reporter: Dave Marion
> Assignee: Dave Marion
> Fix For: 3.0.0
>
> Attachments: HDFS-6376-2.patch, HDFS-6376-3-branch-2.4.patch,
> HDFS-6376-4-branch-2.4.patch, HDFS-6376-5-trunk.patch,
> HDFS-6376-6-trunk.patch, HDFS-6376-7-trunk.patch, HDFS-6376-branch-2.4.patch,
> HDFS-6376-patch-1.patch, HDFS-6376.000.patch, HDFS-6376.008.patch,
> HDFS-6376.009.patch
>
>
> User has to create a third set of configuration files for distcp when
> transferring data between two HA clusters.
> Consider the scenario in [1]. You cannot put all of the required properties
> in core-site.xml and hdfs-site.xml for the client to resolve the location of
> both active namenodes. If you do, then the datanodes from cluster A may join
> cluster B. I cannot find a configuration option that tells the datanodes to
> federate blocks for only one of the clusters in the configuration.
> [1]
> http://mail-archives.apache.org/mod_mbox/hadoop-user/201404.mbox/%3CBAY172-W2133964E0C283968C161DD1520%40phx.gbl%3E