[
https://issues.apache.org/jira/browse/HDFS-6376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14105586#comment-14105586
]
Dave Marion commented on HDFS-6376:
-----------------------------------
Thanks for reviewing this. I thought this was dead and that I would forever
have to patch Hadoop for our application. Out of curiosity, have others started
running into this issue?
bq. have you tested your patch for distcp between real clusters?
Yes. I have been running a version of this patch for about 2 months on a test
cluster. We are using Hadoop 2 so the patch that I am applying is a little
different. My Hadoop 2 patch also includes a change in the dfsclusterhealth.jsp
file so that only NameNodes in "this" cluster are shown. I could not find the
same jsp file in the Hadoop 3 source. Generally speaking, I think I have fixed
all locations in the code that need to be fixed, but I could be missing
something that I don't know about. As you can see from the patch history, I
thought I had to make a change to DFSUtil, but it broke some things and I had to
revert those changes.
bq. It will be great if you can generally mention how your patch works for both
secured and insecure HA clusters.
We are not using secured HA, so the patch has not been tested in that configuration.
bq. Another nit is that we need to fix indents in the new unit test.
I'm happy to fix.
bq. Maybe we can rename the new configuration from
"dfs.nameservice.cluster.excludes" to something like
"dfs.nameservices.cluster.outside"
I have no issues with changing the name. In my situation I have multiple HDFS
nameservices defined in hdfs-site.xml and I want to explicitly state which ones
are not part of "this" cluster. Exclude seemed like a good term for that.
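A minimal sketch of that setup, for illustration only. The nameservice IDs clusterA and clusterB are hypothetical, and the comma-separated value format for the new property is an assumption modeled on dfs.nameservices. The property name is the one currently proposed in the patch and may still change.

```xml
<!-- hdfs-site.xml as deployed on cluster A (sketch; IDs are hypothetical) -->
<property>
  <!-- Both nameservices are listed so a client such as distcp can resolve either one -->
  <name>dfs.nameservices</name>
  <value>clusterA,clusterB</value>
</property>
<property>
  <!-- Name as proposed in this patch; the value format is an assumption -->
  <!-- Nameservices listed here are treated as not part of "this" cluster -->
  <name>dfs.nameservice.cluster.excludes</name>
  <value>clusterB</value>
</property>
```

The usual per-nameservice HA properties (dfs.ha.namenodes.*, dfs.namenode.rpc-address.*, and so on) would still need to be defined for both nameservices. They are omitted here for brevity.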
> Distcp data between two HA clusters requires another configuration
> ------------------------------------------------------------------
>
> Key: HDFS-6376
> URL: https://issues.apache.org/jira/browse/HDFS-6376
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: datanode, federation, hdfs-client
> Affects Versions: 2.3.0, 2.4.0
> Environment: Hadoop 2.3.0
> Reporter: Dave Marion
> Assignee: Dave Marion
> Fix For: 3.0.0
>
> Attachments: HDFS-6376-2.patch, HDFS-6376-3-branch-2.4.patch,
> HDFS-6376-4-branch-2.4.patch, HDFS-6376-5-trunk.patch,
> HDFS-6376-6-trunk.patch, HDFS-6376-7-trunk.patch, HDFS-6376-branch-2.4.patch,
> HDFS-6376-patch-1.patch
>
>
> User has to create a third set of configuration files for distcp when
> transferring data between two HA clusters.
> Consider the scenario in [1]. You cannot put all of the required properties
> in core-site.xml and hdfs-site.xml for the client to resolve the location of
> both active namenodes. If you do, then the datanodes from cluster A may join
> cluster B. I cannot find a configuration option that tells the datanodes to
> federate blocks for only one of the clusters in the configuration.
> [1]
> http://mail-archives.apache.org/mod_mbox/hadoop-user/201404.mbox/%3CBAY172-W2133964E0C283968C161DD1520%40phx.gbl%3E
--
This message was sent by Atlassian JIRA
(v6.2#6252)