[
https://issues.apache.org/jira/browse/HDFS-6376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14108157#comment-14108157
]
Dave Marion commented on HDFS-6376:
-----------------------------------
bq. Currently DFSUtil#getOnlyNameServiceIdOrNull returns null if there are more
than two nameservices specified. There are a couple of places that call this
method, and it looks like DFSHAAdmin#resolveTarget may hit an issue if no -ns
option is specified in HAAdmin. Thus I think we may also need to add the
exclude logic to DFSUtil#getOnlyNameServiceIdOrNull. We also need to add more
tests for this new feature, e.g., to cover its usage in DFSHAAdmin.
I tried a change in DFSUtil in my patch 6 (see below). I had to back it out
because it caused problems. I have had to use -ns in the admin commands and am
used to it now. My point here is that if you have a complex configuration, then
you may need to be more specific in the commands that you execute. I think it's
fair to force the user to specify the -ns argument.
{code}
+++ hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSUtil.java
@@ -527,7 +527,12 @@ public static String path2String(final Object path) {
* @return collection of nameservice Ids, or null if not specified
*/
public static Collection<String> getNameServiceIds(Configuration conf) {
- return conf.getTrimmedStringCollection(DFS_NAMESERVICES);
+ Collection<String> nameServices =
+ conf.getTrimmedStringCollection(DFSConfigKeys.DFS_NAMESERVICES);
+ Collection<String> nameServiceExcludes =
+ conf.getTrimmedStringCollection(DFSConfigKeys.DFS_NAMESERVICE_CLUSTER_EXCLUDES_KEY);
+ nameServices.removeAll(nameServiceExcludes);
+ return nameServices;
}
{code}
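For anyone following along, here is a minimal, Hadoop-free sketch of the exclude idea in the diff above. It is not the patch itself: the class and method names (NameServiceFilter, excludeNameServices, onlyNameServiceOrNull) are hypothetical, and the "only ID or null" helper assumes the lookup returns an ID only when exactly one nameservice remains. Unlike the diff, it copies the configured list before removing the excludes, so the caller's collection is left untouched.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

// Hypothetical illustration only; not part of DFSUtil or any Hadoop API.
final class NameServiceFilter {

    // Returns the configured nameservice IDs minus the excluded ones.
    // Works on a copy, so neither input collection is modified.
    static List<String> excludeNameServices(Collection<String> configured,
                                            Collection<String> excluded) {
        List<String> remaining = new ArrayList<>(configured);
        remaining.removeAll(excluded);
        return remaining;
    }

    // Assumed behaviour: return the single remaining ID, or null when
    // zero or several IDs are left (the case where -ns would be needed).
    static String onlyNameServiceOrNull(Collection<String> ids) {
        return ids.size() == 1 ? ids.iterator().next() : null;
    }

    public static void main(String[] args) {
        List<String> configured = Arrays.asList("clusterA", "clusterB");
        List<String> excluded = Arrays.asList("clusterB");

        List<String> local = excludeNameServices(configured, excluded);
        System.out.println("after excludes: " + local);
        System.out.println("only ID: " + onlyNameServiceOrNull(local));
        System.out.println("without excludes: " + onlyNameServiceOrNull(configured));
    }
}
```

With two nameservices configured and no excludes applied, the helper returns null, which is the situation where the user would have to pass -ns explicitly.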
> Distcp data between two HA clusters requires another configuration
> ------------------------------------------------------------------
>
> Key: HDFS-6376
> URL: https://issues.apache.org/jira/browse/HDFS-6376
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: datanode, federation, hdfs-client
> Affects Versions: 2.2.0, 2.3.0, 2.4.0
> Environment: Hadoop 2.3.0
> Reporter: Dave Marion
> Assignee: Dave Marion
> Fix For: 3.0.0
>
> Attachments: HDFS-6376-2.patch, HDFS-6376-3-branch-2.4.patch,
> HDFS-6376-4-branch-2.4.patch, HDFS-6376-5-trunk.patch,
> HDFS-6376-6-trunk.patch, HDFS-6376-7-trunk.patch, HDFS-6376-branch-2.4.patch,
> HDFS-6376-patch-1.patch, HDFS-6376.000.patch, HDFS-6376.008.patch
>
>
> User has to create a third set of configuration files for distcp when
> transferring data between two HA clusters.
> Consider the scenario in [1]. You cannot put all of the required properties
> in core-site.xml and hdfs-site.xml for the client to resolve the location of
> both active namenodes. If you do, then the datanodes from cluster A may join
> cluster B. I cannot find a configuration option that tells the datanodes to
> federate blocks for only one of the clusters in the configuration.
> [1]
> http://mail-archives.apache.org/mod_mbox/hadoop-user/201404.mbox/%3CBAY172-W2133964E0C283968C161DD1520%40phx.gbl%3E