[ 
https://issues.apache.org/jira/browse/HDFS-6376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14108093#comment-14108093
 ] 

Jing Zhao commented on HDFS-6376:
---------------------------------

Comments on the new 000 patch:
# In the following code, when {{parentNameServices}} is initially empty, it 
falls back to the value of {{DFS_NAMESERVICES}}, so there is no need to compare 
{{parentNameServices}} against {{availableNameServices}} in that case.
{code}
+    Collection<String> parentNameServices = conf.getTrimmedStringCollection
+            (DFSConfigKeys.DFS_INTERNAL_NAMESERVICES_KEY);
+
+    if (parentNameServices.isEmpty()) {
+      parentNameServices = conf.getTrimmedStringCollection
+              (DFSConfigKeys.DFS_NAMESERVICES);
+    }
+
+    Set<String> availableNameServices = Sets.newHashSet(conf
+            .getTrimmedStringCollection(DFSConfigKeys.DFS_NAMESERVICES));
+    for (String nsId : parentNameServices) {
+      if (!availableNameServices.contains(nsId)) {
+        throw new IOException("Unknown nameservice: " + nsId);
+      }
+    }
{code}
# Please briefly describe your system test results (from applying the 000 
patch and running distcp between two HA clusters).
# Some of the code needs its indentation fixed.
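
For the first comment, a minimal sketch of the suggested control flow, assuming a plain {{Map}} in place of Hadoop's {{Configuration}} so that it runs standalone. The class name, the {{trimmed}} helper, and the two key strings are assumptions made for illustration and are not taken from the patch. The validation loop runs only when the internal nameservices property is explicitly set.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for the patch logic; a Map replaces Hadoop's Configuration.
class InternalNameServices {
  // Assumed key strings standing in for DFS_NAMESERVICES and
  // DFS_INTERNAL_NAMESERVICES_KEY.
  static final String NAMESERVICES = "dfs.nameservices";
  static final String INTERNAL_NAMESERVICES = "dfs.internal.nameservices";

  // Roughly mimics getTrimmedStringCollection: split on commas, trim,
  // and skip empty entries.
  static List<String> trimmed(Map<String, String> conf, String key) {
    List<String> out = new ArrayList<>();
    String raw = conf.get(key);
    if (raw == null) {
      return out;
    }
    for (String s : raw.split(",")) {
      String t = s.trim();
      if (!t.isEmpty()) {
        out.add(t);
      }
    }
    return out;
  }

  static Collection<String> resolve(Map<String, String> conf)
      throws IOException {
    List<String> available = trimmed(conf, NAMESERVICES);
    List<String> internal = trimmed(conf, INTERNAL_NAMESERVICES);
    if (internal.isEmpty()) {
      // Fallback case: the result is the available list itself,
      // so comparing it against itself would be redundant.
      return available;
    }
    // Only an explicitly configured list needs to be validated.
    Set<String> availableSet = new HashSet<>(available);
    for (String nsId : internal) {
      if (!availableSet.contains(nsId)) {
        throw new IOException("Unknown nameservice: " + nsId);
      }
    }
    return internal;
  }
}
```

With this shape, the behavior when the new property is absent is unchanged from the patch. Only the redundant comparison is skipped.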

bq. I think it might make more sense to explicitly specify the name service 
that the DNs should report to. 
I do not have a strong preference between the "include" and the "exclude" logic 
here. Actually, the original exclude logic looks even simpler to me, since it 
does not need to handle the compatibility issue (i.e., no extra handling is 
needed when the new configuration property is not specified).

> Distcp data between two HA clusters requires another configuration
> ------------------------------------------------------------------
>
>                 Key: HDFS-6376
>                 URL: https://issues.apache.org/jira/browse/HDFS-6376
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: datanode, federation, hdfs-client
>    Affects Versions: 2.2.0, 2.3.0, 2.4.0
>         Environment: Hadoop 2.3.0
>            Reporter: Dave Marion
>            Assignee: Dave Marion
>             Fix For: 3.0.0
>
>         Attachments: HDFS-6376-2.patch, HDFS-6376-3-branch-2.4.patch, 
> HDFS-6376-4-branch-2.4.patch, HDFS-6376-5-trunk.patch, 
> HDFS-6376-6-trunk.patch, HDFS-6376-7-trunk.patch, HDFS-6376-branch-2.4.patch, 
> HDFS-6376-patch-1.patch, HDFS-6376.000.patch, HDFS-6376.008.patch
>
>
> User has to create a third set of configuration files for distcp when 
> transferring data between two HA clusters.
> Consider the scenario in [1]. You cannot put all of the required properties 
> in core-site.xml and hdfs-site.xml for the client to resolve the location of 
> both active namenodes. If you do, then the datanodes from cluster A may join 
> cluster B. I cannot find a configuration option that tells the datanodes to 
> federate blocks for only one of the clusters in the configuration.
> [1] 
> http://mail-archives.apache.org/mod_mbox/hadoop-user/201404.mbox/%3CBAY172-W2133964E0C283968C161DD1520%40phx.gbl%3E



--
This message was sent by Atlassian JIRA
(v6.2#6252)
