[ https://issues.apache.org/jira/browse/HDFS-17356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17811221#comment-17811221 ]

xiaojunxiang edited comment on HDFS-17356 at 1/26/24 1:42 PM:
--------------------------------------------------------------

[~hexiaoqiao] Yes, if the server side is configured with "dfs.nameservices=ns1,ns2" 
and the client side with "dfs.nameservices=nsRouter", it won't cause a problem.

But in Ambari, the server and the client use the same configuration. Without 
resolving this conflict, we would have to change the deployment pattern of every 
machine in the cluster, which would be too costly for the customer company to 
accept.

So we want to uniformly configure "dfs.nameservices=ns1,ns2,nsRouter" and 
configure "dfs.federation.router.ns.name=nsRouter" to tell the server: "ignore 
the nsRouter nameservice".

In this way, the configuration of the server and the client can be unified, and 
the conflicts on the server can be resolved.
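For example, a unified hdfs-site.xml under this proposal could look like the 
sketch below. Note that dfs.federation.router.ns.name is the key being proposed 
here; it does not exist in current releases:

{code:xml}
<property>
  <name>dfs.nameservices</name>
  <value>ns1,ns2,nsRouter</value>
</property>
<property>
  <name>dfs.federation.router.ns.name</name>
  <value>nsRouter</value>
</property>
{code}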



> RBF: Add Configuration dfs.federation.router.ns.name Optimization
> -----------------------------------------------------------------
>
>                 Key: HDFS-17356
>                 URL: https://issues.apache.org/jira/browse/HDFS-17356
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: dfs, rbf
>            Reporter: wangzhihui
>            Priority: Minor
>         Attachments: screenshot-1.png, screenshot-2.png
>
>
>     When RBF federation is enabled in HDFS, and the HDFS server (NameNode, 
> ZKFC) and the RBF client share the same configuration on the same node, the 
> following exception occurs and the NameNode fails to start. The reason is that 
> the NS of the Router service has been added to the dfs.nameservices list. When 
> the NameNode starts, it tries to determine which NS the current node belongs 
> to, but it finds multiple matching NS entries, fails the verification in the 
> existing logic, and ultimately fails to start. Currently, we can only work 
> around this by isolating the hdfs-site.xml of the RouterClient and the 
> NameNode, but splitting the configuration is not conducive to our unified 
> management of cluster configuration. Therefore, we propose a new solution to 
> solve this problem better.
> {code}
> 2023-10-30 15:53:24,613 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: 
> registered UNIX signal handlers for [TERM, HUP, INT]
> 2023-10-30 15:53:24,672 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: 
> createNameNode []
> 2023-10-30 15:53:24,760 INFO org.apache.hadoop.metrics2.impl.MetricsConfig: 
> Loaded properties from hadoop-metrics2.properties
> 2023-10-30 15:53:24,842 INFO 
> org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled Metric snapshot 
> period at 10 second(s).
> 2023-10-30 15:53:24,842 INFO 
> org.apache.hadoop.metrics2.impl.MetricsSystemImpl: NameNode metrics system 
> started
> 2023-10-30 15:53:24,868 ERROR 
> org.apache.hadoop.hdfs.server.namenode.NameNode: Failed to start namenode.
> org.apache.hadoop.HadoopIllegalArgumentException: Configuration has multiple 
> addresses that match local node's address. Please configure the system with 
> dfs.nameservice.id and dfs.ha.namenode.id
>         at org.apache.hadoop.hdfs.DFSUtil.getSuffixIDs(DFSUtil.java:1257)
>         at org.apache.hadoop.hdfs.DFSUtil.getNameServiceId(DFSUtil.java:1158)
>         at 
> org.apache.hadoop.hdfs.DFSUtil.getNamenodeNameServiceId(DFSUtil.java:1113)
>         at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.getNameServiceId(NameNode.java:1822)
>         at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:1005)
>         at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:995)
>         at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1769)
>         at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1834)
> 2023-10-30 15:53:24,870 INFO org.apache.hadoop.util.ExitUtil: Exiting with 
> status 1: org.apache.hadoop.HadoopIllegalArgumentException: Configuration has 
> multiple addresses that match local node's address. Please configure the 
> system with dfs.nameservice.id and dfs.ha.namenode.id
> 2023-10-30 15:53:24,874 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: 
> SHUTDOWN_MSG: {code}
>  
> hdfs-site.xml
> {code:xml}
> <property>
>   <name>dfs.nameservices</name>
>   <value>mycluster1,mycluster2,ns-fed</value>
> </property>
> <property>
>   <name>dfs.ha.namenodes.ns-fed</name>
>   <value>r1</value>
> </property>
> <property>
>   <name>dfs.namenode.rpc-address.ns-fed.r1</name>
>   <value>node1.com:8888</value>
> </property>
> <property>
>   <name>dfs.ha.namenodes.mycluster1</name>
>   <value>nn1,nn2</value>
> </property>
> <property>
>   <name>dfs.namenode.http-address.mycluster1.nn1</name>
>   <value>node1.com:50070</value>
> </property>
> <property>
>   <name>dfs.namenode.http-address.mycluster1.nn2</name>
>   <value>node2.com:50070</value>
> </property>
> <property>
>   <name>dfs.ha.namenodes.mycluster2</name>
>   <value>nn1,nn2</value>
> </property>
> <property>
>   <name>dfs.namenode.http-address.mycluster2.nn1</name>
>   <value>node3.com:50070</value>
> </property>
> <property>
>   <name>dfs.namenode.http-address.mycluster2.nn2</name>
>   <value>node4.com:50070</value>
> </property>
> <property>
>   <name>dfs.client.failover.proxy.provider.ns-fed</name>
>   
> <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
> </property>
> <property>
>   <name>dfs.client.failover.random.order</name>
>   <value>true</value>
> </property> {code}
>  
> Solution
> Add a dfs.federation.router.ns.name configuration in hdfs-site.xml to mark the 
> Router NS name, and filter out the Router NS during NameNode or ZKFC startup 
> to avoid this issue.
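The filtering step of the proposed solution could be sketched as follows. This 
is an illustrative standalone snippet, not the actual HDFS code path; the class 
and method names are hypothetical, and dfs.federation.router.ns.name is the 
proposed key:

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.LinkedHashSet;

// Sketch: before matching local addresses against nameservices, NameNode/ZKFC
// would drop the NS named by the proposed key dfs.federation.router.ns.name
// from the dfs.nameservices list, so the Router NS never triggers the
// "multiple addresses match local node's address" check.
public class RouterNsFilter {

  /** Returns the dfs.nameservices entries minus the configured Router NS. */
  static Collection<String> filterRouterNs(String nameservices, String routerNs) {
    Collection<String> result =
        new LinkedHashSet<>(Arrays.asList(nameservices.split(",")));
    if (routerNs != null) {
      result.remove(routerNs.trim());
    }
    return result;
  }

  public static void main(String[] args) {
    // With dfs.nameservices=mycluster1,mycluster2,ns-fed and
    // dfs.federation.router.ns.name=ns-fed, only the NameNode
    // nameservices remain for local-address matching.
    System.out.println(filterRouterNs("mycluster1,mycluster2,ns-fed", "ns-fed"));
    // prints "[mycluster1, mycluster2]"
  }
}
```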



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
