[
https://issues.apache.org/jira/browse/HDFS-17356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17811892#comment-17811892
]
xiaojunxiang edited comment on HDFS-17356 at 1/31/24 2:56 AM:
--------------------------------------------------------------
[~tasanuma], [~hiwangzhihui], [~hexiaoqiao], thanks for your advice. I have
found three solutions to this problem, and I will summarize them here.
1. Option 1: Configure dfs.nameservice.id=<current ns> or
dfs.ha.namenode.id=<current nn>
- advantage: No new development is needed; the Router, NameNode, and HDFS
client can be deployed on the same node
- disadvantage: Different nodes need different dfs.nameservice.id
configurations
2. Option 2: Develop the new configuration "dfs.federation.router.ns.name" as
suggested by this jira.
- advantage: The Router, NameNode, and HDFS client can be deployed on the
same node, and different nodes can use the same configuration
- disadvantage: Requires new development
3. Option 3: Constrain the deployment pattern so that the Router and NameNode
are deployed on different nodes.
- advantage: No new development is needed, and different nodes can use the
same configuration
- disadvantage: The Router and NameNode must be deployed on different nodes
!image-2024-01-31-10-56-23-399.png!
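For context, the ambiguity that all three options work around can be sketched
as follows. This is an illustrative simplification, not the actual
DFSUtil.getSuffixIDs implementation; the class and method names below are
hypothetical:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only -- not the real DFSUtil code. It mimics how
// NameNode startup resolves its nameservice: every configured
// dfs.namenode.*-address.<ns>.<nn> entry whose host equals the local
// hostname is a candidate; more than one candidate is ambiguous and
// triggers the HadoopIllegalArgumentException shown in this issue.
public class SuffixIdSketch {

    // Map key is "<ns>.<nn>", value is "host:port", as in hdfs-site.xml.
    static List<String> matchingNameservices(Map<String, String> addresses,
                                             String localHost) {
        List<String> matches = new ArrayList<>();
        for (Map.Entry<String, String> e : addresses.entrySet()) {
            String host = e.getValue().split(":")[0];
            if (host.equals(localHost)) {
                matches.add(e.getKey().split("\\.")[0]); // keep the NS part
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        Map<String, String> addresses = new LinkedHashMap<>();
        addresses.put("mycluster1.nn1", "node1.com:50070"); // NameNode on node1
        addresses.put("mycluster1.nn2", "node2.com:50070");
        addresses.put("ns-fed.r1", "node1.com:8888");       // Router also on node1
        // node1.com matches both mycluster1 and ns-fed -> ambiguous,
        // unless dfs.nameservice.id (Option 1) pins one of them or the
        // Router NS is filtered out (Option 2).
        System.out.println(matchingNameservices(addresses, "node1.com"));
        // prints [mycluster1, ns-fed]
    }
}
```

Option 1 resolves the ambiguity per node; Option 2 removes the Router NS
from the candidate set for every node.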
> RBF: Add Configuration dfs.federation.router.ns.name Optimization
> -----------------------------------------------------------------
>
> Key: HDFS-17356
> URL: https://issues.apache.org/jira/browse/HDFS-17356
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: dfs, rbf
> Reporter: wangzhihui
> Priority: Minor
> Attachments: image-2024-01-29-18-04-55-391.png,
> image-2024-01-29-22-09-43-263.png, image-2024-01-31-10-56-23-399.png,
> screenshot-1.png, screenshot-2.png, screenshot-3.png, screenshot-4.png,
> screenshot-5.png
>
>
> When RBF federation is enabled in HDFS, if the HDFS server (NameNode, ZKFC)
> and the RBF client share the same configuration and run on the same node,
> the following exception occurs and the NameNode fails to start. The reason
> is that the NS of the Router service has been added to the dfs.nameservices
> list. When the NameNode starts, it tries to determine which NS the current
> node belongs to, but it finds multiple matching NS entries, fails the
> existing validation logic, and ultimately fails to start. Currently, we can
> only work around this problem by keeping separate hdfs-site.xml files for
> the Router client and the NameNode, but splitting the configuration is not
> conducive to unified management of cluster configuration. Therefore, we
> propose a new solution to solve this problem better.
> {code}
> 2023-10-30 15:53:24,613 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: registered UNIX signal handlers for [TERM, HUP, INT]
> 2023-10-30 15:53:24,672 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: createNameNode []
> 2023-10-30 15:53:24,760 INFO org.apache.hadoop.metrics2.impl.MetricsConfig: Loaded properties from hadoop-metrics2.properties
> 2023-10-30 15:53:24,842 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled Metric snapshot period at 10 second(s).
> 2023-10-30 15:53:24,842 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: NameNode metrics system started
> 2023-10-30 15:53:24,868 ERROR org.apache.hadoop.hdfs.server.namenode.NameNode: Failed to start namenode.
> org.apache.hadoop.HadoopIllegalArgumentException: Configuration has multiple addresses that match local node's address. Please configure the system with dfs.nameservice.id and dfs.ha.namenode.id
>     at org.apache.hadoop.hdfs.DFSUtil.getSuffixIDs(DFSUtil.java:1257)
>     at org.apache.hadoop.hdfs.DFSUtil.getNameServiceId(DFSUtil.java:1158)
>     at org.apache.hadoop.hdfs.DFSUtil.getNamenodeNameServiceId(DFSUtil.java:1113)
>     at org.apache.hadoop.hdfs.server.namenode.NameNode.getNameServiceId(NameNode.java:1822)
>     at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:1005)
>     at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:995)
>     at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1769)
>     at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1834)
> 2023-10-30 15:53:24,870 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 1: org.apache.hadoop.HadoopIllegalArgumentException: Configuration has multiple addresses that match local node's address. Please configure the system with dfs.nameservice.id and dfs.ha.namenode.id
> 2023-10-30 15:53:24,874 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: SHUTDOWN_MSG: {code}
>
> hdfs-site.xml
> {code:xml}
> <property>
>   <name>dfs.nameservices</name>
>   <value>mycluster1,mycluster2,ns-fed</value>
> </property>
> <property>
>   <name>dfs.ha.namenodes.ns-fed</name>
>   <value>r1</value>
> </property>
> <property>
>   <name>dfs.namenode.rpc-address.ns-fed.r1</name>
>   <value>node1.com:8888</value>
> </property>
> <property>
>   <name>dfs.ha.namenodes.mycluster1</name>
>   <value>nn1,nn2</value>
> </property>
> <property>
>   <name>dfs.namenode.http-address.mycluster1.nn1</name>
>   <value>node1.com:50070</value>
> </property>
> <property>
>   <name>dfs.namenode.http-address.mycluster1.nn2</name>
>   <value>node2.com:50070</value>
> </property>
> <property>
>   <name>dfs.ha.namenodes.mycluster2</name>
>   <value>nn1,nn2</value>
> </property>
> <property>
>   <name>dfs.namenode.http-address.mycluster2.nn1</name>
>   <value>node3.com:50070</value>
> </property>
> <property>
>   <name>dfs.namenode.http-address.mycluster2.nn2</name>
>   <value>node4.com:50070</value>
> </property>
> <property>
>   <name>dfs.client.failover.proxy.provider.ns-fed</name>
>   <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
> </property>
> <property>
>   <name>dfs.client.failover.random.order</name>
>   <value>true</value>
> </property> {code}
>
> Solution:
> Add a dfs.federation.router.ns.name configuration in hdfs-site.xml to mark
> the NS name of the Router, and filter out the Router NS during NameNode or
> ZKFC startup to avoid this issue.
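> With the proposed key, a shared hdfs-site.xml could look like this (a sketch
> of the suggested usage only, assuming the new key simply names the Router NS
> so that NameNode and ZKFC skip it when resolving their own nameservice):
> {code:xml}
> <property>
>   <name>dfs.nameservices</name>
>   <value>mycluster1,mycluster2,ns-fed</value>
> </property>
> <property>
>   <!-- proposed new key: marks ns-fed as the Router NS, so NameNode and
>        ZKFC exclude it from the candidate nameservices at startup -->
>   <name>dfs.federation.router.ns.name</name>
>   <value>ns-fed</value>
> </property>
> {code}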
--
This message was sent by Atlassian Jira
(v8.20.10#820010)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]