[
https://issues.apache.org/jira/browse/HDFS-15919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17308352#comment-17308352
]
Xiaoqiao He commented on HDFS-15919:
------------------------------------
+ cherry-pick to branch-3.1
> BlockPoolManager should log stack trace if unable to get Namenode addresses
> ---------------------------------------------------------------------------
>
> Key: HDFS-15919
> URL: https://issues.apache.org/jira/browse/HDFS-15919
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: datanode
> Affects Versions: 3.4.0
> Reporter: Stephen O'Donnell
> Assignee: Stephen O'Donnell
> Priority: Major
> Fix For: 3.3.1, 3.4.0, 3.1.5, 3.2.3
>
> Attachments: HDFS-15919.001.patch
>
>
> If the HDFS configuration is invalid, the datanode can fail to start with
> this stack trace:
> {code}
> 2021-03-24 05:58:27,026 INFO datanode.DataNode (BlockPoolManager.java:refreshNamenodes(149)) - Refresh request received for nameservices: null
> 2021-03-24 05:58:27,033 WARN datanode.DataNode (BlockPoolManager.java:refreshNamenodes(161)) - Unable to get NameNode addresses.
> ...
> 2021-03-24 05:58:27,077 ERROR datanode.DataNode (DataNode.java:secureMain(2883)) - Exception in secureMain
> java.io.IOException: No services to connect, missing NameNode address.
> at org.apache.hadoop.hdfs.server.datanode.BlockPoolManager.refreshNamenodes(BlockPoolManager.java:165)
> at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:1440)
> at org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:500)
> at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:2782)
> at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:2690)
> at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:2732)
> at org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:2876)
> at org.apache.hadoop.hdfs.server.datanode.SecureDataNodeStarter.start(SecureDataNodeStarter.java:100)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at org.apache.commons.daemon.support.DaemonLoader.start(DaemonLoader.java:243)
> {code}
> In this case, the issue was an exception thrown in
> DFSUtil.getNNServiceRpcAddressesForCluster(...), but several scenarios
> within it can cause an exception, so it's difficult to figure out what is
> wrong with the config.
> We should simply add the exception to the existing log message when an
> error occurs so it is clear what caused it.
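
The change proposed in the description can be illustrated with a minimal, self-contained sketch. It is not the attached patch: the class RefreshLoggingSketch, the helper lookupAddresses, and its error text are hypothetical stand-ins, and java.util.logging is used only so the example runs without Hadoop on the classpath. The point is that passing the caught exception to the logger along with the existing message preserves the underlying cause and its stack trace.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical sketch; not the actual BlockPoolManager code.
class RefreshLoggingSketch {
  private static final Logger LOG = Logger.getLogger("datanode.DataNode");

  // Hypothetical stand-in for the NameNode address lookup; it always fails
  // here so that the logging path is exercised.
  static List<String> lookupAddresses(String nameservices) throws IOException {
    throw new IOException(
        "hypothetical cause: no address configured for " + nameservices);
  }

  static List<String> refresh(String nameservices) throws IOException {
    List<String> addrs = new ArrayList<>();
    try {
      addrs = lookupAddresses(nameservices);
    } catch (IOException ioe) {
      // Passing the exception as the last argument attaches its message and
      // stack trace to the log record, instead of logging only the fixed text.
      LOG.log(Level.WARNING, "Unable to get NameNode addresses.", ioe);
    }
    if (addrs.isEmpty()) {
      throw new IOException("No services to connect, missing NameNode address.");
    }
    return addrs;
  }

  public static void main(String[] args) {
    try {
      refresh(null);
    } catch (IOException e) {
      System.out.println("Startup failed: " + e.getMessage());
    }
  }
}
```

With the exception attached, the warning carries the root cause raised by the lookup, so the reader no longer has to guess which failure scenario was hit.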
--
This message was sent by Atlassian Jira
(v8.3.4#803005)