[ https://issues.apache.org/jira/browse/HDFS-5208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13770910#comment-13770910 ]
Junping Du commented on HDFS-5208:
----------------------------------
Hi Colin, I think you are right that the DatanodeID is created in the DN
heartbeat to the NN for registration, and that its hostName comes from the
"dfs.datanode.hostname" config, which can be in any style (it falls back to
the DNS name if this config is not set).
However, the following code in resolveNetworkLocation(), called by
DatanodeManager.registerDatanode(), means that only IPs are cached through DN
registration, doesn't it?
{code}
if (dnsToSwitchMapping instanceof CachedDNSToSwitchMapping) {
  names.add(node.getIpAddr());
} else {
  names.add(node.getHostName());
}
{code}
Actually, I am now worried about the non-cached case. Even if the topology
script can resolve the user-specified "hostName" to the correct network
location (rack) and use it to register the node in the NetworkTopology tree,
the topology later still needs to be resolved based on the node's IP (as in
DatanodeManager.sortLocatedBlocks()), which means the script must contain both
the user-specified "hostName" and the IP for each node. IMO, this is really
unnecessary and confusing. Thoughts?
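
To make the non-cached concern concrete, here is a minimal, self-contained
sketch. It is not the actual HDFS code: the class TopologyLookupSketch, the
resolve() helper, the sample host name "dn1.custom" and the IP are all
hypothetical, and the topology script is modelled as a plain name-to-rack
table that falls back to a default rack for unknown names.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; none of these names exist in HDFS.
class TopologyLookupSketch {
    // Assumed fallback rack for names the "script" does not know.
    static final String DEFAULT_RACK = "/default-rack";

    // Stand-in for a topology script: a fixed name -> rack table.
    static String resolve(Map<String, String> scriptTable, String name) {
        return scriptTable.getOrDefault(name, DEFAULT_RACK);
    }

    // Mirrors the quoted branch: IP for a caching mapping, host name otherwise.
    static String nameUsedAtRegistration(boolean cachedMapping,
                                         String ipAddr, String hostName) {
        return cachedMapping ? ipAddr : hostName;
    }

    public static void main(String[] args) {
        // The script only knows the user-specified host name, not the IP.
        Map<String, String> script = new HashMap<>();
        script.put("dn1.custom", "/rack1");

        String ip = "10.0.0.1";
        String host = "dn1.custom";

        // Non-cached mapping: registration resolves by host name.
        String atRegistration =
            resolve(script, nameUsedAtRegistration(false, ip, host));
        // A later lookup by IP, as described for sortLocatedBlocks().
        String atLaterLookup = resolve(script, ip);

        System.out.println("registration: " + atRegistration);
        System.out.println("later lookup: " + atLaterLookup);
    }
}
```

With only the host name in the table, the two lookups disagree; they agree
only once the table also has an entry for the IP, which is the duplication
described above.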
> Only clear network location cache on specific nodes if invalid
> NetworkTopology happens
> --------------------------------------------------------------------------------------
>
> Key: HDFS-5208
> URL: https://issues.apache.org/jira/browse/HDFS-5208
> Project: Hadoop HDFS
> Issue Type: Improvement
> Reporter: Junping Du
> Assignee: Junping Du
> Attachments: HDFS-5208-v1.patch
>
>
> After HDFS-4521, once a DN is registered with invalid networktopology, all
> cached rack info in DNSToSwitchMapping will be cleared. We should only clear
> cache on specific nodes.