[ https://issues.apache.org/jira/browse/HDDS-2199?focusedWorklogId=322775&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-322775 ]
ASF GitHub Bot logged work on HDDS-2199:
----------------------------------------
Author: ASF GitHub Bot
Created on: 03/Oct/19 17:35
Start Date: 03/Oct/19 17:35
Worklog Time Spent: 10m
Work Description: sodonnel commented on pull request #1551: HDDS-2199 In
SCMNodeManager dnsToUuidMap cannot track multiple DNs on the same host
URL: https://github.com/apache/hadoop/pull/1551#discussion_r331163439
##########
File path:
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/server/SCMBlockProtocolServer.java
##########
@@ -295,7 +297,33 @@ public ScmInfo getScmInfo() throws IOException {
boolean auditSuccess = true;
try{
NodeManager nodeManager = scm.getScmNodeManager();
- Node client = nodeManager.getNodeByAddress(clientMachine);
Review comment:
Reflecting on this issue some more, I think the simplified logic you have
suggested is better, and the problem is better solved in getDistanceByCost:
rather than only checking whether the two node objects are the same, we should
also test whether they have the same hostname and, if so, treat that as a
zero-distance match too. Unfortunately, as that method takes Node objects
rather than DatanodeDetails, this is not trivial to do.
The code path in question here is only relevant for clusters with more
than one datanode on the same host, and by definition that is a non-production
setup. The only consequence of your suggested change, compared with my original
code, is that the client may sometimes get the wrong 'cost to reach a datanode'
on test clusters. Nothing will fail, so the impact of this issue is very low.
Therefore, if you are happy, I think we should commit the latest version
(which has your simplified logic) and create a follow-up Jira to look into
fixing getDistanceByCost.
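The hostname-based zero-distance check described in the comment above could
look roughly like the sketch below. It is not the real getDistanceByCost:
HostNode, HostAwareDistance and the topologyCost callback are hypothetical
names, and the sketch assumes a hostname is available on each node, which is
exactly what the real Node objects do not offer today.

```java
import java.util.function.ToIntBiFunction;

// Hypothetical, simplified node type; NOT the real Node interface.
// It assumes each node can report the hostname it runs on.
final class HostNode {
  final String hostName;
  final String networkLocation;

  HostNode(String hostName, String networkLocation) {
    this.hostName = hostName;
    this.networkLocation = networkLocation;
  }
}

// Hypothetical wrapper around an existing topology cost calculation.
final class HostAwareDistance {
  private HostAwareDistance() {
  }

  // Returns 0 when both arguments are the same object or share a hostname;
  // otherwise delegates to the supplied topology cost function.
  static int distance(HostNode client, HostNode datanode,
      ToIntBiFunction<HostNode, HostNode> topologyCost) {
    if (client == datanode
        || client.hostName.equalsIgnoreCase(datanode.hostName)) {
      return 0;
    }
    return topologyCost.applyAsInt(client, datanode);
  }
}
```

With this shape, two distinct node objects registered from the same host would
be treated as local to each other, while all other pairs keep whatever cost the
existing topology logic assigns.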
Issue Time Tracking
-------------------
Worklog Id: (was: 322775)
Time Spent: 4h 20m (was: 4h 10m)
> In SCMNodeManager dnsToUuidMap cannot track multiple DNs on the same host
> -------------------------------------------------------------------------
>
> Key: HDDS-2199
> URL: https://issues.apache.org/jira/browse/HDDS-2199
> Project: Hadoop Distributed Data Store
> Issue Type: Bug
> Affects Versions: 0.5.0
> Reporter: Stephen O'Donnell
> Assignee: Stephen O'Donnell
> Priority: Major
> Labels: pull-request-available
> Time Spent: 4h 20m
> Remaining Estimate: 0h
>
> Often in test clusters and tests, we start multiple datanodes on the same
> host.
> In SCMNodeManager.register() there is a map of hostname -> datanode UUID
> called dnsToUuidMap.
> If several DNs register from the same host, the entry in the map is
> overwritten and the last DN to register 'wins'.
> This means that the method getNodeByAddress() does not return the correct
> DatanodeDetails object when many datanodes are registered from the same
> address.
> This method is currently only used in SCMBlockProtocolServer.sortDatanodes(),
> to check whether one of the nodes matches the client, but it will also need
> to be used by the Decommission code.
> Perhaps we could change the getNodeByAddress() method to return a list of
> DNs? In normal production clusters, only one should be returned, but in
> test clusters, there may be many. Any code looking for a specific DN entry
> would need to iterate over the list and match on the port number too, as
> host:port would be the unique identifier of a datanode.
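
The proposal in the description above (one hostname mapping to several
datanodes, with host:port as the unique key) can be sketched as follows. This
is an illustration only, not the actual SCMNodeManager change: DnEntry,
HostToDatanodes, getNodesByAddress and getNode are hypothetical names, and
DnEntry stands in for DatanodeDetails with only the fields needed here.

```java
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical stand-in for DatanodeDetails.
final class DnEntry {
  final String uuid;
  final String hostName;
  final int port;

  DnEntry(String uuid, String hostName, int port) {
    this.uuid = uuid;
    this.hostName = hostName;
    this.port = port;
  }
}

// Hypothetical registry: each hostname maps to every DN registered from it.
final class HostToDatanodes {
  private final Map<String, List<DnEntry>> byHost = new ConcurrentHashMap<>();

  // Append instead of overwrite, so earlier registrations on the host survive.
  void register(DnEntry dn) {
    byHost.computeIfAbsent(dn.hostName, h -> new CopyOnWriteArrayList<>())
        .add(dn);
  }

  // All DNs registered from the host; an empty list if there are none.
  List<DnEntry> getNodesByAddress(String hostName) {
    List<DnEntry> nodes = byHost.get(hostName);
    if (nodes == null) {
      return Collections.<DnEntry>emptyList();
    }
    return Collections.unmodifiableList(nodes);
  }

  // host:port identifies a single DN, so match on the port as well.
  Optional<DnEntry> getNode(String hostName, int port) {
    return getNodesByAddress(hostName).stream()
        .filter(dn -> dn.port == port)
        .findFirst();
  }
}
```

On a production cluster getNodesByAddress would be expected to return a single
entry; on a test cluster with several DNs per host, callers that need one
specific DN would use the host and port together.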