[
https://issues.apache.org/jira/browse/HDDS-2199?focusedWorklogId=321800&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-321800
]
ASF GitHub Bot logged work on HDDS-2199:
----------------------------------------
Author: ASF GitHub Bot
Created on: 02/Oct/19 10:39
Start Date: 02/Oct/19 10:39
Worklog Time Spent: 10m
Work Description: sodonnel commented on pull request #1551: HDDS-2199 In
SCMNodeManager dnsToUuidMap cannot track multiple DNs on the same host
URL: https://github.com/apache/hadoop/pull/1551#discussion_r330481300
##########
File path:
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/server/SCMBlockProtocolServer.java
##########
@@ -295,7 +297,33 @@ public ScmInfo getScmInfo() throws IOException {
boolean auditSuccess = true;
try{
NodeManager nodeManager = scm.getScmNodeManager();
- Node client = nodeManager.getNodeByAddress(clientMachine);
Review comment:
I am not certain how this sortDatanodes() call is used. Is it on the read
path or write path? I was assuming it was on the read path, but write path may
be different if all the cluster DNs are passed into the method - then you would
always get a match.
A list of DNs (UUIDs) is passed into the method, and then we retrieve a
list of DatanodeDetails running on the client machine. The client machine can
then be set to one of those DatanodeDetails, but it is not guaranteed that the
first in the list will match one of the UUIDs passed into the method.
Eg this is passed in:
DN0, DN5, DN10, DN15
On the client machine is:
DN1, DN6, DN10 and DN16
So only DN10 matches one of the DNs that were passed in. If we just picked the
first one (DN1) it would look like there is no DN on the client machine, and
when the list and client machine are then passed into sortByDistanceCost() at
line 355, it would not give the expected result.
The method receives a list of DNs, identified by UUID.
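The selection described above can be sketched roughly as follows. This is only
an illustration under stated assumptions, not the code in the pull request:
the class ClientNodeMatch and the helper pickClientNode() are hypothetical
names, and plain strings stand in for the UUIDs and DatanodeDetails objects.
The idea is to pick the first DN on the client machine that also appears in
the list passed in, rather than the first DN on the client machine.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Optional;
import java.util.Set;

// Hypothetical illustration; strings stand in for UUIDs / DatanodeDetails.
class ClientNodeMatch {

  // Returns the first DN running on the client host that is also in the
  // requested list, or an empty Optional when none of them match.
  public static Optional<String> pickClientNode(
      List<String> requested, List<String> onClientHost) {
    Set<String> requestedSet = new HashSet<>(requested);
    return onClientHost.stream()
        .filter(requestedSet::contains)
        .findFirst();
  }

  public static void main(String[] args) {
    List<String> requested = Arrays.asList("DN0", "DN5", "DN10", "DN15");
    List<String> onClient = Arrays.asList("DN1", "DN6", "DN10", "DN16");
    // Prints "DN10": the only DN on the client host that was passed in.
    System.out.println(pickClientNode(requested, onClient).orElse("none"));
  }
}
```

When no DN on the client machine is in the list, the empty result would let the
caller fall back to whatever default it already uses for a client with no
local DN.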
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
Issue Time Tracking
-------------------
Worklog Id: (was: 321800)
Time Spent: 2h 10m (was: 2h)
> In SCMNodeManager dnsToUuidMap cannot track multiple DNs on the same host
> -------------------------------------------------------------------------
>
> Key: HDDS-2199
> URL: https://issues.apache.org/jira/browse/HDDS-2199
> Project: Hadoop Distributed Data Store
> Issue Type: Bug
> Affects Versions: 0.5.0
> Reporter: Stephen O'Donnell
> Assignee: Stephen O'Donnell
> Priority: Major
> Labels: pull-request-available
> Time Spent: 2h 10m
> Remaining Estimate: 0h
>
> Often in test clusters and tests, we start multiple datanodes on the same
> host.
> In SCMNodeManager.register() there is a map of hostname -> datanode UUID
> called dnsToUuidMap.
> If several DNs register from the same host, the entry in the map will be
> overwritten and the last DN to register will 'win'.
> This means that the method getNodeByAddress() does not return the correct
> DatanodeDetails object when many datanodes are registered from the same address.
> This method is only used in SCMBlockProtocolServer.sortDatanodes() to allow
> it to see if one of the nodes matches the client, but it also needs to be used by
> the Decommission code.
> Perhaps we could change the getNodeByAddress() method to return a list of
> DNs? In normal production clusters, there should only be one returned, but in
> test clusters, there may be many. Any code looking for a specific DN entry
> would need to iterate the list and match on the port number too, as host:port
> would be the unique definition of a datanode.
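
A minimal sketch of the proposal above, assuming the map is changed to hold a
set of UUIDs per hostname so that a later registration no longer overwrites an
earlier one. The class HostToUuids and its methods are hypothetical stand-ins
for illustration, not the actual SCMNodeManager code, and strings stand in for
the UUID type.

```java
import java.util.Collections;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical illustration of a hostname -> set of datanode UUIDs map.
class HostToUuids {

  private final Map<String, Set<String>> dnsToUuids =
      new ConcurrentHashMap<>();

  // Adds the UUID to the set for this host instead of replacing the entry.
  public void register(String hostName, String uuid) {
    dnsToUuids
        .computeIfAbsent(hostName, h -> ConcurrentHashMap.newKeySet())
        .add(uuid);
  }

  // Returns every UUID registered from the host; empty if the host is unknown.
  public Set<String> getUuidsByAddress(String hostName) {
    Set<String> uuids = dnsToUuids.get(hostName);
    return uuids == null
        ? Collections.emptySet()
        : Collections.unmodifiableSet(uuids);
  }

  public static void main(String[] args) {
    HostToUuids map = new HostToUuids();
    map.register("host1", "uuid-a");
    map.register("host1", "uuid-b");
    // Both datanodes on host1 are still tracked after the second register.
    System.out.println(map.getUuidsByAddress("host1").size());
  }
}
```

A caller that needs one specific datanode would still have to iterate the
returned set and compare ports, as the description notes.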
--
This message was sent by Atlassian Jira
(v8.3.4#803005)