[
https://issues.apache.org/jira/browse/HDFS-7433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14285750#comment-14285750
]
Kihwal Lee commented on HDFS-7433:
----------------------------------
The test failure needs further investigation. It failed after the second
registration of {{someStorageID4762}}, which used a different version. Up to
iteration 391 the version count was fine, so it looks like either the second
registration did not update the version count correctly, or different threads
are seeing a stale value.
First registration using version1.
{noformat}
2015-01-20 17:00:55,048 INFO blockmanagement.TestDatanodeManager (TestDatanodeManager.java:testNumVersionsReportedCorrect(121)) - Registering node storageID: someStorageID4762, version: version1, IP address: someIPsomeStorageID4762:9000
2015-01-20 17:00:55,048 INFO hdfs.StateChange (DatanodeManager.java:registerDatanode(924)) - BLOCK* registerDatanode: from Mock for DatanodeRegistration, hashCode: 1098128995 storage someStorageID4762
2015-01-20 17:00:55,050 INFO blockmanagement.DatanodeDescriptor (DatanodeDescriptor.java:updateHeartbeatState(386)) - Number of failed storage changes from 0 to 0
2015-01-20 17:00:55,312 INFO net.NetworkTopology (NetworkTopology.java:add(418)) - Adding a new node: /default-rack/someIPsomeStorageID4762:9000
{noformat}
Second registration using version2, followed by the failure "Still in map:
version1 has 1".
{noformat}
2015-01-20 17:11:45,444 INFO blockmanagement.TestDatanodeManager (TestDatanodeManager.java:testNumVersionsReportedCorrect(121)) - Registering node storageID: someStorageID4762, version: version2, IP address: someIPsomeStorageID4762:9000
2015-01-20 17:11:45,444 INFO hdfs.StateChange (DatanodeManager.java:registerDatanode(924)) - BLOCK* registerDatanode: from Mock for DatanodeRegistration, hashCode: 1098128995 storage someStorageID4762
2015-01-20 17:11:45,444 INFO net.NetworkTopology (NetworkTopology.java:remove(487)) - Removing a node: /default-rack/someIPsomeStorageID4762:9000
2015-01-20 17:11:50,742 INFO net.NetworkTopology (NetworkTopology.java:add(418)) - Adding a new node: /default-rack/someIPsomeStorageID4762:9000
2015-01-20 17:11:50,759 INFO blockmanagement.TestDatanodeManager (TestDatanodeManager.java:testNumVersionsReportedCorrect(147)) - Still in map: version1 has 1
{noformat}
I used the same random seed of 209693745, but could not reproduce the issue. It
must be a rare race.
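To make the two possibilities concrete, here is a minimal sketch of the
bookkeeping a re-registration with a new version has to get right. It is not
the actual {{DatanodeManager}} code: the class {{VersionCountSketch}}, its two
maps, and the single lock are all assumptions made for illustration. The point
is only that the old version's count must be decremented and the new one
incremented as one step, and that readers must observe the result of that step.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for the version bookkeeping; not the real DatanodeManager. */
public class VersionCountSketch {
    // storage ID -> software version last registered for that node
    private final Map<String, String> versionByNode = new HashMap<>();
    // software version -> number of registered nodes reporting it
    private final Map<String, Integer> countByVersion = new HashMap<>();

    /** Registers or re-registers a node, updating both maps under one lock. */
    public synchronized void register(String storageId, String version) {
        String previous = versionByNode.put(storageId, version);
        if (previous != null) {
            // Re-registration: decrement the old version, dropping the entry at zero.
            countByVersion.computeIfPresent(previous, (v, n) -> n > 1 ? n - 1 : null);
        }
        countByVersion.merge(version, 1, Integer::sum);
    }

    /** Returns a copy taken under the same lock, so no half-applied update is visible. */
    public synchronized Map<String, Integer> snapshot() {
        return new HashMap<>(countByVersion);
    }

    public static void main(String[] args) {
        VersionCountSketch sketch = new VersionCountSketch();
        sketch.register("someStorageID4762", "version1");
        sketch.register("someStorageID4762", "version2");
        // version1 should no longer appear once the node has moved to version2.
        System.out.println(sketch.snapshot());
    }
}
```

If the decrement and increment were not done atomically, or if the reader did
not synchronize with the writer, a check like the one at line 147 of the test
could plausibly still see version1 with a count of 1. Whether that is what
happened here is not established.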
> DatanodeManager#datanodeMap should be a HashMap, not a TreeMap, to optimize
> lookup performance
> ----------------------------------------------------------------------------------------------
>
> Key: HDFS-7433
> URL: https://issues.apache.org/jira/browse/HDFS-7433
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: namenode
> Affects Versions: 2.0.0-alpha, 3.0.0
> Reporter: Daryn Sharp
> Assignee: Daryn Sharp
> Priority: Critical
> Attachments: HDFS-7433.patch, HDFS-7433.patch, HDFS-7433.patch
>
>
> The datanode map is currently a {{TreeMap}}. For many thousands of
> datanodes, tree lookups are ~10X more expensive than a {{HashMap}}.
> Insertions and removals are up to 100X more expensive.
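For reference, a minimal sketch of what the proposed change amounts to at the
{{java.util.Map}} level. The class name {{DatanodeMapSketch}}, the {{fill}}
helper, and the key and value strings are hypothetical and are not taken from
the patch. {{TreeMap}} lookups cost O(log n) key comparisons, while {{HashMap}}
lookups are expected O(1); the sketch does not measure the ratios quoted above,
it only shows that both maps return the same answers.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical illustration; not the DatanodeManager code or the attached patch. */
public class DatanodeMapSketch {
    /** Populates the given map with made-up storage IDs and addresses. */
    public static Map<String, String> fill(Map<String, String> map, int nodes) {
        for (int i = 0; i < nodes; i++) {
            map.put("someStorageID" + i, "someIP" + i + ":9000");
        }
        return map;
    }

    public static void main(String[] args) {
        Map<String, String> tree = fill(new TreeMap<>(), 5000);
        Map<String, String> hash = fill(new HashMap<>(), 5000);

        // Lookups by key return the same value from either implementation.
        String key = "someStorageID4762";
        System.out.println(tree.get(key).equals(hash.get(key)));

        // Map equality ignores iteration order, so the contents compare equal.
        System.out.println(tree.equals(hash));
    }
}
```

The one behavioral difference to keep in mind is iteration order: a
{{TreeMap}} iterates in sorted key order and a {{HashMap}} does not guarantee
any order, so the swap is only safe if no caller depends on sorted iteration.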