[ 
https://issues.apache.org/jira/browse/HDFS-7433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14285750#comment-14285750
 ] 

Kihwal Lee commented on HDFS-7433:
----------------------------------

The test failure needs further investigation. It occurred after the second 
registration of {{someStorageID4762}}, which used a different version. Up to 
iteration 391 the version count was correct, so it looks like either the second 
registration did not update the version count correctly or different threads 
are seeing a stale value.
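
For reference, the bookkeeping the test checks can be sketched as follows. This is a simplified, hypothetical stand-in and not the actual {{DatanodeManager}} code: the class {{VersionCountSketch}} and its two maps are assumptions made for illustration. When a node re-registers with a different version, the old version's count has to drop and the new one has to rise, and both updates must be visible to every thread.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical, simplified per-version node counting; NOT the real DatanodeManager. */
public class VersionCountSketch {
    // storage ID -> software version most recently registered for that node
    private final Map<String, String> versionByStorageId = new HashMap<>();
    // software version -> number of registered nodes currently reporting it
    private final Map<String, Integer> countByVersion = new HashMap<>();

    /** Registers or re-registers a node; both maps are updated under one lock. */
    public synchronized void register(String storageId, String version) {
        String previous = versionByStorageId.put(storageId, version);
        if (previous != null) {
            // Re-registration: the old version loses one node.
            // Returning null from the remapping function removes the entry.
            countByVersion.computeIfPresent(previous, (v, n) -> n > 1 ? n - 1 : null);
        }
        countByVersion.merge(version, 1, Integer::sum);
    }

    /** Returns how many nodes currently report the given version. */
    public synchronized int count(String version) {
        return countByVersion.getOrDefault(version, 0);
    }

    public static void main(String[] args) {
        VersionCountSketch sketch = new VersionCountSketch();
        sketch.register("someStorageID4762", "version1");
        sketch.register("someStorageID4762", "version2");
        System.out.println("version1 has " + sketch.count("version1")); // version1 has 0
        System.out.println("version2 has " + sketch.count("version2")); // version2 has 1
    }
}
```

In this sketch, "version1 has 1" after the second registration would mean either that the decrement was skipped or that a reader saw the map without proper synchronization, which corresponds to the two possibilities above.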

First registration using version1.

{noformat}
2015-01-20 17:00:55,048 INFO  blockmanagement.TestDatanodeManager (TestDatanodeManager.java:testNumVersionsReportedCorrect(121)) - Registering node storageID: someStorageID4762, version: version1, IP address: someIPsomeStorageID4762:9000
2015-01-20 17:00:55,048 INFO  hdfs.StateChange (DatanodeManager.java:registerDatanode(924)) - BLOCK* registerDatanode: from Mock for DatanodeRegistration, hashCode: 1098128995 storage someStorageID4762
2015-01-20 17:00:55,050 INFO  blockmanagement.DatanodeDescriptor (DatanodeDescriptor.java:updateHeartbeatState(386)) - Number of failed storage changes from 0 to 0
2015-01-20 17:00:55,312 INFO  net.NetworkTopology (NetworkTopology.java:add(418)) - Adding a new node: /default-rack/someIPsomeStorageID4762:9000
{noformat}

Second registration, using version2, followed by the failure message "Still in 
map: version1 has 1".

{noformat}
2015-01-20 17:11:45,444 INFO  blockmanagement.TestDatanodeManager (TestDatanodeManager.java:testNumVersionsReportedCorrect(121)) - Registering node storageID: someStorageID4762, version: version2, IP address: someIPsomeStorageID4762:9000
2015-01-20 17:11:45,444 INFO  hdfs.StateChange (DatanodeManager.java:registerDatanode(924)) - BLOCK* registerDatanode: from Mock for DatanodeRegistration, hashCode: 1098128995 storage someStorageID4762
2015-01-20 17:11:45,444 INFO  net.NetworkTopology (NetworkTopology.java:remove(487)) - Removing a node: /default-rack/someIPsomeStorageID4762:9000
2015-01-20 17:11:50,742 INFO  net.NetworkTopology (NetworkTopology.java:add(418)) - Adding a new node: /default-rack/someIPsomeStorageID4762:9000
2015-01-20 17:11:50,759 INFO  blockmanagement.TestDatanodeManager (TestDatanodeManager.java:testNumVersionsReportedCorrect(147)) - Still in map: version1 has 1
{noformat}

I used the same random seed of 209693745, but could not reproduce the issue. It 
must be a rare race.
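
As a small illustration of why reusing the seed may not be enough: a fixed seed replays the same sequence of pseudo-random choices, but it does not pin down thread scheduling or timing. The sketch below is hypothetical (the class {{SeededReplaySketch}}, the method {{draw}} and the bound of 5000 are assumptions, not taken from the test); it only shows that two generators with the same seed produce identical draws.

```java
import java.util.Arrays;
import java.util.Random;

/** Hypothetical sketch: a fixed seed replays the same pseudo-random choices. */
public class SeededReplaySketch {
    /** Returns the first n values drawn from a generator created with the given seed. */
    public static int[] draw(long seed, int n) {
        Random random = new Random(seed);
        int[] values = new int[n];
        for (int i = 0; i < n; i++) {
            // The bound 5000 is an arbitrary assumption for illustration.
            values[i] = random.nextInt(5000);
        }
        return values;
    }

    public static void main(String[] args) {
        int[] firstRun = draw(209693745L, 10);
        int[] secondRun = draw(209693745L, 10);
        // Same seed, same sequence: the random inputs are reproducible,
        // but any interleaving between threads is not controlled by the seed.
        System.out.println(Arrays.equals(firstRun, secondRun)); // true
    }
}
```

If the failure depends on how threads interleave rather than on the random inputs, a run with identical inputs can still pass, which would be consistent with not being able to reproduce it.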

> DatanodeManager#datanodeMap should be a HashMap, not a TreeMap, to optimize 
> lookup performance
> ----------------------------------------------------------------------------------------------
>
>                 Key: HDFS-7433
>                 URL: https://issues.apache.org/jira/browse/HDFS-7433
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: namenode
>    Affects Versions: 2.0.0-alpha, 3.0.0
>            Reporter: Daryn Sharp
>            Assignee: Daryn Sharp
>            Priority: Critical
>         Attachments: HDFS-7433.patch, HDFS-7433.patch, HDFS-7433.patch
>
>
> The datanode map is currently a {{TreeMap}}.  For many thousands of 
> datanodes, {{TreeMap}} lookups are ~10X more expensive than {{HashMap}} 
> lookups.  Insertions and removals are up to 100X more expensive.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
