[ https://issues.apache.org/jira/browse/HDFS-12008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16060135#comment-16060135 ]
Kihwal Lee commented on HDFS-12008:
-----------------------------------

2.8 actually seems fine. If I spread the nodes evenly, I get values closer to the ideal. Again, when space balancing is applied 100% of the time, the chance of picking an underutilized node is 75%. With no space balancing, it is 50%, since 50% of the nodes in the cluster are underutilized. Thus, setting {{dfs.namenode.available-space-block-placement-policy.balanced-space-preference-fraction}} to 0.6f, as the test does, changes the expected result to 0.4*0.5 + 0.6*0.75 = 0.65, or 65%.

After spreading the nodes across the racks in the test, this is the result (i.e. the value of {{possibility}}).

|| || 1.0f || 0.6f ||
| ideal | 75% | 65% |
| branch-2.8 w/patch | 72.1% | 63.4% |
| branch-2.8 as is | 72.1% | 54.1% |
| trunk w/patch | 68.7% | 59.7% |
| trunk as is | 67.9% | 53.9% |

The ideal value assumes that node picking is completely random.
- The patch makes it closer to the ideal. Probably because it cuts down the number of random number generations?
- Trunk seems farther from the ideal, meaning less random compared to branch-2.8.

> Improve the available-space block placement policy
> --------------------------------------------------
>
> Key: HDFS-12008
> URL: https://issues.apache.org/jira/browse/HDFS-12008
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: block placement
> Affects Versions: 2.8.1
> Reporter: Kihwal Lee
> Assignee: Kihwal Lee
> Attachments: HDFS-12008.patch
>
>
> AvailableSpaceBlockPlacementPolicy currently picks two nodes unconditionally,
> then picks one node. It could avoid picking the second node when not
> necessary.
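
For reference, the "ideal" row in the comment can be reproduced with a short sketch. This is only a model of the simplified assumption stated in the comment (half of the nodes are underutilized, and a balanced pick draws two random nodes and keeps the less utilized one), not the HDFS code itself. The function names `ideal_probability` and `simulate` are hypothetical and exist only for this illustration.

```python
import random


def ideal_probability(preference_fraction, underutilized_share=0.5):
    """Closed-form chance that the chosen node is underutilized.

    Assumption (from the comment, not from the HDFS source): with
    probability `preference_fraction` the pick is space-balanced,
    otherwise it is a single uniformly random node.
    """
    # Balanced pick: two random nodes are drawn and the emptier one is kept,
    # so the result is underutilized unless both draws are over-utilized.
    p_balanced = 1.0 - (1.0 - underutilized_share) ** 2
    # Unbalanced pick: one random node is used as is.
    p_random = underutilized_share
    return (preference_fraction * p_balanced
            + (1.0 - preference_fraction) * p_random)


def simulate(preference_fraction, trials=200_000, seed=1):
    """Monte Carlo estimate of the same quantity under the same assumption."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        first = rng.random() < 0.5  # True means "underutilized"
        if rng.random() < preference_fraction:
            second = rng.random() < 0.5
            chosen = first or second  # keep the less utilized of the two
        else:
            chosen = first
        hits += chosen
    return hits / trials


if __name__ == "__main__":
    for fraction in (1.0, 0.6):
        print(f"fraction={fraction}: "
              f"closed form {ideal_probability(fraction):.1%}, "
              f"simulated {simulate(fraction):.1%}")
```

The closed form gives 75% for a fraction of 1.0 and 65% for 0.6, matching the ideal row; the simulated estimate should land close to those values when the random source is uniform.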