[ 
https://issues.apache.org/jira/browse/HDFS-12008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16060135#comment-16060135
 ] 

Kihwal Lee commented on HDFS-12008:
-----------------------------------

2.8 actually seems fine. If I spread the nodes evenly, I get values closer to 
the ideal.

Again, when space balancing is applied 100% of the time, the chance of picking 
an underutilized node is 75%.  With no space balancing, it is 50%, since 50% of 
the nodes in the cluster are underutilized.  Thus, setting 
{{dfs.namenode.available-space-block-placement-policy.balanced-space-preference-fraction}}
 to 0.6f, as the test does, changes the expected result to

  0.4*0.5 + 0.6*0.75 = 0.65 or 65%
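
The arithmetic above can be sketched as a small function. This is only an 
illustration of the calculation in this comment, not of the Hadoop code: the 
name {{expected_underutilized_pick}} and its parameters are hypothetical, and 
it assumes the 75% figure comes from drawing two nodes independently and 
keeping the underutilized one whenever at least one of them is underutilized 
(1 - 0.5*0.5).

```python
def expected_underutilized_pick(preference_fraction, underutilized_share=0.5):
    """Ideal chance of picking an underutilized node (hypothetical helper).

    Assumes two independent, uniformly random draws when balancing applies,
    and a single uniformly random draw otherwise.
    """
    # Chance that at least one of two random draws is underutilized.
    balanced = 1.0 - (1.0 - underutilized_share) ** 2
    # Chance that a single random draw is underutilized.
    unbalanced = underutilized_share
    # Weighted mix: balancing applies preference_fraction of the time.
    return (1.0 - preference_fraction) * unbalanced + preference_fraction * balanced


print(round(expected_underutilized_pick(1.0), 2))  # 0.75
print(round(expected_underutilized_pick(0.6), 2))  # 0.65
```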

After spreading the nodes across the racks in the test, this is the result 
(i.e. the value of {{possibility}}).
||   || 1.0f || 0.6f ||
| ideal | 75% | 65% |
| branch-2.8 w/patch | 72.1% | 63.4%  |
| branch-2.8 as is | 72.1% | 54.1%  |
| trunk w/patch | 68.7% | 59.7%  |
| trunk as is | 67.9% | 53.9% |

The ideal value assumes that node picking is completely random.
- The patch brings the result closer to the ideal. Probably because it cuts 
down the number of random number generations?
- Trunk seems farther from the ideal, meaning it is less random than 
branch-2.8.
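
For comparison, here is a rough Monte Carlo sketch of what "completely random" 
picking would produce. It is not the Hadoop test: the function name, the 
20-node cluster, the trial count and the fixed seed are all assumptions made 
for illustration, and racks are ignored entirely. Half of the nodes are marked 
underutilized; with probability {{preference_fraction}} two nodes are drawn 
and an underutilized one is kept if present, otherwise a single node is drawn.

```python
import random


def simulate_possibility(preference_fraction, num_nodes=20,
                         trials=200_000, seed=0):
    """Estimate the share of picks landing on underutilized nodes."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    # Hypothetical cluster: the first half of the nodes is underutilized.
    underutilized = [i < num_nodes // 2 for i in range(num_nodes)]
    hits = 0
    for _ in range(trials):
        first = rng.randrange(num_nodes)
        if rng.random() < preference_fraction:
            # Balancing applies: draw a second node, keep the emptier one.
            second = rng.randrange(num_nodes)
            picked_under = underutilized[first] or underutilized[second]
        else:
            # No balancing: the single random draw stands.
            picked_under = underutilized[first]
        hits += picked_under
    return hits / trials


for fraction in (1.0, 0.6):
    print(fraction, simulate_possibility(fraction))
```

Under these assumptions the estimates should land near 75% and 65%, so a 
measured value well below that points at the picking not being uniformly 
random rather than at sampling noise.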


> Improve the available-space block placement policy
> --------------------------------------------------
>
>                 Key: HDFS-12008
>                 URL: https://issues.apache.org/jira/browse/HDFS-12008
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: block placement
>    Affects Versions: 2.8.1
>            Reporter: Kihwal Lee
>            Assignee: Kihwal Lee
>         Attachments: HDFS-12008.patch
>
>
> AvailableSpaceBlockPlacementPolicy currently picks two nodes unconditionally, 
> then picks one node. It could avoid picking the second node when not 
> necessary.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

