[
https://issues.apache.org/jira/browse/HDFS-4113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13484949#comment-13484949
]
Todd Lipcon commented on HDFS-4113:
-----------------------------------
I have discussed this with other contributors in the past, and we've generally
come to the conclusion that it's not really what most people want. Here's why:
- Each node has some finite write capacity due to network bandwidth (say
110MB/sec)
- If you want to fill up the smaller nodes at half the rate of the bigger
nodes, that implies that the smaller nodes are only filling up at about 55MB/sec
- Hence you are only using half of your available write bandwidth for all the
small nodes in the cluster.
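To put rough numbers on the argument above, here is a minimal sketch of the
arithmetic. The cluster shape (10 big nodes, 10 small nodes) and the function
name are assumptions made up for illustration, and the model ignores
replication pipelines and disk limits; only the 110MB/sec per-node ceiling
comes from the comment.

```python
# Back-of-the-envelope model of aggregate write bandwidth.
# LINK_MB_S is the per-node ceiling quoted in the comment; the node counts
# used below are hypothetical.
LINK_MB_S = 110.0


def aggregate_write_rate(n_big, n_small, small_to_big_ratio):
    """Cluster-wide write rate in MB/sec when big nodes write at link speed
    and small nodes write at small_to_big_ratio times that speed."""
    big_rate = LINK_MB_S
    small_rate = LINK_MB_S * small_to_big_ratio
    return n_big * big_rate + n_small * small_rate


# Existing strategy: every node is written to at the same rate.
equal = aggregate_write_rate(10, 10, 1.0)

# Proposed strategy: small nodes fill at half the rate of big nodes.
proportional = aggregate_write_rate(10, 10, 0.5)

print(f"equal rate:        {equal:.0f} MB/sec")
print(f"proportional rate: {proportional:.0f} MB/sec")
print(f"bandwidth given up: {1 - proportional / equal:.0%}")
```

Under these assumptions the proportional scheme leaves a quarter of the
cluster's write bandwidth unused, all of it on the small nodes.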
Given that, it seems far preferable to use the existing strategy:
- Write to all nodes at equal rate
- Run the balancer continuously, so that the nodes that are more
percentage-full transfer blocks to the emptier nodes in the background.
This allows full write throughput during jobs and then uses "downtime" in the
cluster to balance things back out. Given that most clusters have some spare
capacity in the background for balancing, this is a much better strategy than
the proposed allocation scheme.
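The "write equally, balance later" idea can be sketched as a toy model. This
is not the HDFS Balancer's actual algorithm (the real one moves individual
blocks and has to respect replica placement and bandwidth limits); the
function name, the threshold parameter, and the node sizes are assumptions
chosen for illustration.

```python
def balance(used, capacity, threshold=0.10):
    """Toy balancer: move data from the most-utilized node to the
    least-utilized node until every node's utilization is within
    `threshold` of the cluster-wide average.

    Returns (new_used, moves), where moves is a list of
    (source_index, destination_index, amount) tuples.
    """
    used = [float(u) for u in used]
    avg = sum(used) / sum(capacity)  # cluster-wide utilization
    moves = []
    # Each move brings at least one node to the average, so len(used)
    # iterations are enough.
    for _ in range(len(used)):
        util = [u / c for u, c in zip(used, capacity)]
        src = max(range(len(used)), key=lambda i: util[i])
        dst = min(range(len(used)), key=lambda i: util[i])
        if util[src] - avg <= threshold and avg - util[dst] <= threshold:
            break
        surplus = used[src] - avg * capacity[src]  # data above average on src
        deficit = avg * capacity[dst] - used[dst]  # room below average on dst
        amount = min(surplus, deficit)
        used[src] -= amount
        used[dst] += amount
        moves.append((src, dst, amount))
    return used, moves


# Hypothetical cluster: one 2000GB node and one 1000GB node, each of which
# received 600GB because writes were spread at an equal rate.
new_used, moves = balance(used=[600, 600], capacity=[2000, 1000])
print(new_used)
print(moves)
```

In this example the smaller node ends up at 60% utilization after the writes,
and a single background move of 200GB brings both nodes to 40%.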
> When adding datanodes with less disk capacity to an existing cluster, the new
> DNs fill up faster and subsequently cause errors during put operations
> ----------------------------------------------------------------------------------------------------------------------------------------------------
>
> Key: HDFS-4113
> URL: https://issues.apache.org/jira/browse/HDFS-4113
> Project: Hadoop HDFS
> Issue Type: New Feature
> Components: data-node
> Reporter: Stephen Fritz
> Priority: Minor
>
> The request is that the allocation strategy be modified so that it allocates
> equally on a 'free space percentage' basis between datanodes, i.e. disks that
> are twice as big should have twice as much data written to them per unit time.