[
https://issues.apache.org/jira/browse/HDFS-1804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Aaron T. Myers updated HDFS-1804:
---------------------------------
Attachment: HDFS-1804.patch
Here's a patch which addresses the issue by adding a new volume choosing policy
called "AvailableSpaceVolumeChoosingPolicy".
This policy works by first determining whether the free space on all of the
volumes is within some configurable range of each other, by default 10GB. If
the volumes are balanced in this way, then assignments are made in a strictly
round-robin fashion. If the available free space is not balanced across all
available volumes, the volumes are bucketed as having either a lot or a little
free space. We then randomly choose one of these two buckets, with a
configurable frequency, to receive the block, and within the chosen bucket we
allocate blocks on a round-robin basis.
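To make the selection logic described above concrete, here is a minimal,
self-contained sketch. It is not the code from the attached patch: the class
name AvailableSpaceChooserSketch, its constructor parameters, and the rule used
to put a volume in the "little free space" bucket (free space within the
threshold of the emptiest volume's) are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

/** Hypothetical sketch of the policy described above; not the patch's class. */
class AvailableSpaceChooserSketch {
    private final long balancedThresholdBytes;     // e.g. 10 GB
    private final double preferMoreSpaceFraction;  // chance of picking the roomier bucket
    private final Random random;
    private int balancedCursor;
    private int highCursor;
    private int lowCursor;

    AvailableSpaceChooserSketch(long balancedThresholdBytes,
                                double preferMoreSpaceFraction,
                                Random random) {
        this.balancedThresholdBytes = balancedThresholdBytes;
        this.preferMoreSpaceFraction = preferMoreSpaceFraction;
        this.random = random;
    }

    /** Returns the index of the chosen volume, given each volume's free bytes. */
    int choose(long[] freeBytes) {
        if (freeBytes.length == 0) {
            throw new IllegalArgumentException("no volumes");
        }
        long min = Long.MAX_VALUE;
        long max = Long.MIN_VALUE;
        for (long f : freeBytes) {
            min = Math.min(min, f);
            max = Math.max(max, f);
        }

        // Balanced case: every volume is within the threshold, so plain round robin.
        if (max - min <= balancedThresholdBytes) {
            int idx = balancedCursor % freeBytes.length;
            balancedCursor = (idx + 1) % freeBytes.length;
            return idx;
        }

        // Imbalanced case: split the volumes into two buckets.
        // Assumption: "little free space" means within the threshold of the minimum.
        List<Integer> high = new ArrayList<>();
        List<Integer> low = new ArrayList<>();
        for (int i = 0; i < freeBytes.length; i++) {
            if (freeBytes[i] - min <= balancedThresholdBytes) {
                low.add(i);
            } else {
                high.add(i);
            }
        }

        // Pick a bucket at random with the configured frequency, then round
        // robin inside it. Both buckets are non-empty whenever we get here.
        if (random.nextDouble() < preferMoreSpaceFraction) {
            int pos = highCursor % high.size();
            highCursor = (pos + 1) % high.size();
            return high.get(pos);
        }
        int pos = lowCursor % low.size();
        lowCursor = (pos + 1) % low.size();
        return low.get(pos);
    }
}
```

In this sketch, a preference fraction close to 1.0 sends nearly all new blocks
to the volumes with more free space, which rebalances faster but concentrates
writes on fewer disks. A value near 0.5 treats the two buckets about equally.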
This scheme allows administrators to control both their threshold for what they
consider "balanced" disks and the trade-off between overall concurrent write
throughput to the node and how quickly they want volumes to become balanced
again.
In addition to the unit tests in the patch, I also manually tested this on a
single-node cluster with 4 DN volumes. It worked as expected from a correctness
point of view. From a performance point of view, there was no discernible
performance impact either when all volumes were considered balanced or when
volumes were imbalanced but there were no concurrent writes.
> Modify or add a new block-volume device choosing policy that looks at free
> space
> --------------------------------------------------------------------------------
>
> Key: HDFS-1804
> URL: https://issues.apache.org/jira/browse/HDFS-1804
> Project: Hadoop HDFS
> Issue Type: New Feature
> Components: datanode
> Reporter: Harsh J
> Priority: Minor
> Labels: newbie
> Attachments: HDFS-1804.patch
>
>
> HDFS-1120 introduced pluggable block-volume choosing policies, but still
> carries the vanilla RoundRobin as its default.
> An additional implementation that also takes into consideration the free
> space remaining on the disk (or other params) should be a good addition as an
> alternative to vanilla RR.