[
https://issues.apache.org/jira/browse/HDFS-1120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12890967#action_12890967
]
Travis Crawford commented on HDFS-1120:
---------------------------------------
Moving this comment from my duplicate.
Filing this issue in response to the "full disk woes" thread on hdfs-user.
Datanodes fill their storage directories unevenly, leading to situations where
certain disks are full while others are significantly less used. Users at many
different sites have experienced this issue, and HDFS administrators are taking
steps like:
- Manually rebalancing blocks in storage directories
- Decommissioning nodes and later re-adding them
There's a tradeoff between making use of all available spindles and filling
disks at roughly the same rate. Possible solutions include:
- Weighting less-used disks heavier when placing new blocks on the datanode. In
write-heavy environments this will still make use of all spindles, equalizing
disk use over time.
- Rebalancing blocks locally. This would help equalize disk use as disks are
added/replaced in older cluster nodes.
Datanodes should actively manage their local disk so operator intervention is
not needed.
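The first option above (weighting less-used disks more heavily) could look something like the following sketch: a volume is chosen at random with probability proportional to its remaining free space, so emptier disks receive more new blocks while all spindles stay in use. This is an illustrative standalone class, not actual HDFS code; the class and method names are made up for this example.

```java
import java.util.Random;

// Illustrative sketch only: pick a storage volume at random, weighted
// by remaining free space, so less-used disks fill faster and disk
// usage equalizes over time in write-heavy environments.
public class FreeSpaceWeightedPicker {
    private final Random random;

    public FreeSpaceWeightedPicker(Random random) {
        this.random = random;
    }

    /** Returns the index of the chosen volume, weighted by free bytes. */
    public int pickVolume(long[] freeBytes) {
        long total = 0;
        for (long free : freeBytes) {
            total += free;
        }
        if (total <= 0) {
            throw new IllegalStateException("no free space on any volume");
        }
        // Draw a point uniformly in [0, total) and find which volume's
        // free-space interval contains it.
        long point = (long) (random.nextDouble() * total);
        for (int i = 0; i < freeBytes.length; i++) {
            if (point < freeBytes[i]) {
                return i;
            }
            point -= freeBytes[i];
        }
        // Fallback for floating-point rounding at the upper edge.
        return freeBytes.length - 1;
    }
}
```

With this scheme a disk holding 90% of the node's free space gets roughly 90% of new blocks, which is the "equalizing over time" behavior described above; the current round-robin strategy instead gives each disk an equal share regardless of how full it is.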
> Make DataNode's block-to-device placement policy pluggable
> ----------------------------------------------------------
>
> Key: HDFS-1120
> URL: https://issues.apache.org/jira/browse/HDFS-1120
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: data-node
> Reporter: Jeff Hammerbacher
>
> As discussed on the mailing list, as the number of disk drives per server
> increases, it would be useful to allow the DataNode's policy for new block
> placement to grow in sophistication from the current round-robin strategy.
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.