[
https://issues.apache.org/jira/browse/HDFS-1120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12863740#action_12863740
]
Steve Loughran commented on HDFS-1120:
--------------------------------------
I think the probability gets larger the more disks there are per server, and now that 12-HDD
units are coming out, you can plan on seeing it some time after you spec out your
next datacentre.
Causes
# Deletion of files with large block sizes can leave a disk unbalanced.
# MR temp space on the same disks can fill disks up and then free them again.
# Replacement of a failed HDD leaves the new disk permanently underutilised.
The third one is new; on a 12-disk server, with most of the space on all 12 disks allocated
to HDFS, one block in 12 goes to any specific disk. If one disk is
replaced, it still only gets 1/12 of the blocks, even though, with all the other
disks 70-80% full, it is the disk with the most space. The disks would only
become balanced if the new disk got more of the writes (which could have adverse
consequences for future IO rates), or if some rebalancing process on a single machine
moved data from one disk to another (or, to be precise, copied it, validated the
block checksums, then deleted the original).
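To make the first option concrete, here is a minimal sketch of a free-space-weighted chooser; this is illustrative only, not DataNode code, and the class and method names are invented for the example:
{code:java}
import java.io.File;
import java.util.List;
import java.util.Random;

/**
 * Illustrative only: picks a target volume for a new block with probability
 * proportional to its usable space, so a freshly replaced disk absorbs more
 * than 1/N of the writes until it catches up with the other disks.
 */
public class AvailableSpaceWeightedChooser {
  private final Random random = new Random();

  public File choose(List<File> volumes) {
    long totalFree = 0;
    for (File v : volumes) {
      totalFree += v.getUsableSpace();
    }
    if (totalFree <= 0) {
      throw new IllegalStateException("No free space on any volume");
    }
    // Pick a point in [0, totalFree) and find the volume it falls into.
    long target = (long) (random.nextDouble() * totalFree);
    long cumulative = 0;
    for (File v : volumes) {
      cumulative += v.getUsableSpace();
      if (target < cumulative) {
        return v;
      }
    }
    return volumes.get(volumes.size() - 1);
  }
}
{code}
The drawback noted above still applies: while the new disk is catching up it also takes a disproportionate share of the write IO.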
I actually think HDFS-1121 should come first: provide a way of measuring the
distribution of blocks across the disks on a single server. Once we have the data, we can start
worrying about ways to correct any distribution issues.
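As a starting point for that measurement, something as simple as the standalone sketch below would make the per-disk skew on one node visible; the directory paths are placeholders for whatever dfs.data.dir is set to:
{code:java}
import java.io.File;

/**
 * Illustrative only: prints how full each data directory's filesystem is,
 * so the per-disk skew on a single node can be seen at a glance.
 */
public class DiskDistributionReport {
  public static void main(String[] args) {
    // Pass the dfs.data.dir entries as command-line arguments, e.g.
    //   java DiskDistributionReport /data/1/dfs /data/2/dfs /data/3/dfs
    for (String path : args) {
      File dir = new File(path);
      long total = dir.getTotalSpace();   // capacity of the filesystem holding this dir
      long free = dir.getUsableSpace();
      long used = total - free;
      double pctUsed = total == 0 ? 0.0 : 100.0 * used / total;
      System.out.printf("%s: %.1f%% used (%d of %d bytes)%n",
          path, pctUsed, used, total);
    }
  }
}
{code}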
> Make DataNode's block-to-device placement policy pluggable
> ----------------------------------------------------------
>
> Key: HDFS-1120
> URL: https://issues.apache.org/jira/browse/HDFS-1120
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: data-node
> Reporter: Jeff Hammerbacher
>
> As discussed on the mailing list, as the number of disk drives per server
> increases, it would be useful to allow the DataNode's policy for new block
> placement to grow in sophistication from the current round-robin strategy.