[jira] [Commented] (HDFS-1312) Re-balance disks within a Datanode

Aaron T. Myers (JIRA) Mon, 05 Oct 2015 16:59:02 -0700

    [ https://issues.apache.org/jira/browse/HDFS-1312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14944280#comment-14944280 ]


Aaron T. Myers commented on HDFS-1312:
--------------------------------------

bq. There are a large number of clusters which are using round-robin 
scheduling. I have been looking around for the data on HDFS-1804 (in fact, the 
proposal discusses that issue). It would be good if you have some data on 
HDFS-1804 deployment. Most of the clusters that I am seeing (anecdotal, I know) 
are based on round-robin scheduling. Please also see the thread I refer to in 
the proposal on LinkedIn, and you will see customers are also looking for this 
data.

Presumably that's mostly because the round-robin volume choosing policy is the 
default, and many users don't even know that there's an alternative. 
Independently of the need to implement an active balancer as this JIRA 
proposes, should we consider changing the default to the available-space volume 
choosing policy (HDFS-1804)? We'd probably need to do this in Hadoop 3.0, since 
I'd think this should be considered an incompatible change.
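
To illustrate why the choice of default matters, here is a toy model of the 
two behaviours. This is only a sketch, not the DataNode's code: the function 
names are made up, and "always pick the volume with the most free space" is a 
deliberate simplification of the available-space policy, which as far as I 
know mixes round-robin with a weighted preference for emptier volumes. In a 
real cluster the policy is selected with the 
dfs.datanode.fsdataset.volume.choosing.policy setting in hdfs-site.xml.

```python
from itertools import count


def round_robin_chooser():
    """Return a chooser that cycles through volumes, ignoring how full they are."""
    counter = count()

    def choose(free_bytes, block_size):
        n = len(free_bytes)
        start = next(counter)
        # Try each volume once, starting from the next one in the cycle.
        for offset in range(n):
            idx = (start + offset) % n
            if free_bytes[idx] >= block_size:
                return idx
        raise RuntimeError("no volume has room for the block")

    return choose


def most_free_chooser():
    """Return a chooser that always picks the volume with the most free space.

    Simplified stand-in for an available-space policy (an assumption of this
    sketch, not Hadoop's actual algorithm).
    """

    def choose(free_bytes, block_size):
        idx = max(range(len(free_bytes)), key=free_bytes.__getitem__)
        if free_bytes[idx] < block_size:
            raise RuntimeError("no volume has room for the block")
        return idx

    return choose


def simulate(chooser, free_bytes, block_size, num_blocks):
    """Place num_blocks blocks and return the free space left on each volume."""
    free = list(free_bytes)
    for _ in range(num_blocks):
        idx = chooser(free, block_size)
        free[idx] -= block_size
    return free


if __name__ == "__main__":
    # Two older disks with 100 units free and one newly added disk with 400.
    start = [100, 100, 400]
    print("round-robin:", simulate(round_robin_chooser(), start, 10, 12))
    print("most-free:  ", simulate(most_free_chooser(), start, 10, 12))
```

In this model round-robin writes the same number of blocks to every disk, so 
the 300-unit gap between the old disks and the new one never closes, while 
the free-space-aware chooser sends all new writes to the emptier disk until 
it catches up. Neither approach moves existing blocks, which is the part this 
JIRA's active balancer would cover.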

> Re-balance disks within a Datanode
> ----------------------------------
>
>                 Key: HDFS-1312
>                 URL: https://issues.apache.org/jira/browse/HDFS-1312
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: datanode
>            Reporter: Travis Crawford
>         Attachments: disk-balancer-proposal.pdf
>
>
> Filing this issue in response to ``full disk woes`` on hdfs-user.
> Datanodes fill their storage directories unevenly, leading to situations 
> where certain disks are full while others are significantly less used. Users 
> at many different sites have experienced this issue, and HDFS administrators 
> are taking steps like:
> - Manually rebalancing blocks in storage directories
> - Decommissioning nodes & later re-adding them
> There's a tradeoff between making use of all available spindles, and filling 
> disks at the sameish rate. Possible solutions include:
> - Weighting less-used disks heavier when placing new blocks on the datanode. 
> In write-heavy environments this will still make use of all spindles, 
> equalizing disk use over time.
> - Rebalancing blocks locally. This would help equalize disk use as disks are 
> added/replaced in older cluster nodes.
> Datanodes should actively manage their local disk so operator intervention is 
> not needed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
