[jira] [Commented] (HDFS-1312) Re-balance disks within a Datanode

Andrew Wang (JIRA) Fri, 08 Jan 2016 12:25:00 -0800

    [ 
https://issues.apache.org/jira/browse/HDFS-1312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15089877#comment-15089877
 ]


Andrew Wang commented on HDFS-1312:
-----------------------------------

Hi Anu, some replies:

bq. Generally administrators are wary of enabling a feature like HDFS-1804 in a 
production cluster. For new clusters it is easier, but for existing 
production clusters assuming the existence of HDFS-1804 is not realistic.

I don't follow this line of reasoning; don't concerns about using a new feature 
apply to a hypothetical HDFS-1312 implementation too?

HDFS-1804 also was fixed in 2.1.0, so almost everyone should have it available. 
It's also been in use for years, so it's pretty stable.

bq. we do lose one of the critical features of the tool, that is, the ability 
to report what we did to the machine

Why do we lose this? Can't the DN dump this somewhere?

bq. We wanted to merge mover into this engine later...

This is an interesting point I was not aware of. Is the goal here to do 
inter-DN moving? If so, we have a long-standing issue with inter-DN balancing, 
which is that the balancer as an external process is not aware of the NN's 
block placement policies, leading to placement violations. This is something 
[~mingma] and [~ctrezzo] brought up; if we're doing a rewrite of this 
functionality, it should probably be in the NN.

If it's only for intra-DN moving, then it could still live in the DN.

bq. Two issues with that: one, there are lots of customers without HDFS-1804, 
and two, HDFS-1804 is just an option that a user can choose.

Almost everyone is running a version of HDFS with HDFS-1804 these days. As I 
said in my previous comment, if a cluster is commonly hitting imbalance, 
enabling HDFS-1804 should be the first step since a) it's already available and 
b) it avoids the imbalance in the first place, which better conserves IO 
bandwidth.

This is also why I brought up HDFS-8538. If HDFS-1804 is the default volume 
choosing policy, we won't see imbalance outside of hotswap.
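
To make the "avoids the imbalance in the first place" point concrete, here is a 
rough Python model of an available-space-aware volume choice. It is only a 
sketch of the idea, not the Hadoop code: the names choose_volume, simulate, 
threshold and preference are made up for illustration, and the pick within each 
group is random for simplicity.

```python
import random


def choose_volume(free, threshold, preference, rng):
    """Pick a volume index, favouring volumes with more free space.

    free       -- free space per volume (any consistent unit)
    threshold  -- volumes within this much of the least-free one count as "low"
    preference -- probability of writing to a "high" free-space volume
    """
    least = min(free)
    if max(free) - least <= threshold:
        # All volumes are roughly balanced: no reason to prefer any of them.
        return rng.randrange(len(free))
    low = [i for i, f in enumerate(free) if f <= least + threshold]
    high = [i for i, f in enumerate(free) if f > least + threshold]
    pool = high if rng.random() < preference else low
    return rng.choice(pool)


def simulate(free, block, writes, threshold, preference, seed=0):
    """Place `writes` blocks of size `block` and return the resulting free space."""
    rng = random.Random(seed)
    free = list(free)
    for _ in range(writes):
        free[choose_volume(free, threshold, preference, rng)] -= block
    return free
```

For example, simulate([100, 100, 1000], block=1, writes=600, threshold=10, 
preference=0.75) models two nearly full disks and one freshly swapped disk; most 
new blocks go to the emptier disk, so the gap narrows through normal writes 
without any extra copy traffic.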

bq. Getting an alert due to low space on disk from datanode is very 
reactive.... it is a common enough problem that I think it should be solved at 
HDFS level.

The point I was trying to make is that HDFS-1804 addresses the imbalance issues 
besides hotswap, so we eliminate the alerts in the first place. Hotswap is an 
operation explicitly undertaken by the admin, so the admin will know to also run 
the intra-DN balancer. There's no monitoring system in the loop.

bq. I prefer to debug by looking at my local directory instead of ssh-ing into 
a datanode...

This is an aspirational goal, but when debugging a prod cluster we almost 
certainly want to see the DN log too, which is local to the DN. Cluster 
management systems also make log collection pretty easy, so this seems minor.

Would it help to have a phone call about this? We have a lot of points flying 
around, and it might be easier to settle this via a higher-bandwidth medium.

> Re-balance disks within a Datanode
> ----------------------------------
>
>                 Key: HDFS-1312
>                 URL: https://issues.apache.org/jira/browse/HDFS-1312
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: datanode
>            Reporter: Travis Crawford
>            Assignee: Anu Engineer
>         Attachments: Architecture_and_testplan.pdf, disk-balancer-proposal.pdf
>
>
> Filing this issue in response to ``full disk woes`` on hdfs-user.
> Datanodes fill their storage directories unevenly, leading to situations 
> where certain disks are full while others are significantly less used. Users 
> at many different sites have experienced this issue, and HDFS administrators 
> are taking steps like:
> - Manually rebalancing blocks in storage directories
> - Decommissioning nodes & later re-adding them
> There's a tradeoff between making use of all available spindles, and filling 
> disks at the sameish rate. Possible solutions include:
> - Weighting less-used disks heavier when placing new blocks on the datanode. 
> In write-heavy environments this will still make use of all spindles, 
> equalizing disk use over time.
> - Rebalancing blocks locally. This would help equalize disk use as disks are 
> added/replaced in older cluster nodes.
> Datanodes should actively manage their local disk so operator intervention is 
> not needed.
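
The "rebalancing blocks locally" option in the description above can be 
illustrated with a small Python sketch that plans moves until every volume sits 
at the node-wide utilization. This is only an assumption-laden illustration, 
not the design in the attached proposals: plan_moves is a hypothetical helper, 
and it works on byte counts rather than real blocks.

```python
def plan_moves(used, capacity):
    """Return (source, destination, amount) moves that even out utilization.

    used     -- bytes used per volume
    capacity -- total bytes per volume
    """
    total_used, total_cap = sum(used), sum(capacity)
    # Target usage per volume so that all volumes share the same used fraction.
    targets = [total_used * c // total_cap for c in capacity]
    # Donors hold more than their target, takers hold less.
    donors = [[i, u - tgt] for i, (u, tgt) in enumerate(zip(used, targets)) if u > tgt]
    takers = [[i, tgt - u] for i, (u, tgt) in enumerate(zip(used, targets)) if u < tgt]
    moves = []
    di = ti = 0
    while di < len(donors) and ti < len(takers):
        amount = min(donors[di][1], takers[ti][1])
        moves.append((donors[di][0], takers[ti][0], amount))
        donors[di][1] -= amount
        takers[ti][1] -= amount
        if donors[di][1] == 0:
            di += 1
        if takers[ti][1] == 0:
            ti += 1
    return moves
```

Because the plan is an explicit list of moves, it can also be written out as a 
record of what was done to the machine, whichever process ends up executing it.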



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
