[ 
https://issues.apache.org/jira/browse/HDFS-8538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14574914#comment-14574914
 ] 

Andrew Wang commented on HDFS-8538:
-----------------------------------

Thanks for the discussion, guys. Some replies:

bq. You will get more complaints for performance degradation after the change. 
BTW, they should set the policy themselves or run balancer.

I filed this because we've had tens of customers run into issues and not know 
about the AvailableSpace policy. These issues include:

* Small disks filling up first and becoming unavailable for write, leading to 
poor performance
* Newly inserted disks having less data, leading to access skew
* Monitoring warnings from the disks being very full (90%+)
* Upgrade issues due to lack of free space on the very full disks

The normal balancer IIUC fixes inter-node balance but does not address 
intra-node balance. There's been an intra-node balancer shell script floating 
around the internet for a while, but I don't know if it's been updated for the 
new block-id based layout. It's also a hacky approach we don't want to support 
in mainline, since it requires shutting down the DN and manually moving blocks 
around.

My experience has been that users of heterogeneous sized disks almost always 
use this policy. No users thus far have reported performance problems with the 
AvailableSpace policy. Harsh actually recommended making it the default policy 
in the original JIRA, but we deferred to let the code bake first.

Note also that heterogeneous sized disks are the rare case; most DNs are 
homogeneous. Since AvailableSpace falls back to RR if the disks are mostly 
balanced, homogeneous DNs should be unaffected.
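
To make the fallback behavior concrete, here is a simplified sketch of the idea, not the actual Hadoop source. The class, field, and method names are made up for illustration, and the real policy has details this omits (for example, checking that the replica fits and scaling the preference by bucket sizes). The two knobs are meant to mirror what I believe are the real settings, dfs.datanode.available-space-volume-choosing-policy.balanced-space-threshold and dfs.datanode.available-space-volume-choosing-policy.balanced-space-preference-fraction.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

/** Simplified, hypothetical sketch of available-space volume choosing. */
final class AvailableSpaceSketch {
    private final long balancedThresholdBytes; // free-space spread still treated as "balanced"
    private final double highPreference;       // probability of picking from the high-free-space bucket
    private final Random random;
    private int rrAll, rrHigh, rrLow;          // independent round-robin cursors

    AvailableSpaceSketch(long balancedThresholdBytes, double highPreference, Random random) {
        this.balancedThresholdBytes = balancedThresholdBytes;
        this.highPreference = highPreference;
        this.random = random;
    }

    /** Returns the index of the chosen volume, given the free bytes on each volume. */
    int choose(long[] freeBytes) {
        if (freeBytes.length == 0) {
            throw new IllegalArgumentException("no volumes");
        }
        long min = Long.MAX_VALUE;
        long max = Long.MIN_VALUE;
        for (long f : freeBytes) {
            min = Math.min(min, f);
            max = Math.max(max, f);
        }
        // Mostly balanced (the homogeneous case): plain round-robin over all volumes.
        if (max - min <= balancedThresholdBytes) {
            return Math.floorMod(rrAll++, freeBytes.length);
        }
        // Otherwise split the volumes into high and low free-space buckets.
        List<Integer> high = new ArrayList<>();
        List<Integer> low = new ArrayList<>();
        for (int i = 0; i < freeBytes.length; i++) {
            if (freeBytes[i] > min + balancedThresholdBytes) {
                high.add(i);
            } else {
                low.add(i);
            }
        }
        // Skew writes toward the high bucket, round-robin within each bucket.
        if (random.nextDouble() < highPreference) {
            return high.get(Math.floorMod(rrHigh++, high.size()));
        }
        return low.get(Math.floorMod(rrLow++, low.size()));
    }
}
```

With equally sized disks the spread stays under the threshold, so only the first branch runs and placement is identical to RR.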

Related, there's also been user demand for an available space block placement 
policy, leading to the recent implementation of HDFS-8131. Balancer

bq. If that's correct it's still possible that just one or a small number of 
volumes would fall into the higher bucket and get overloaded.

This leads me to a potential enhancement: count the number of outstanding writes to 
a low-capacity disk, and exclude it from skewed placement if it's got too many 
outstanding writes. This would be even better if we used OS-level IO 
statistics, but that could be a follow-on.

Nicholas + Arpit, would the above satisfy your concerns about disk overload? It 
also might be a good opportunity to do the relative free space enhancement 
recommended by Chris N.
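
A rough sketch of the outstanding-writes idea, purely hypothetical: none of these names exist in HDFS, and how the DN would track per-volume outstanding writes (and what limit is sensible) is left open.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of the proposed enhancement; nothing here exists in HDFS. */
final class OutstandingWriteFilter {
    private OutstandingWriteFilter() {}

    /**
     * Drops volumes with too many outstanding writes from the set that would
     * otherwise receive skewed placement.
     *
     * @param preferred         indices of the volumes favored by skewed placement
     * @param outstandingWrites current outstanding write count, indexed by volume
     * @param maxOutstanding    assumed per-volume limit (a made-up tunable)
     * @return the preferred volumes still at or under the limit, in input order
     */
    static List<Integer> filterOverloaded(List<Integer> preferred,
                                          int[] outstandingWrites,
                                          int maxOutstanding) {
        List<Integer> eligible = new ArrayList<>();
        for (int volume : preferred) {
            if (outstandingWrites[volume] <= maxOutstanding) {
                eligible.add(volume);
            }
        }
        return eligible;
    }
}
```

If the returned list is empty, the caller would presumably fall back to plain RR over all volumes rather than keep piling writes onto the busy ones.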

> Change the default volume choosing policy to 
> AvailableSpaceVolumeChoosingPolicy
> -------------------------------------------------------------------------------
>
>                 Key: HDFS-8538
>                 URL: https://issues.apache.org/jira/browse/HDFS-8538
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>    Affects Versions: 2.7.0
>            Reporter: Andrew Wang
>            Assignee: Andrew Wang
>         Attachments: hdfs-8538.001.patch
>
>
> For datanodes with different sized disks, they almost always want the 
> available space policy. Users with homogenous disks are unaffected.
> Since this code has baked for a while, let's change it to be the default.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
