[jira] [Commented] (HDFS-10967) Add configuration for BlockPlacementPolicy to avoid near-full DataNodes

Zhe Zhang (JIRA) Fri, 07 Oct 2016 15:14:39 -0700

    [ https://issues.apache.org/jira/browse/HDFS-10967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15556406#comment-15556406 ]


Zhe Zhang commented on HDFS-10967:
----------------------------------

[~mingma] [~kihwal] [~daryn] I'd like to hear your opinions based on your 
experience operating large clusters (not sure whether you have the 
heterogeneity issue, though). In our case, the Balancer is essentially 
fighting with application data ingestion. Placing the 2nd and 3rd replicas 
well at ingestion time should be better than spending bandwidth to move them 
later.

> Add configuration for BlockPlacementPolicy to avoid near-full DataNodes
> -----------------------------------------------------------------------
>
>                 Key: HDFS-10967
>                 URL: https://issues.apache.org/jira/browse/HDFS-10967
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: namenode
>            Reporter: Zhe Zhang
>            Assignee: Zhe Zhang
>              Labels: balancer
>         Attachments: HDFS-10967.00.patch, HDFS-10967.poc.patch
>
>
> Large production clusters are likely to have heterogeneous nodes in terms of 
> storage capacity, memory, and CPU cores. It is not always possible to 
> proportionally ingest data into DataNodes based on their remaining storage 
> capacity. Therefore it's possible for a subset of DataNodes to be much closer 
> to full capacity than the rest.
> This heterogeneity is most likely rack-by-rack -- i.e. _m_ whole racks of 
> low-storage nodes and _n_ whole racks of high-storage nodes. So it'd be very 
> useful if we could lower the chance for those near-full DataNodes to become 
> destinations for the 2nd and 3rd replicas.
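
As a rough illustration of the idea in the description (not what the attached
patches do, and not the real BlockPlacementPolicy API), the sketch below skips
candidate nodes whose used fraction exceeds a threshold when picking targets
for the 2nd and 3rd replicas, and falls back to the full candidate list when
every node is over the threshold. The class, the Node type, and the
maxUsedFraction parameter are all hypothetical names introduced here. Note
that a hard filter is stricter than "lowering the chance"; a probabilistic
variant would be closer to the wording of the issue.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: none of these names come from HDFS or the attached patches.
public class NearFullFilterSketch {

    /** Minimal stand-in for a DataNode's storage report. */
    static final class Node {
        final String name;
        final long capacityBytes;
        final long remainingBytes;

        Node(String name, long capacityBytes, long remainingBytes) {
            this.name = name;
            this.capacityBytes = capacityBytes;
            this.remainingBytes = remainingBytes;
        }

        /** Fraction of capacity in use; a zero-capacity node counts as full. */
        double usedFraction() {
            if (capacityBytes == 0) {
                return 1.0;
            }
            return 1.0 - (double) remainingBytes / capacityBytes;
        }
    }

    /**
     * Returns the candidates whose used fraction is at most maxUsedFraction.
     * If no candidate qualifies, returns the original list so that a write
     * is never rejected only because every node is near full.
     */
    static List<Node> eligibleTargets(List<Node> candidates, double maxUsedFraction) {
        List<Node> eligible = new ArrayList<>();
        for (Node n : candidates) {
            if (n.usedFraction() <= maxUsedFraction) {
                eligible.add(n);
            }
        }
        return eligible.isEmpty() ? candidates : eligible;
    }

    public static void main(String[] args) {
        List<Node> candidates = Arrays.asList(
            new Node("high-storage-1", 100L, 60L),  // 40% used
            new Node("low-storage-1", 100L, 4L));   // 96% used
        for (Node n : eligibleTargets(candidates, 0.9)) {
            System.out.println(n.name);
        }
    }
}
```

In a real cluster the threshold would presumably come from a NameNode
configuration key, which is what the issue title asks for; the key name and
its default are left open here.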



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

