Zhe Zhang created HDFS-10967:
--------------------------------

             Summary: Add configuration for BlockPlacementPolicy to 
deprioritize near-full DataNodes
                 Key: HDFS-10967
                 URL: https://issues.apache.org/jira/browse/HDFS-10967
             Project: Hadoop HDFS
          Issue Type: Improvement
          Components: namenode
            Reporter: Zhe Zhang


Large production clusters are likely to have heterogeneous nodes in terms of 
storage capacity, memory, and CPU cores. It is not always possible to 
proportionally ingest data into DataNodes based on their remaining storage 
capacity. Therefore it's possible for a subset of DataNodes to be much closer 
to full capacity than the rest.

Note that this heterogeneity is most likely rack-by-rack -- i.e. _m_ whole 
racks with low-storage nodes and _n_ whole racks with high-storage nodes. So 
it would be very useful if we could deprioritize those near-full DataNodes as 
destinations for the 2nd and 3rd replicas.
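
As a rough illustration of what "deprioritize" could mean here, the sketch 
below reorders a candidate list so that near-full nodes are tried last but 
are still eligible. It is only a sketch under stated assumptions: NodeUsage, 
NearFullDeprioritizer and the nearFullThreshold parameter are hypothetical 
names made up for this example and are not existing HDFS classes or 
configuration keys.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a DataNode's storage report; not an HDFS class.
final class NodeUsage {
    final String name;
    final long capacityBytes;
    final long remainingBytes;

    NodeUsage(String name, long capacityBytes, long remainingBytes) {
        this.name = name;
        this.capacityBytes = capacityBytes;
        this.remainingBytes = remainingBytes;
    }

    // Fraction of capacity in use; a node reporting zero capacity counts as full.
    double usedFraction() {
        if (capacityBytes <= 0) {
            return 1.0;
        }
        return 1.0 - (double) remainingBytes / capacityBytes;
    }
}

// Hypothetical helper; the threshold would come from the proposed configuration.
final class NearFullDeprioritizer {
    // Stable partition: nodes below the threshold keep their relative order
    // and come first; near-full nodes follow and remain usable as a fallback.
    static List<NodeUsage> deprioritize(List<NodeUsage> candidates,
                                        double nearFullThreshold) {
        List<NodeUsage> preferred = new ArrayList<>();
        List<NodeUsage> nearFull = new ArrayList<>();
        for (NodeUsage node : candidates) {
            if (node.usedFraction() >= nearFullThreshold) {
                nearFull.add(node);
            } else {
                preferred.add(node);
            }
        }
        preferred.addAll(nearFull);
        return preferred;
    }
}
```

Moving near-full nodes to the back rather than excluding them means a write 
can still succeed when only near-full racks are reachable.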



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
