[ 
https://issues.apache.org/jira/browse/HDFS-8131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15981732#comment-15981732
 ] 

Ruslan Dautkhanov commented on HDFS-8131:
-----------------------------------------

Thanks for this great improvement! 
When using AvailableSpaceBlockPlacementPolicy, does the default logic below no 
longer apply?
{quote}
1. Place the first replica somewhere – either a random rack and node (if the 
HDFS client is outside the hadoop cluster) or on the local node (if the HDFS 
client is running on a node inside the cluster).
2. The second replica is written to a different rack from the first, chosen at 
random.
3. The third replica is written to the same rack as the second replica, but on 
a different node.
4. If there are more replicas – spread them across the rest of the racks.
{quote}
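
For reference, the four quoted steps can be sketched roughly as below. This is a simplified illustration, not the HDFS implementation: the function `choose_replicas`, the rack-to-nodes dictionary and the `writer` argument are all hypothetical names, and the fallback to "any unused node" when a preferred set is empty is an assumption of the sketch.

```python
import random


def choose_replicas(topology, replication, writer=None, rng=random):
    """Pick nodes for `replication` replicas (assumed >= 1).

    topology: dict mapping rack name -> list of node names (hypothetical model).
    writer:   node the client runs on, or None if the client is outside the cluster.
    """
    rack_of = {n: r for r, nodes in topology.items() for n in nodes}
    chosen = []
    if not rack_of:
        return chosen

    def pick(candidates):
        # Choose a random unused node from `candidates`; if none is left,
        # fall back to any unused node (an assumption of this sketch).
        pool = [n for n in candidates if n not in chosen]
        if not pool:
            pool = [n for n in rack_of if n not in chosen]
        if not pool:
            return False
        chosen.append(rng.choice(pool))
        return True

    # Step 1: local node if the writer is inside the cluster, else a random node.
    pick([writer] if writer in rack_of else list(rack_of))

    # Step 2: a random node on a different rack from the first replica.
    if len(chosen) < replication:
        first_rack = rack_of[chosen[0]]
        if pick([n for n in rack_of if rack_of[n] != first_rack]):
            # Step 3: a different node on the same rack as the second replica.
            if len(chosen) < replication:
                pick(topology[rack_of[chosen[1]]])

    # Step 4: spread any further replicas over racks not used so far.
    while len(chosen) < replication:
        used = {rack_of[n] for n in chosen}
        if not pick([n for n in rack_of if rack_of[n] not in used]):
            break
    return chosen


topology = {"r1": ["a", "b"], "r2": ["c", "d"], "r3": ["e", "f"]}
replicas = choose_replicas(topology, 3, writer="a", rng=random.Random(0))
```

With a writer on node `a`, the first replica stays on `a`, and the second and third land on two different nodes of one other rack.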
What is the placement logic now with respect to rack awareness? 
Is placement based purely on available space, so that the rack awareness logic 
doesn't kick in?


> Implement a space balanced block placement policy
> -------------------------------------------------
>
>                 Key: HDFS-8131
>                 URL: https://issues.apache.org/jira/browse/HDFS-8131
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: namenode
>    Affects Versions: 3.0.0-alpha1
>            Reporter: Liu Shaohui
>            Assignee: Liu Shaohui
>            Priority: Minor
>              Labels: BlockPlacementPolicy
>             Fix For: 2.8.0, 3.0.0-alpha1
>
>         Attachments: balanced.png, HDFS-8131.004.patch, HDFS-8131.005.patch, 
> HDFS-8131.006.patch, HDFS-8131-v1.diff, HDFS-8131-v2.diff, HDFS-8131-v3.diff
>
>
> The default block placement policy chooses datanodes for new blocks 
> randomly, which results in unbalanced space usage across datanodes 
> after a cluster expansion: the old datanodes stay at a high used percentage 
> while the newly added ones stay at a low percentage.
> Though we can use the external balancer tool to even out space usage, 
> it costs extra network IO and the balancing speed is not easy to control.
> An easy solution is to implement a balanced block placement policy that 
> chooses datanodes with a low used percentage for new blocks with a slightly 
> higher probability. Before long, the used percentage across datanodes will 
> tend to balance out.
> Suggestions and discussion are welcome. Thanks
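
One way to read "choose low used percent datanodes with a slightly higher probability" is sketched below. It is an illustration under stated assumptions, not the code from the attached patches: the function `choose_by_available_space`, the two-candidate comparison and the `preference` parameter (set to 0.6 here purely as an example) are hypothetical.

```python
import random
from collections import Counter


def choose_by_available_space(used_percent, preference=0.6, rng=random):
    """Pick one node, favouring the emptier of two random candidates.

    used_percent: dict mapping node name -> percent of space used
                  (hypothetical model; needs at least two nodes).
    preference:   probability of taking the less-used candidate (assumed knob).
    """
    a, b = rng.sample(sorted(used_percent), 2)
    if used_percent[a] == used_percent[b]:
        return a  # equally full: either candidate is fine
    low, high = (a, b) if used_percent[a] < used_percent[b] else (b, a)
    # A preference just above 0.5 tilts the choice toward the emptier node
    # without starving the fuller one.
    return low if rng.random() < preference else high


rng = random.Random(1)
usage = {"old-dn": 90.0, "new-dn": 10.0}
counts = Counter(choose_by_available_space(usage, rng=rng) for _ in range(10000))
```

Because the bias is only slight, the fuller datanodes still receive a share of new blocks, and usage converges gradually instead of all writes moving to the new nodes at once.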



