[ 
https://issues.apache.org/jira/browse/HDFS-8041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14393437#comment-14393437
 ] 

Nathan Roberts commented on HDFS-8041:
--------------------------------------

Hi [~kihwal]. Some minor comments on the patch:
+ Can we bounds-check the new config? I think it works fine even without it, but 
a check would guard against a future change to the algorithm.
+ I wish there were a way to make this config refreshable. Unfortunately, I don't 
think that's possible today. 
+ Should we protect against stats.getNumDatanodesInService being 0? Again, 
it is probably OK as it is today, but a guard would keep a future patch from 
breaking the assumptions.
+ Node-local writes are not impacted by the change. Maybe we should also have 
rack-local writes skip this check so that the 2nd and 3rd replicas remain in 
the same rack. I think just having this affect the completely random target 
selections might be enough to avoid the problem while minimizing the effects on 
block placement.
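
To illustrate the first and third points, here is a minimal sketch of the two guards. It is not taken from the patch: the class, the method names, and the assumption that the threshold is a fraction in [0.0, 1.0] are all hypothetical.

```java
// Hypothetical helpers, not part of HDFS-8041.
final class PlacementGuards {

    // Bounds check: assumes the threshold is expressed as a fraction
    // in [0.0, 1.0]. The negated comparison also rejects NaN.
    static double validateThreshold(double configured) {
        if (!(configured >= 0.0 && configured <= 1.0)) {
            throw new IllegalArgumentException(
                "threshold must be within [0.0, 1.0], got " + configured);
        }
        return configured;
    }

    // Division guard: returns 0 instead of throwing ArithmeticException
    // when no datanodes are in service.
    static long averageUsedPerNode(long totalUsedBytes, int numDatanodesInService) {
        if (numDatanodesInService <= 0) {
            return 0L;
        }
        return totalUsedBytes / numDatanodesInService;
    }

    public static void main(String[] args) {
        System.out.println(validateThreshold(0.85));
        System.out.println(averageUsedPerNode(1_000L, 0));
        System.out.println(averageUsedPerNode(1_000L, 4));
    }
}
```

Whether an out-of-range value should be rejected, clamped, or treated as "disabled" is a design choice for the patch; rejecting is shown here only because it is the simplest to reason about.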

> Consider remaining space during block placement if dfs space is highly 
> utilized
> ------------------------------------------------------------------------------------
>
>                 Key: HDFS-8041
>                 URL: https://issues.apache.org/jira/browse/HDFS-8041
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: Kihwal Lee
>            Assignee: Kihwal Lee
>         Attachments: HDFS-8041.v1.patch, HDFS-8041.v2.patch
>
>
> This feature helps prevent smaller nodes (i.e., in a heterogeneous 
> environment) from constantly filling up when the overall space utilization 
> is over a certain threshold.  When the utilization is low, the balancer can 
> keep up, but once the average per-node usage in bytes exceeds the capacity 
> of the smaller nodes, they fill up quickly even after perfect balancing.
> This jira proposes an improvement that can be optionally enabled in order to 
> slow down the rate of space usage growth of smaller nodes if the overall 
> storage utilization is over a configured threshold.  It will not replace 
> the balancer; rather, it will help the balancer keep up. Also, the primary 
> replica placement will not be affected. Only the replicas typically placed 
> in a remote rack will be subject to this check.
> The appropriate threshold is specific to the cluster configuration. There is 
> no generally good value to set, so the feature is disabled by default. We 
> have seen cases where a threshold of 85% - 90% would help. Figuring out when 
> {{totalSpaceUsed / numNodes}} becomes close to the capacity of a smaller node 
> is helpful in determining the threshold.
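
The rule of thumb in the description ({{totalSpaceUsed / numNodes}} approaching the capacity of a smaller node) can be turned into a rough estimate. The sketch below is only an illustration under stated assumptions: the class and method names are hypothetical, the capacities are made-up numbers in TB, and it simply solves totalSpaceUsed / numNodes = smallestCapacity for the overall utilization.

```java
import java.util.Arrays;

// Hypothetical helper, not part of HDFS: estimates the overall utilization
// at which the average per-node usage equals the smallest node's capacity.
final class ThresholdEstimate {

    static double utilizationAtSmallestNodeCapacity(long[] nodeCapacities) {
        if (nodeCapacities.length == 0) {
            throw new IllegalArgumentException("at least one node is required");
        }
        long total = 0L;
        long smallest = Long.MAX_VALUE;
        for (long capacity : nodeCapacities) {
            total += capacity;
            smallest = Math.min(smallest, capacity);
        }
        if (total <= 0L) {
            throw new IllegalArgumentException("total capacity must be positive");
        }
        // used / n == smallest  =>  used / total == smallest * n / total
        return (double) smallest * nodeCapacities.length / total;
    }

    public static void main(String[] args) {
        // Made-up cluster: 90 nodes of 12 TB and 10 nodes of 10 TB.
        long[] capacities = new long[100];
        Arrays.fill(capacities, 0, 90, 12L);
        Arrays.fill(capacities, 90, 100, 10L);
        System.out.println(utilizationAtSmallestNodeCapacity(capacities));
    }
}
```

For this made-up cluster the estimate is 1000 / 1180, or about 0.85. A real cluster would need its actual per-node capacities, and the result is a starting point for tuning rather than a recommended value.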



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
