[ 
https://issues.apache.org/jira/browse/HDFS-8549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14578019#comment-14578019
 ] 

Colin Patrick McCabe commented on HDFS-8549:
--------------------------------------------

So I guess the idea here is that running the balancer while we still have the 
"previous" directory around doesn't actually make sense most of the time.  When 
the balancer tries to "move" a block, it copies it to the new datanode and then 
deletes it from the current/ directory of the old datanode.  But when the 
"previous" directory is around, it won't really be deleted from the source 
node.  So the space consumption will not decrease on the source node.
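
To make the accounting concrete, here is a toy model, not HDFS code.  The 
Node class and the move() method are made up for illustration, and the model 
simply assumes that a block keeps occupying disk space for as long as either 
current/ or "previous" still references it.

```java
import java.util.HashSet;
import java.util.Set;

// Toy model only: these types are hypothetical and are not part of HDFS.
final class UpgradeSpaceSketch {

    static final class Node {
        // Block IDs referenced from the current/ directory.
        final Set<String> current = new HashSet<>();
        // Block IDs still referenced from "previous"; empty when no upgrade is pending.
        final Set<String> previous = new HashSet<>();

        // Assumption: a block uses disk space while either directory references it.
        int blocksOnDisk() {
            Set<String> all = new HashSet<>(current);
            all.addAll(previous);
            return all.size();
        }
    }

    // Balancer-style "move": copy to the target, then delete from the source's current/.
    static void move(String block, Node source, Node target) {
        target.current.add(block);
        source.current.remove(block);
    }

    public static void main(String[] args) {
        Node source = new Node();
        source.current.add("blk_1");
        source.current.add("blk_2");
        source.previous.addAll(source.current); // upgrade not yet finalized

        Node target = new Node();
        move("blk_1", source, target);

        System.out.println("source blocks on disk: " + source.blocksOnDisk());
        System.out.println("target blocks on disk: " + target.blocksOnDisk());
    }
}
```

In this model the source still holds both blocks after the move, while the 
target has gained one, so total usage goes up rather than evening out.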

Arguably, running the balancer might still serve a useful function by copying 
data to the nodes that need it for the cluster to be better balanced.  When we 
finish the upgrade, the duplicates then go away as we delete the "previous" 
directories.

So, I think your change makes sense as the default behavior, but I think we 
might want a flag to provide a manual override.  Can you provide a "force" flag 
so that we can run the balancer even during an upgrade?
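
Roughly the behavior I have in mind, as a sketch only.  The class, enum, and 
parameter names below are hypothetical and are not taken from the patch or 
from the existing Balancer code; how the flag is actually spelled on the 
command line is up to you.

```java
// Sketch of the requested behavior; all names here are hypothetical.
final class BalancerUpgradeCheck {

    enum Decision { RUN, ABORT }

    // Default: abort while an upgrade is in progress.
    // Manual override: the "force" flag lets the balancer run anyway.
    static Decision decide(boolean upgradeInProgress, boolean force) {
        if (upgradeInProgress && !force) {
            return Decision.ABORT;
        }
        return Decision.RUN;
    }

    public static void main(String[] args) {
        System.out.println("upgrade, no force: " + decide(true, false));
        System.out.println("upgrade, force:    " + decide(true, true));
        System.out.println("no upgrade:        " + decide(false, false));
    }
}
```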

+1 once that's added

> Abort the balancer if an upgrade is in progress
> -----------------------------------------------
>
>                 Key: HDFS-8549
>                 URL: https://issues.apache.org/jira/browse/HDFS-8549
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: balancer & mover
>    Affects Versions: 2.7.0
>            Reporter: Andrew Wang
>            Assignee: Andrew Wang
>         Attachments: HDFS-8549.001.patch, HDFS-8549.002.patch
>
>
> Running the balancer during an ongoing upgrade has a negative effect, since 
> DNs do not actually delete blocks. This means the balancer is making lots of 
> extra replicas and not actually reducing the disk utilization of 
> over-utilized nodes.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
