[ https://issues.apache.org/jira/browse/HDFS-8549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14578019#comment-14578019 ]
Colin Patrick McCabe commented on HDFS-8549:
--------------------------------------------
So I guess the idea here is that running the balancer while we still have the
"previous" directory around doesn't actually make sense most of the time. When
the balancer tries to "move" a block, it copies it to the new datanode and then
deletes it from the current/ directory of the old datanode. But when the
"previous" directory is around, it won't really be deleted from the source
node. So the space consumption will not decrease on the source node.
Arguably, running the balancer might still serve a useful function by moving
data onto the nodes that need it for the cluster to be balanced. When we finish the
upgrade, we can then delete all the duplicates when we delete the "previous"
directories.
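
To make the space accounting concrete, here is a rough sketch. It assumes (my
understanding of the DataNode upgrade, not something stated in this issue) that
current/ and previous/ share block data through hard links on the same
filesystem. The directory layout, the block name blk_1001, and the function
simulate_balancer_move are made up for illustration; this is not HDFS code.

```python
import os
import tempfile


def simulate_balancer_move(root):
    """Mimic the source DataNode's side of a balancer move during an upgrade.

    Hypothetical layout: root/current and root/previous on one filesystem.
    Returns (links_before, links_after, snapshot_exists, snapshot_size).
    """
    current = os.path.join(root, "current")
    previous = os.path.join(root, "previous")
    os.makedirs(current)
    os.makedirs(previous)

    # A made-up block file in current/.
    block = os.path.join(current, "blk_1001")
    with open(block, "wb") as f:
        f.write(b"x" * 4096)

    # Assumed upgrade snapshot: previous/ hard-links the same data.
    snapshot = os.path.join(previous, "blk_1001")
    os.link(block, snapshot)
    links_before = os.stat(snapshot).st_nlink

    # The balancer has copied the block elsewhere and now deletes it
    # from current/ on the source node.
    os.remove(block)

    # The data is still referenced from previous/, so no space is freed.
    links_after = os.stat(snapshot).st_nlink
    return (links_before, links_after,
            os.path.exists(snapshot), os.path.getsize(snapshot))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        print(simulate_balancer_move(root))
```

The link count drops from 2 to 1 but the bytes stay on disk until previous/ is
removed, which is the step that finalizing the upgrade performs.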
So, I think your change makes sense as the default behavior, but I think we
might want a flag to provide a manual override. Can you provide a "force" flag
so that we can run the balancer even during an upgrade?
+1 once that's added
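
A minimal sketch of the behavior being requested, assuming the check reduces to
two booleans. The names check_balancer_may_run, upgrade_in_progress, force, and
UpgradeInProgressError are hypothetical and are not taken from the patch or
from the Balancer's actual command line.

```python
class UpgradeInProgressError(RuntimeError):
    """Hypothetical error raised when the balancer refuses to start."""


def check_balancer_may_run(upgrade_in_progress, force=False):
    """Decide whether the balancer should start.

    Default: abort while an upgrade is in progress.
    With force=True: run anyway, accepting that source nodes will not
    free space until the "previous" directories are deleted.
    """
    if upgrade_in_progress and not force:
        raise UpgradeInProgressError(
            "upgrade in progress; balancer aborted (override with force)")
    if upgrade_in_progress:
        return "running (forced during upgrade)"
    return "running"
```

Keeping abort as the default and making the override explicit means nobody
creates the extra replicas by accident, while an operator who wants data placed
ahead of finalization can still opt in.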
> Abort the balancer if an upgrade is in progress
> -----------------------------------------------
>
> Key: HDFS-8549
> URL: https://issues.apache.org/jira/browse/HDFS-8549
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: balancer & mover
> Affects Versions: 2.7.0
> Reporter: Andrew Wang
> Assignee: Andrew Wang
> Attachments: HDFS-8549.001.patch, HDFS-8549.002.patch
>
>
> Running the balancer during an ongoing upgrade has a negative effect, since
> DNs do not actually delete blocks. This means the balancer is making lots of
> extra replicas and not actually reducing the disk utilization of
> over-utilized nodes.