[ https://issues.apache.org/jira/browse/HDFS-13728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16570936#comment-16570936 ]
Anu Engineer commented on HDFS-13728:
-------------------------------------

[~xiaochen] I am +1 too, for lack of a better alternative :(. DiskBalancer is just exposing a latent bug in the HDFS stack, but we don't have to chase it down if the disk balancer is the only entity worried about it.

> Disk Balancer should not fail if volume usage is greater than capacity
> ----------------------------------------------------------------------
>
>                 Key: HDFS-13728
>                 URL: https://issues.apache.org/jira/browse/HDFS-13728
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: diskbalancer
>    Affects Versions: 3.0.3
>            Reporter: Stephen O'Donnell
>            Assignee: Stephen O'Donnell
>            Priority: Minor
>         Attachments: HDFS-13728.001.patch
>
>
> We have seen a couple of scenarios where the disk balancer fails because a
> datanode reports more space used on a disk than its capacity, which should
> not be possible.
> This is due to the check below in DiskBalancerVolume.java:
> {code}
>   public void setUsed(long dfsUsedSpace) {
>     Preconditions.checkArgument(dfsUsedSpace < this.getCapacity(),
>         "DiskBalancerVolume.setUsed: dfsUsedSpace(%s) < capacity(%s)",
>         dfsUsedSpace, getCapacity());
>     this.used = dfsUsedSpace;
>   }
> {code}
> While I agree that it should not be possible for a DN to report more usage on
> a volume than its capacity, there seems to be some issue that causes this to
> occur sometimes.
> In general, this full disk is what causes someone to want to run the Disk
> Balancer, only to find it fails with the error.
> There appears to be nothing you can do to force the Disk Balancer to run at
> this point, but in the scenarios I saw, some data was removed from the disk
> and usage dropped below the capacity, resolving the issue.
> Can we consider relaxing the above check, and if the usage is greater than
> the capacity, just set the usage to the capacity so the calculations all work
> ok?
> E.g. something like this:
> {code}
>    public void setUsed(long dfsUsedSpace) {
> -    Preconditions.checkArgument(dfsUsedSpace < this.getCapacity());
> -    this.used = dfsUsedSpace;
> +    if (dfsUsedSpace > this.getCapacity()) {
> +      this.used = this.getCapacity();
> +    } else {
> +      this.used = dfsUsedSpace;
> +    }
>    }
> {code}
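
For anyone who wants to try the proposed behaviour in isolation, here is a minimal standalone sketch of the clamping idea from the description. It is not the attached patch and not the real DiskBalancerVolume: the class name ClampedVolume and its constructor are made up for illustration, and Math.min is used as an equivalent of the if/else in the proposal.

```java
// Hypothetical stand-in for DiskBalancerVolume; only the fields needed
// to illustrate the proposed setUsed() behaviour are modelled here.
class ClampedVolume {
  private final long capacity;
  private long used;

  ClampedVolume(long capacity) {
    this.capacity = capacity;
  }

  long getCapacity() {
    return capacity;
  }

  long getUsed() {
    return used;
  }

  // Instead of rejecting a reported usage that exceeds capacity,
  // cap it at capacity so later calculations see a "full" volume.
  void setUsed(long dfsUsedSpace) {
    this.used = Math.min(dfsUsedSpace, getCapacity());
  }

  public static void main(String[] args) {
    ClampedVolume volume = new ClampedVolume(100L);

    volume.setUsed(120L);                  // DN reports more than capacity
    System.out.println(volume.getUsed());  // prints 100

    volume.setUsed(40L);                   // normal report is stored as-is
    System.out.println(volume.getUsed());  // prints 40
  }
}
```

Note that the existing check quoted above uses a strict `<`, so a report where usage is exactly equal to capacity is rejected as well; with the clamped version that case is simply stored unchanged.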