[ https://issues.apache.org/jira/browse/HDFS-9850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15526914#comment-15526914 ]
Manoj Govindassamy commented on HDFS-9850:
------------------------------------------

With the fix proposed in this jira (HDFS-9850), both {{DiskBalancerMover#copyBlocks}} and {{DiskBalancer#queryWorkStatus}} handle the case where any of the storage volumes involved in the plan is no longer available.
* {{copyBlocks}} logs an error about the missing volume and returns without crashing or throwing an exception
** That particular {{DiskBalancerWorkItem}} is therefore skipped, and execution moves on to the next one
** But if the step happens to be the last one, the currentResult gets stuck at {{PLAN_UNDER_PROGRESS}} even though the step returned with an error
* {{queryWorkStatus}} also logs an error about the missing volume and throws a {{DiskBalancerException}} with INTERNAL_ERROR
** {{QueryCommand}}, which invokes {{queryWorkStatus}}, gets the exception and logs an error saying that the query plan failed
** Maybe we should also print something to the user so that they see a meaningful error message

As part of fixing this jira, I have also added a unit test case, {{TestDiskBalancer#testDiskBalancerWhenRemovingVolumes}}, for exactly the case above. For now, the test code verifies only that DiskBalancer does not crash or throw unexpected exceptions. To address both cases above, HDFS-10904 has been filed so that the user sees a proper error message and a proper balancing operation state even when the involved volumes are removed.

By the way, this corner case already existed and was not introduced by this jira. This jira addresses the issue of unnecessary {{FsVolumeSpi}} object references.

Please let me know if you need further clarification.
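To make the two behaviours described above easier to follow, here is a minimal, self-contained sketch. It is not the Hadoop code: {{DiskBalancerSketch}}, {{WorkItem}}, {{StatusException}}, the volume IDs, and the log strings are all hypothetical stand-ins, and the way the state is advanced in {{executePlan}} is only an assumption chosen to reproduce the "stuck at {{PLAN_UNDER_PROGRESS}}" symptom.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical, simplified model of the behaviour described in the comment.
// None of these types are the real Hadoop classes.
class DiskBalancerSketch {

  enum Result { NO_PLAN, PLAN_UNDER_PROGRESS, PLAN_DONE }

  /** One plan step: move data from a source volume to a destination volume. */
  static final class WorkItem {
    final String source;
    final String dest;

    WorkItem(String source, String dest) {
      this.source = source;
      this.dest = dest;
    }
  }

  /** Stand-in for an exception that carries an INTERNAL_ERROR result. */
  static final class StatusException extends Exception {
    StatusException(String message) {
      super(message);
    }
  }

  private final Set<String> liveVolumes;
  private final List<String> errorLog = new ArrayList<>();
  private List<WorkItem> plan = new ArrayList<>();
  private Result currentResult = Result.NO_PLAN;

  DiskBalancerSketch(Set<String> liveVolumes) {
    this.liveVolumes = new HashSet<>(liveVolumes);
  }

  void removeVolume(String id) { liveVolumes.remove(id); }
  Result getCurrentResult() { return currentResult; }
  List<String> getErrorLog() { return errorLog; }

  private boolean volumesPresent(WorkItem item) {
    return liveVolumes.contains(item.source) && liveVolumes.contains(item.dest);
  }

  /** Logs and returns false, without throwing, when a volume is gone. */
  boolean copyBlocks(WorkItem item) {
    if (!volumesPresent(item)) {
      errorLog.add("copyBlocks: missing volume for " + item.source + " -> " + item.dest);
      return false;
    }
    // The actual block movement would happen here.
    return true;
  }

  void executePlan(List<WorkItem> newPlan) {
    plan = new ArrayList<>(newPlan);
    currentResult = Result.PLAN_UNDER_PROGRESS;
    boolean lastStepCopied = false;
    for (WorkItem item : plan) {
      lastStepCopied = copyBlocks(item); // a skipped step does not stop the loop
    }
    // Assumed for illustration: the state advances only if the last step finished.
    if (lastStepCopied) {
      currentResult = Result.PLAN_DONE;
    }
  }

  /** Logs and throws when any volume referenced by the plan is gone. */
  Result queryWorkStatus() throws StatusException {
    for (WorkItem item : plan) {
      if (!volumesPresent(item)) {
        errorLog.add("queryWorkStatus: missing volume for " + item.source + " -> " + item.dest);
        throw new StatusException("INTERNAL_ERROR");
      }
    }
    return currentResult;
  }

  public static void main(String[] args) {
    DiskBalancerSketch db =
        new DiskBalancerSketch(new HashSet<>(Arrays.asList("vol-1", "vol-2", "vol-3")));
    db.removeVolume("vol-3"); // the volume used by the last step disappears
    db.executePlan(Arrays.asList(
        new WorkItem("vol-1", "vol-2"),
        new WorkItem("vol-2", "vol-3")));
    System.out.println(db.getCurrentResult());
    try {
      db.queryWorkStatus();
    } catch (StatusException e) {
      System.out.println("query failed: " + e.getMessage());
    }
  }
}
```

In this model, removing the volume used by the last step leaves the state at {{PLAN_UNDER_PROGRESS}}, and the status query fails with INTERNAL_ERROR. Neither outcome tells the user what actually went wrong, which is the gap HDFS-10904 is meant to close.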
> DiskBalancer : Explore removing references to FsVolumeSpi
> ----------------------------------------------------------
>
>                 Key: HDFS-9850
>                 URL: https://issues.apache.org/jira/browse/HDFS-9850
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: balancer & mover
>    Affects Versions: 3.0.0-alpha2
>            Reporter: Anu Engineer
>            Assignee: Manoj Govindassamy
>         Attachments: HDFS-9850.001.patch, HDFS-9850.002.patch,
> HDFS-9850.003.patch
>
>
> In HDFS-9671, [~arpitagarwal] commented that we should explore the
> possibility of removing references to FsVolumeSpi at any point and only deal
> with storage ID. We are not sure if this is possible, this JIRA is to explore
> if that can be done without issues.