[ 
https://issues.apache.org/jira/browse/HDFS-9850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15526914#comment-15526914
 ] 

Manoj Govindassamy commented on HDFS-9850:
------------------------------------------

With the fix proposed in this jira (HDFS-9850), both 
{{DiskBalancerMover#copyBlocks}} and {{DiskBalancer#queryWorkStatus}} handle 
the case where any of the storage volumes involved in the plan is no longer 
available.

* {{copyBlocks}} logs an error about the missing volume and returns without 
crashing or throwing an exception
** Hence that particular {{DiskBalancerWorkItem}} gets skipped and the mover 
moves on to the next one
** But if the step happens to be the last one, then currentResult stays 
stuck at {{PLAN_UNDER_PROGRESS}} even though the step returned with an error
* {{queryWorkStatus}} also logs an error about the missing volume and throws 
a {{DiskBalancerException}} with {{INTERNAL_ERROR}}
** {{QueryCommand}}, which invokes {{queryWorkStatus}}, gets the exception and 
logs that the query plan failed
** Maybe we should also print something to the user so that they see 
a meaningful error message
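
To make the two behaviors above concrete, here is a minimal sketch. It is not the patch code: {{MissingVolumeSketch}}, {{SketchException}}, {{WorkItem}}, {{liveVolumes}} and {{errorLog}} are hypothetical stand-ins, and the plan is assumed to be a simple list of steps keyed by storage ID. It only illustrates "log and skip" in {{copyBlocks}} versus "log and throw" in {{queryWorkStatus}}; it does not model the {{PLAN_UNDER_PROGRESS}} state.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical, simplified model of the behavior described above.
// None of these types are the real Hadoop classes.
class MissingVolumeSketch {

    /** Stand-in for an error code such as INTERNAL_ERROR. */
    enum ErrorCode { INTERNAL_ERROR }

    /** Stand-in for DiskBalancerException (assumed shape). */
    static class SketchException extends Exception {
        final ErrorCode code;

        SketchException(String message, ErrorCode code) {
            super(message);
            this.code = code;
        }
    }

    /** Stand-in for one plan step, identified only by storage IDs. */
    static class WorkItem {
        final String sourceId;
        final String destId;
        boolean copied;

        WorkItem(String sourceId, String destId) {
            this.sourceId = sourceId;
            this.destId = destId;
        }
    }

    // Storage IDs of the volumes that are currently attached.
    final Set<String> liveVolumes = new HashSet<>();
    // Collected error lines, so the behavior is easy to inspect.
    final List<String> errorLog = new ArrayList<>();

    private boolean volumesPresent(WorkItem item) {
        return liveVolumes.contains(item.sourceId)
            && liveVolumes.contains(item.destId);
    }

    /** Logs and returns if a volume is gone, so the caller moves on. */
    void copyBlocks(WorkItem item) {
        if (!volumesPresent(item)) {
            errorLog.add("copyBlocks: missing volume for "
                + item.sourceId + " -> " + item.destId);
            return;
        }
        item.copied = true; // the real block movement would happen here
    }

    /** Runs every step; a skipped step does not stop the remaining ones. */
    void executePlan(List<WorkItem> plan) {
        for (WorkItem item : plan) {
            copyBlocks(item);
        }
    }

    /** Logs and throws if any volume named in the plan is gone. */
    void queryWorkStatus(List<WorkItem> plan) throws SketchException {
        for (WorkItem item : plan) {
            if (!volumesPresent(item)) {
                errorLog.add("queryWorkStatus: missing volume for "
                    + item.sourceId + " -> " + item.destId);
                throw new SketchException(
                    "Unable to find volume", ErrorCode.INTERNAL_ERROR);
            }
        }
    }
}
```

In this sketch a plan whose first step names a removed volume still runs its second step, while a status query on the same plan fails with the error code. That asymmetry is what the caller has to surface to the user.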

As part of fixing this jira, I have also added a unit test case 
{{TestDiskBalancer#testDiskBalancerWhenRemovingVolumes}} for exactly the case 
above. For now, the test only verifies that DiskBalancer does not crash or 
throw unexpected exceptions.

To address both of the cases above, HDFS-10904 has been filed so that the user 
sees a proper error message and the correct balancing operation state even 
when the involved volumes are removed. By the way, this corner case already 
existed and was not introduced by this jira. This jira only attempts to 
address the issue of unnecessary FsVolumeSpi object references. Please let me 
know if you need further clarification.

> DiskBalancer : Explore removing references to FsVolumeSpi 
> ----------------------------------------------------------
>
>                 Key: HDFS-9850
>                 URL: https://issues.apache.org/jira/browse/HDFS-9850
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: balancer & mover
>    Affects Versions: 3.0.0-alpha2
>            Reporter: Anu Engineer
>            Assignee: Manoj Govindassamy
>         Attachments: HDFS-9850.001.patch, HDFS-9850.002.patch, 
> HDFS-9850.003.patch
>
>
> In HDFS-9671, [~arpitagarwal] commented that we should explore the 
> possibility of removing references to FsVolumeSpi at any point and only deal 
> with storage ID. We are not sure if this is possible; this JIRA is to explore 
> whether that can be done without issues.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
