[jira] [Commented] (HDFS-7722) DataNode#checkDiskError should also remove Storage when error is found.

Lei (Eddy) Xu (JIRA) Fri, 06 Mar 2015 16:30:07 -0800

    [ 
https://issues.apache.org/jira/browse/HDFS-7722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14351218#comment-14351218
 ]


Lei (Eddy) Xu commented on HDFS-7722:
-------------------------------------

[~cmccabe] Thanks for the review. I will make a patch to address your comments.

[~cnauroth] Yes, you are right on this one. Sure, I believe we can hold off on 
committing this. A review from you early next week would be much appreciated! 

To add some background, the rationale for this patch is to give users a 
convenient way to replace bad disks without touching configuration files, 
while also preserving disk failure information for reporting purposes. 

bq.  I would like us to have some means to take corrective action and clear the 
volume failure information "online". 

To address this concern, I suggest a follow-up JIRA that lets 
{{DataNode#parseChangedVolume}} detect volumes that
* are not in {{FsVolumeList}}
* are not in {{DFS_DATANODE_DATA_DIR_KEYS}}
* and are in {{volumeFailureInfos}}

and treat them as {{DataNode#ChangedVolumes#deactiveLocations}}, so that the 
subsequent logic can clear the failure info if the user _intends_ to do so.
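
The three conditions above can be sketched roughly as follows. This is only an illustration of the proposed check, not code from the patch: the class name, the method {{findVolumesToClear}}, and the use of plain string sets in place of {{FsVolumeList}}, the configured data dirs, and {{volumeFailureInfos}} are all assumptions made for the example.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.Set;

public class StaleFailureInfoSketch {

    /**
     * Hypothetical helper: returns the volumes whose failure info could be
     * cleared, i.e. volumes that are recorded as failed but are neither
     * active nor listed in the configured data dirs any more.
     */
    static Set<String> findVolumesToClear(Set<String> activeVolumes,
                                          Set<String> configuredDirs,
                                          Set<String> failedVolumes) {
        Set<String> toClear = new LinkedHashSet<>();
        for (String volume : failedVolumes) {
            boolean notActive = !activeVolumes.contains(volume);
            boolean notConfigured = !configuredDirs.contains(volume);
            if (notActive && notConfigured) {
                toClear.add(volume);
            }
        }
        return toClear;
    }

    public static void main(String[] args) {
        // Stand-ins for FsVolumeList, the configured data dirs, and
        // volumeFailureInfos (paths are made up for the example).
        Set<String> active = new HashSet<>(Arrays.asList("/data/1"));
        Set<String> configured = new HashSet<>(Arrays.asList("/data/1", "/data/2"));
        Set<String> failed = new HashSet<>(Arrays.asList("/data/2", "/data/3"));

        // /data/2 is still configured, so its failure info is kept.
        // /data/3 was removed from the configuration, so it qualifies.
        System.out.println(findVolumesToClear(active, configured, failed));
    }
}
```

With this shape, a failed volume that is still in the configuration keeps its failure info for reporting, and the info is dropped only after the user removes the volume from the configuration and reconfigures.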


> DataNode#checkDiskError should also remove Storage when error is found.
> -----------------------------------------------------------------------
>
>                 Key: HDFS-7722
>                 URL: https://issues.apache.org/jira/browse/HDFS-7722
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: datanode
>    Affects Versions: 2.6.0
>            Reporter: Lei (Eddy) Xu
>            Assignee: Lei (Eddy) Xu
>         Attachments: HDFS-7722.000.patch, HDFS-7722.001.patch, 
> HDFS-7722.002.patch
>
>
> When {{DataNode#checkDiskError}} finds disk errors, it removes all block 
> metadata from {{FsDatasetImpl}}. However, it does not remove the 
> corresponding {{DataStorage}} and {{BlockPoolSliceStorage}}. 
> As a result, we cannot directly run {{reconfig}} to hot swap the 
> failed disks without changing the configuration file.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
