[ https://issues.apache.org/jira/browse/HDFS-7722?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lei (Eddy) Xu updated HDFS-7722:
--------------------------------
    Attachment: HDFS-7722.002.patch

Updated the patch to address {{TestDataNodeVolumeFailureReporting}} failures.

Hi, [~cnauroth]. I found that in 
{{TestDataNodeVolumeFailureReporting#testDataNodeReconfigureWithVolumeFailures}} 
you assumed that removing a volume clears the failed volume info. However, 
this patch assumes that a volume is removed completely when {{checkDirs}} 
finds an error, while its {{VolumeFailureInfo}} is kept for reporting 
purposes.
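
To make the intended behavior concrete, here is a minimal sketch (illustration only, not the actual {{FsDatasetImpl}} code; the class and field names below are simplified stand-ins for the real volume list and {{VolumeFailureInfo}}):

{code:java}
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustration only: a failed volume is dropped from active use entirely,
// while a VolumeFailureInfo-like record is retained so the failure can
// still be reported.
class VolumeRemovalSketch {
  static class FailureRecord {
    final String storageLocation;
    final long failureDate;
    FailureRecord(String storageLocation, long failureDate) {
      this.storageLocation = storageLocation;
      this.failureDate = failureDate;
    }
  }

  final List<String> activeVolumes = new ArrayList<>();
  final Map<String, FailureRecord> failureInfos = new HashMap<>();

  // Analogous to checkDirs() finding a bad directory: remove the volume
  // completely from active use, keeping only the failure record for
  // reporting purposes.
  void onDiskError(String dir) {
    activeVolumes.remove(dir);
    failureInfos.put(dir, new FailureRecord(dir, System.currentTimeMillis()));
  }
}
{code}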

* The pro is that a user can directly run {{-reconfig}} to load a new disk 
without changing {{dfs.data.dirs}}. 
* The con is that, as your test shows, we cannot use {{-reconfig}} to clear 
the {{VolumeFailureInfo}}, since we cannot find this volume in 
{{DataNode#parseChangedVolumes()}} (see the note after this list).
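
For context, the hot-swap path discussed above is the DataNode reconfiguration command (something like {{hdfs dfsadmin -reconfig datanode <host:ipc_port> start}} after editing {{dfs.datanode.data.dir}}). As I understand it, {{DataNode#parseChangedVolumes()}} diffs the new configuration against the currently active volumes, so a volume that was already dropped on failure never shows up in that diff, which is why the {{VolumeFailureInfo}} cannot be cleared this way.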

Does this make sense to you? Would you mind sharing your opinions?

Thanks!



> DataNode#checkDiskError should also remove Storage when error is found.
> -----------------------------------------------------------------------
>
>                 Key: HDFS-7722
>                 URL: https://issues.apache.org/jira/browse/HDFS-7722
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: datanode
>    Affects Versions: 2.6.0
>            Reporter: Lei (Eddy) Xu
>            Assignee: Lei (Eddy) Xu
>         Attachments: HDFS-7722.000.patch, HDFS-7722.001.patch, 
> HDFS-7722.002.patch
>
>
> When {{DataNode#checkDiskError}} finds disk errors, it removes all block 
> metadata from {{FsDatasetImpl}}. However, it does not remove the 
> corresponding {{DataStorage}} and {{BlockPoolSliceStorage}}. 
> As a result, we cannot directly run {{reconfig}} to hot swap the 
> failed disks without changing the configuration file.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
