[ https://issues.apache.org/jira/browse/HDFS-7722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14351218#comment-14351218 ]
Lei (Eddy) Xu commented on HDFS-7722:
-------------------------------------

[~cmccabe] Thanks for the review. I will make a patch to address your comments.

[~cnauroth] Yes, you are right on this one. Sure, I believe we can hold off on committing this. A review from you early next week would be much appreciated!

To add some background, the rationale for this patch is to give users a convenient way to fix bad disks without touching configuration files, while also preserving disk failure information for reporting purposes.

bq. I would like us to have some means to take corrective action and clear the volume failure information "online".

For this concern, I suggest a follow-up JIRA to let {{DataNode#parseChangedVolume}} detect, as {{DataNode#ChangedVolumes#deactiveLocations}}, volumes that meet all of the following conditions:
* not in {{FsVolumeList}}
* not in {{DFS_DATANODE_DATA_DIR_KEYS}}
* present in {{volumeFailureInfos}}

The subsequent logic can then clear this failure info if the user _intends_ to do so.

> DataNode#checkDiskError should also remove Storage when error is found.
> -----------------------------------------------------------------------
>
>                 Key: HDFS-7722
>                 URL: https://issues.apache.org/jira/browse/HDFS-7722
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: datanode
>    Affects Versions: 2.6.0
>            Reporter: Lei (Eddy) Xu
>            Assignee: Lei (Eddy) Xu
>         Attachments: HDFS-7722.000.patch, HDFS-7722.001.patch, HDFS-7722.002.patch
>
>
> When {{DataNode#checkDiskError}} finds disk errors, it removes all block metadata from {{FsDatasetImpl}}. However, it does not remove the corresponding {{DataStorage}} and {{BlockPoolSliceStorage}}.
> The result is that we cannot directly run {{reconfig}} to hot swap the failed disks without changing the configuration file.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)