[
https://issues.apache.org/jira/browse/HDFS-9819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15151580#comment-15151580
]
Lin Yiqun commented on HDFS-9819:
---------------------------------
Overall, a single check-dir failure does not always indicate that the volume
has failed. An accidental deletion, an intended deletion, or an accidental
dir/file permission change are all external causes that can make check-dir fail.
> FsVolume should tolerate few times check-dir failed due to deletion by mistake
> ------------------------------------------------------------------------------
>
> Key: HDFS-9819
> URL: https://issues.apache.org/jira/browse/HDFS-9819
> Project: Hadoop HDFS
> Issue Type: Bug
> Reporter: Lin Yiqun
> Assignee: Lin Yiqun
> Fix For: 2.7.1
>
> Attachments: HDFS-9819.001.patch
>
>
> FsVolume should tolerate a few check-dir failures, because sometimes a
> dir/file in the datanode data-dirs is deleted by mistake. The
> {{DataNode#startCheckDiskErrorThread}} thread invokes the checkDir method
> periodically; when it finds the dir missing, it throws an exception and the
> checked volume is added to the failed-volume list. The blocks on that volume
> are then re-replicated, even though this is actually unnecessary. We should
> let a volume tolerate a few check-dir failures, analogous to the config
> {{dfs.datanode.failed.volumes.tolerated}}.
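The tolerance idea above can be sketched as a per-volume counter that only marks the volume as failed after several consecutive check-dir failures. This is a minimal illustration, not the actual HDFS-9819 patch: the class name {{VolumeCheckTracker}} and its methods are hypothetical, and the real change would live inside the DataNode's disk-checker code.

```java
// Hypothetical sketch of the tolerance mechanism described in HDFS-9819.
// A volume is only declared failed after more than `toleratedFailures`
// consecutive check-dir failures; any successful check resets the counter,
// so a transient mistake (e.g. a dir deleted and restored) does not trigger
// unnecessary block re-replication.
public class VolumeCheckTracker {
    private final int toleratedFailures;   // analogous in spirit to
                                           // dfs.datanode.failed.volumes.tolerated
    private int consecutiveFailures = 0;

    public VolumeCheckTracker(int toleratedFailures) {
        this.toleratedFailures = toleratedFailures;
    }

    /**
     * Record the result of one periodic check-dir pass.
     * @return true if the volume should now be treated as failed.
     */
    public boolean recordCheck(boolean checkSucceeded) {
        if (checkSucceeded) {
            consecutiveFailures = 0;  // problem was transient; forget it
            return false;
        }
        consecutiveFailures++;
        return consecutiveFailures > toleratedFailures;
    }
}
```

With {{toleratedFailures = 2}}, for example, the first two failed checks are ignored and only a third consecutive failure marks the volume as failed.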
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)