[ 
https://issues.apache.org/jira/browse/HDFS-9819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15152427#comment-15152427
 ] 

Kihwal Lee commented on HDFS-9819:
----------------------------------

checkDir is essentially a permission and read-only file system (ROFS) check. 
As far as I can see, there is no such thing as a legitimate "transient" 
checkDir failure.

> FsVolume should tolerate few times check-dir failed due to deletion by mistake
> ------------------------------------------------------------------------------
>
>                 Key: HDFS-9819
>                 URL: https://issues.apache.org/jira/browse/HDFS-9819
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 2.7.1
>            Reporter: Lin Yiqun
>            Assignee: Lin Yiqun
>         Attachments: HDFS-9819.001.patch
>
>
> FsVolume should tolerate a few check-dir failures, because sometimes a 
> directory or file in the datanode data-dirs is deleted by mistake. 
> {{DataNode#startCheckDiskErrorThread}} then invokes the checkDir method 
> periodically, finds that the directory does not exist, and throws an 
> exception. The checked volume is added to the failed volume list, and the 
> blocks on that volume are replicated again, even though this is not actually 
> necessary. We should let a volume tolerate a few check-dir failures, 
> configured in a similar way to {{dfs.datanode.failed.volumes.tolerated}}.
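
For illustration only, the tolerance proposed in the description above could 
be sketched roughly as follows. This is not the attached patch and not Hadoop 
code: the class TolerantDirChecker, its methods, and the 
maxConsecutiveFailures setting are hypothetical names, and isUsable is only a 
simplified stand-in for a directory permission check.

```java
import java.io.File;

/**
 * Hypothetical helper (not part of Hadoop): counts consecutive failed
 * directory checks and reports a volume as failed only after the
 * configured tolerance has been exceeded.
 */
final class TolerantDirChecker {
    // Assumed setting: how many consecutive failed checks are tolerated.
    private final int maxConsecutiveFailures;
    private int consecutiveFailures = 0;

    TolerantDirChecker(int maxConsecutiveFailures) {
        if (maxConsecutiveFailures < 0) {
            throw new IllegalArgumentException("tolerance must be >= 0");
        }
        this.maxConsecutiveFailures = maxConsecutiveFailures;
    }

    /** Simplified check: the path is a directory that can be read, written and traversed. */
    static boolean isUsable(File dir) {
        return dir.isDirectory() && dir.canRead() && dir.canWrite() && dir.canExecute();
    }

    /**
     * Records the result of one periodic check.
     * Returns true once the volume should be treated as failed.
     */
    synchronized boolean recordCheck(boolean healthy) {
        if (healthy) {
            consecutiveFailures = 0; // a successful check resets the counter
            return false;
        }
        consecutiveFailures++;
        return consecutiveFailures > maxConsecutiveFailures;
    }
}
```

A periodic checker would call recordCheck(TolerantDirChecker.isUsable(dir)) 
on each pass. If, as the comment above argues, check-dir failures are never 
transient, a counter like this only delays the point at which the volume is 
marked as failed.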



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
