[ https://issues.apache.org/jira/browse/HADOOP-4480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12641695#action_12641695 ]

Allen Wittenauer commented on HADOOP-4480:
------------------------------------------

Dhruba makes an excellent point.  The admin definitely needs more status 
information on the data nodes in this sort of design.

For smaller clusters, it seems like a bad idea to decommission an entire node 
when only one disk has gone bad.  It would be better for the data node to just 
start decomm'ing that dir and/or stop including that dir in block reports.
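Roughly, something like this -- a sketch only, with made-up class and method 
names rather than the actual DataNode internals:

import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

public class VolumeSet {

    // Volumes still in service.  CopyOnWriteArrayList so the health
    // check can remove entries while readers iterate a snapshot.
    private final List<File> liveVolumes = new CopyOnWriteArrayList<File>();

    public VolumeSet(List<File> configuredDirs) {
        liveVolumes.addAll(configuredDirs);
    }

    // Periodic health check: take a dir that is no longer writable
    // out of service instead of killing the whole data node process.
    public void checkVolumes() {
        for (File dir : liveVolumes) {
            if (!dir.canWrite()) {
                System.err.println("Taking failed volume out of service: " + dir);
                liveVolumes.remove(dir);
            }
        }
        if (liveVolumes.isEmpty()) {
            // Only when every dir is gone does exiting make sense.
            throw new IllegalStateException("All configured volumes have failed");
        }
    }

    // Block reports are built only from volumes still in service, so
    // the name node re-replicates the blocks that lived on the bad dir.
    public List<File> volumesForBlockReport() {
        return new ArrayList<File>(liveVolumes);
    }
}

The point is just that a single failed canWrite() turns into a log line and a 
shorter block report, not a dead process.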

[For equal-sized disks, RAID may be an alternative.  But if you have 
unequal-sized disks, RAID isn't an option, as you'd be throwing storage away.]



> data node process should not die if one dir goes bad
> ----------------------------------------------------
>
>                 Key: HADOOP-4480
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4480
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>    Affects Versions: 0.18.1
>            Reporter: Allen Wittenauer
>
> When multiple directories are configured for the data node process to use to 
> store blocks, it currently exits when one of them is not writable.  Instead, 
> it should either completely ignore that directory or attempt to continue 
> reading from it, marking it unusable if reads fail.
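For the startup path the same idea applies: filter the configured dirs down to 
the usable ones and refuse to start only when none survive.  A sketch under 
that assumption (the class and method names are illustrative, not the real 
dfs.data.dir handling):

import java.io.File;
import java.util.ArrayList;
import java.util.List;

public class DataDirSelector {

    // Sketch only: keep every usable entry from the (comma-separated)
    // dfs.data.dir property and fail startup only when none is usable,
    // instead of exiting as soon as a single dir is bad.
    public static List<File> usableDirs(String dfsDataDir) {
        List<File> usable = new ArrayList<File>();
        for (String path : dfsDataDir.split(",")) {
            File dir = new File(path.trim());
            if (dir.isDirectory() && dir.canRead() && dir.canWrite()) {
                usable.add(dir);
            } else {
                System.err.println("Ignoring bad data dir: " + dir);
            }
        }
        if (usable.isEmpty()) {
            throw new IllegalStateException("No usable dir in dfs.data.dir");
        }
        return usable;
    }
}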

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
