[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13113202#comment-13113202
 ] 

Ravi Gummadi commented on MAPREDUCE-3077:
-----------------------------------------

My earlier suggestion was to check for 'bad disks becoming good' only in 
TT.initialize(). But the check can be done more frequently, maybe once every 
300 sec or so in TT.offerService() as you suggested, and this can trigger a 
TT re-init.
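
To make the idea concrete, here is a rough sketch of such a periodic re-check. 
It is only an illustration, not the actual TaskTracker code: the class 
BadDirRechecker, its methods, and the pluggable health check are all 
hypothetical names, and the 300 sec interval is just the value floated above. 
The service loop would call maybeRecheck() on each iteration and start a 
re-init when it returns true.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Predicate;

/** Hypothetical helper (not an existing Hadoop class) that re-checks bad local dirs. */
class BadDirRechecker {
    private final List<File> goodDirs = new ArrayList<>();
    private final List<File> badDirs = new ArrayList<>();
    private final long intervalMs;              // e.g. 300_000 for "once in 300 sec"
    private final Predicate<File> healthCheck;  // assumed disk-health test
    private long lastCheckMs;

    BadDirRechecker(List<File> good, List<File> bad, long intervalMs,
                    Predicate<File> healthCheck, long nowMs) {
        this.goodDirs.addAll(good);
        this.badDirs.addAll(bad);
        this.intervalMs = intervalMs;
        this.healthCheck = healthCheck;
        this.lastCheckMs = nowMs;
    }

    /** A simple default check: the dir exists and is readable, writable and listable. */
    static boolean isUsable(File dir) {
        return dir.isDirectory() && dir.canRead() && dir.canWrite() && dir.canExecute();
    }

    /**
     * Meant to be called on every pass of the service loop. Does nothing until
     * intervalMs has elapsed since the last check. Returns true if at least one
     * bad dir passed the health check and was moved to the good list, which is
     * the signal for the caller to re-initialize.
     */
    boolean maybeRecheck(long nowMs) {
        if (nowMs - lastCheckMs < intervalMs) {
            return false;
        }
        lastCheckMs = nowMs;
        boolean recovered = false;
        for (Iterator<File> it = badDirs.iterator(); it.hasNext();) {
            File dir = it.next();
            if (healthCheck.test(dir)) {
                it.remove();
                goodDirs.add(dir);
                recovered = true;
            }
        }
        return recovered;
    }

    List<File> getGoodDirs() { return new ArrayList<>(goodDirs); }

    List<File> getBadDirs() { return new ArrayList<>(badDirs); }
}
```

In the loop this would look roughly like 
"if (rechecker.maybeRecheck(System.currentTimeMillis())) { /* re-init */ }", 
with BadDirRechecker::isUsable passed as the health check. Passing the clock 
value and the check in as parameters is only to keep the sketch easy to test.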

> re-enable faulty TaskTracker storage without restarting TT, when appropriate
> ----------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-3077
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3077
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: tasktracker
>    Affects Versions: 0.20.205.0
>            Reporter: Matt Foley
>
> In MAPREDUCE-2928, Ravi Gummadi proposed:
> bq. we can add LocalStorage.checkBadLocalDirs() call to TT.initialize() that 
> can do disk-health-check of bad local dirs and add dirs to the good local 
> dirs list if they become good.
> and Eli Collins added:
> bq. Sounds good. Since transient disk failures may cause a file system to 
> become read-only (causing permanent failures) sometimes re-mounting is 
> sufficient to recover in which case it makes sense to re-enable faulty disks 
> w/o TT restart.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
