[ https://issues.apache.org/jira/browse/MAPREDUCE-3077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13113202#comment-13113202 ]
Ravi Gummadi commented on MAPREDUCE-3077:
-----------------------------------------
My earlier suggestion was to check for 'bad disks becoming good' only in
TT.initialize(). But the check can be done more frequently (maybe once every
300 sec or so, in TT.offerService() as you suggested), and it can trigger a
TT reinit when a bad disk is found to be good again.
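
To make the idea concrete, here is a rough, standalone sketch of the periodic
recheck. It is not the actual TaskTracker or LocalStorage code: the class
BadDirRechecker, its methods, and the simplified health test are all
hypothetical names made up for illustration, and the 300 sec interval is just
the ballpark mentioned above.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

/**
 * Hypothetical sketch only; not the real TaskTracker/LocalStorage API.
 * Tracks good and bad local dirs and periodically rechecks the bad ones.
 */
final class BadDirRechecker {
    // Assumed interval: "once in 300 sec or so".
    static final long RECHECK_INTERVAL_MS = 300_000L;

    private final List<String> goodDirs = new ArrayList<>();
    private final List<String> badDirs = new ArrayList<>();
    private long lastCheckMs;

    BadDirRechecker(List<String> good, List<String> bad, long nowMs) {
        goodDirs.addAll(good);
        badDirs.addAll(bad);
        lastCheckMs = nowMs;
    }

    /** Simplified health test (an assumption): dir exists and is r/w/x. */
    static boolean isHealthy(String dir) {
        File f = new File(dir);
        return f.isDirectory() && f.canRead() && f.canWrite() && f.canExecute();
    }

    /**
     * Meant to be called on each pass of the service loop. Does nothing until
     * the interval has elapsed. Returns true if at least one bad dir has
     * become good, which the caller would treat as the signal to reinit.
     */
    boolean maybeRecheck(long nowMs) {
        if (nowMs - lastCheckMs < RECHECK_INTERVAL_MS) {
            return false;
        }
        lastCheckMs = nowMs;
        List<String> recovered = new ArrayList<>();
        for (String dir : badDirs) {
            if (isHealthy(dir)) {
                recovered.add(dir);
            }
        }
        badDirs.removeAll(recovered);
        goodDirs.addAll(recovered);
        return !recovered.isEmpty();
    }

    List<String> getGoodDirs() { return goodDirs; }

    List<String> getBadDirs() { return badDirs; }
}
```

The loop would call maybeRecheck(System.currentTimeMillis()) on each
iteration and start a reinit only when it returns true, so the common case
(nothing recovered, or interval not yet elapsed) costs almost nothing.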
> re-enable faulty TaskTracker storage without restarting TT, when appropriate
> ----------------------------------------------------------------------------
>
> Key: MAPREDUCE-3077
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-3077
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Components: tasktracker
> Affects Versions: 0.20.205.0
> Reporter: Matt Foley
>
> In MAPREDUCE-2928, Ravi Gummadi proposed:
> bq. we can add LocalStorage.checkBadLocalDirs() call to TT.initialize() that
> can do disk-health-check of bad local dirs and add dirs to the good local
> dirs list if they become good.
> and Eli Collins added:
> bq. Sounds good. Since transient disk failures may cause a file system to
> become read-only (causing permanent failures), sometimes re-mounting is
> sufficient to recover, in which case it makes sense to re-enable faulty disks
> w/o TT restart.
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira