[ https://issues.apache.org/jira/browse/MAPREDUCE-2921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13106567#comment-13106567 ]
Eli Collins commented on MAPREDUCE-2921:
----------------------------------------
Wasn't the point of MAPREDUCE-2413 to "handle disk failures at both *startup*
and runtime"?
bq. If the TT starts up with even a single good disk/mapredLocalDir, ignoring all
other bad disks, then that node can run into IO contention on this
single disk from all tasks, because we are not reducing the number of slots on
this TT based on bad disks.
Per MAPREDUCE-2924 we should only tolerate a configurable number of failures, e.g. you
could prevent the TT from starting up if only N local dirs are OK.
The same rationale applies at runtime btw! This is my point in MAPREDUCE-2413,
but you and Owen seem to be presenting the case that it's OK to have a TT
running with lots of slots and few functioning disks and no DN. I don't see why
that's not OK on startup but is OK once the TT is running.
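To make the threshold idea concrete, here is a rough sketch of applying one check at both startup and runtime. Everything in it is hypothetical: the class LocalDirCheck, the maxFailedDirs parameter and the proportional slot scaling are illustrations of the argument above, not the TaskTracker's actual code or any existing Hadoop configuration key.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper; not part of Hadoop. */
final class LocalDirCheck {

    /** Returns the dirs that exist and are readable, writable and executable. */
    static List<String> goodDirs(List<String> dirs) {
        List<String> good = new ArrayList<>();
        for (String d : dirs) {
            File f = new File(d);
            // A directory without the execute bit cannot be traversed,
            // which is the chmod case described in this issue.
            if (f.isDirectory() && f.canRead() && f.canWrite() && f.canExecute()) {
                good.add(d);
            }
        }
        return good;
    }

    /**
     * Same rule for startup and runtime: keep running only if at least one
     * dir is good and the number of failed dirs is within the assumed limit.
     */
    static boolean shouldRun(int totalDirs, int goodDirs, int maxFailedDirs) {
        return goodDirs > 0 && (totalDirs - goodDirs) <= maxFailedDirs;
    }

    /**
     * One possible policy (an assumption, not existing behavior): scale the
     * slot count by the fraction of good dirs, keeping at least one slot
     * while any dir is usable.
     */
    static int scaledSlots(int configuredSlots, int totalDirs, int goodDirs) {
        if (totalDirs <= 0 || goodDirs <= 0) {
            return 0;
        }
        return Math.max(1, configuredSlots * goodDirs / totalDirs);
    }

    public static void main(String[] args) {
        // Usage: java LocalDirCheck <maxFailedDirs> <dir> [<dir> ...]
        if (args.length < 2) {
            System.err.println("usage: LocalDirCheck <maxFailedDirs> <dir>...");
            return;
        }
        int maxFailed = Integer.parseInt(args[0]);
        List<String> dirs = new ArrayList<>();
        for (int i = 1; i < args.length; i++) {
            dirs.add(args[i]);
        }
        int good = goodDirs(dirs).size();
        System.out.println("good dirs: " + good + " of " + dirs.size());
        System.out.println("should run: " + shouldRun(dirs.size(), good, maxFailed));
    }
}
```

The point of the sketch is only that shouldRun does not care whether it is called at startup or from a periodic disk check, so the two cases get the same answer for the same set of disks.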
> TaskTracker won't start with failed local directory
> ---------------------------------------------------
>
> Key: MAPREDUCE-2921
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-2921
> Project: Hadoop Map/Reduce
> Issue Type: Sub-task
> Components: tasktracker
> Affects Versions: 0.20.204.0
> Reporter: Eli Collins
>
> Chmod'ing one of the mapred local directories so it's not executable will
> cause the TT to fail to start. Doing this after the TT has started will
> result in a TT that is up but cannot execute tasks.
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira