[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13106567#comment-13106567
 ] 

Eli Collins commented on MAPREDUCE-2921:
----------------------------------------

Wasn't the point of MAPREDUCE-2413 to "handle disk failures at both *startup* 
and runtime"?

bq. If TT starts up even with single good disk/mapredLocalDir ignoring all 
other bad disks, then that node can go into IO contention issues for this 
single disk from all tasks — because we are not reducing the number of slots on 
this TT based on bad disks. 

Per MAPREDUCE-2924 we should only handle a configurable number of failures, e.g. 
you could prevent the TT from starting up if only N local dirs are OK.

The same rationale applies at runtime btw! This is my point in MAPREDUCE-2413, 
but you and Owen seem to be presenting the case that it's OK to have a TT 
running with lots of slots and few functioning disks and no DN. I don't see why 
that's not OK on startup but is OK once the TT is running.
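
To make the threshold idea concrete, here is a minimal sketch of one check that 
could be applied both at startup and at runtime. It is not the TaskTracker code 
and not what MAPREDUCE-2924 specifies: the class, the method names and the 
maxFailuresTolerated parameter are all hypothetical, and the test for a "good" 
directory (exists, readable, writable, executable) is an assumption.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of Hadoop.
public class LocalDirCheck {

    /** Returns the configured dirs that exist and are readable, writable and executable. */
    static List<File> usableDirs(List<File> configuredDirs) {
        List<File> good = new ArrayList<>();
        for (File dir : configuredDirs) {
            if (dir.isDirectory() && dir.canRead() && dir.canWrite() && dir.canExecute()) {
                good.add(dir);
            }
        }
        return good;
    }

    /**
     * True if at least one dir is usable and the number of failed dirs
     * does not exceed the tolerated count (assumed to come from configuration).
     */
    static boolean canRun(int configured, int usable, int maxFailuresTolerated) {
        return usable > 0 && (configured - usable) <= maxFailuresTolerated;
    }
}
```

Calling canRun() with the same tolerated count when the TT starts and on each 
periodic disk check would give the same answer to "too many bad disks" in both 
cases, which is the consistency I'm asking for.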

> TaskTracker won't start with failed local directory
> ---------------------------------------------------
>
>                 Key: MAPREDUCE-2921
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2921
>             Project: Hadoop Map/Reduce
>          Issue Type: Sub-task
>          Components: tasktracker
>    Affects Versions: 0.20.204.0
>            Reporter: Eli Collins
>
> Chmod'ing one of the mapred local directories so it's not executable will 
> cause the TT to fail to start. Doing this after the TT has started will 
> result in a TT that is up but cannot execute tasks. 

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

