[ https://issues.apache.org/jira/browse/MAPREDUCE-2324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13065249#comment-13065249 ]

Robert Joseph Evans commented on MAPREDUCE-2324:
------------------------------------------------

I just found the same issue and am looking into the best way to solve it.

If mapreduce.reduce.input.limit is mis-configured, or if a cluster is just 
running low on disk space in general, then reduces with a large input may never 
get scheduled, so the job never fails and never succeeds; it just starves 
until it is killed.

The JobInProgress tries to estimate the size of the input to all reducers in a 
job. If that estimate is over mapreduce.reduce.input.limit then the job is killed. 
Otherwise findNewReduceTask() checks whether the estimated size is too big to 
fit on the node currently looking for work; if it is, the reduce is not assigned 
there and some other task gets a chance at the slot.
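
For what it is worth, here is a rough sketch of that decision as I read it. This 
is simplified and illustrative only; the class, method names and numbers are 
made up, not the actual JobInProgress code.

// Rough, simplified sketch of the scheduling decision described above;
// names and numbers are illustrative, not Hadoop source.
public class ReduceSpaceCheckSketch {

  // Kill the job if the estimated reduce input exceeds the configured limit
  // (treating a negative limit as "no limit" here).
  static boolean exceedsInputLimit(long estimatedReduceInputBytes, long inputLimitBytes) {
    return inputLimitBytes >= 0 && estimatedReduceInputBytes > inputLimitBytes;
  }

  // Otherwise, skip the tracker (give the slot to some other task) when the node
  // asking for work does not have room for the estimated reduce input.
  static boolean hasRoomFor(long trackerFreeBytes, long estimatedReduceInputBytes) {
    return trackerFreeBytes >= estimatedReduceInputBytes;
  }

  public static void main(String[] args) {
    long estimate = 1L << 40;       // ~1 TB estimated reduce input (the terasort case below)
    long limit = -1L;               // no limit configured
    long trackerFree = 200L << 30;  // this tracker has ~200 GB of local disk free

    System.out.println("kill job?    " + exceedsInputLimit(estimate, limit)); // false
    System.out.println("assign here? " + hasRoomFor(trackerFree, estimate));  // false -> reduce stays pending
  }
}

With no limit set and no tracker big enough, the second check fails on every 
heartbeat and the reduce never runs, which is exactly the starvation described 
above.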

The idea is to keep track of how often a reduce slot is rejected because of 
lack of space vs. how often an assignment succeeds, and then guess whether the 
reduce tasks will ever be scheduled.

So I would like some feedback on this.

1) How should we guess? Someone who found the bug here suggested P1 + (P2 * S), 
where S is the number of successful assignments. Possibly P1 = 20 and P2 = 2.0. 
I am not really sure (see the sketch after this list).
2) What should we do when we guess that it will never get a slot? Should we 
fail the job, or do we say, even though it might fail, let's just schedule it 
anyway and see if it really does fail?
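
To make (1) concrete, here is a hypothetical sketch of that heuristic using the 
P1/P2 values floated above; the counters and names are mine, and nothing here is 
committed code.

// Hypothetical sketch of the proposed starvation guess; P1, P2 and the
// counter names are illustrative only.
public class ReduceStarvationGuess {
  private static final int P1 = 20;      // base number of space rejections tolerated
  private static final double P2 = 2.0;  // extra rejections allowed per successful assignment

  private long rejectedForSpace = 0;      // slots turned down because the node lacked space
  private long successfulAssignments = 0; // reduces of this job actually handed a slot

  void recordRejection()  { rejectedForSpace++; }
  void recordAssignment() { successfulAssignments++; }

  // Guess that the remaining reduces will never be scheduled once the rejection
  // count passes P1 + (P2 * S), where S is the number of successful assignments.
  boolean looksStarved() {
    return rejectedForSpace > P1 + P2 * successfulAssignments;
  }

  public static void main(String[] args) {
    ReduceStarvationGuess guess = new ReduceStarvationGuess();
    for (int i = 0; i < 25; i++) {
      guess.recordRejection();
    }
    // 25 rejections, 0 successes: 25 > 20 + 2.0 * 0, so we would guess "starved".
    System.out.println("starved? " + guess.looksStarved());
  }
}

Question (2) is then what to do once looksStarved() flips to true: fail the job 
outright, or schedule the reduce anyway and let it fail (or not) on its own.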


> Job should fail if a reduce task can't be scheduled anywhere
> ------------------------------------------------------------
>
>                 Key: MAPREDUCE-2324
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2324
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>    Affects Versions: 0.20.2
>            Reporter: Todd Lipcon
>            Assignee: Robert Joseph Evans
>
> If there's a reduce task that needs more disk space than is available on any 
> mapred.local.dir in the cluster, that task will stay pending forever. For 
> example, we produced this in a QA cluster by accidentally running terasort 
> with one reducer - since no mapred.local.dir had 1T free, the job remained in 
> pending state for several days. The reason for the "stuck" task wasn't clear 
> from a user perspective until we looked at the JT logs.
> Probably better to just fail the job if a reduce task goes through all TTs 
> and finds that there isn't enough space.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
