[
https://issues.apache.org/jira/browse/MAPREDUCE-2684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Robert Joseph Evans resolved MAPREDUCE-2684.
--------------------------------------------
Resolution: Duplicate
> Job Tracker can starve reduces with very large input.
> -----------------------------------------------------
>
> Key: MAPREDUCE-2684
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-2684
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: jobtracker
> Affects Versions: 0.20.204.0
> Reporter: Robert Joseph Evans
> Assignee: Robert Joseph Evans
>
> If mapreduce.reduce.input.limit is misconfigured, or if a cluster is simply
> running low on disk space, then reduces with a large input may never get
> scheduled. The job then neither fails nor succeeds; it just starves until it
> is killed.
> JobInProgress tries to estimate the size of the input to all reducers in a
> job. If the estimate is over mapreduce.reduce.input.limit, the job is
> killed. If it is not, findNewReduceTask() checks whether the estimated
> size is too big to fit on the node currently looking for work. If it is,
> the reduce is not assigned and some other task gets a chance at the slot.
> The idea is to keep track of how often a reduce slot is rejected for lack
> of space versus how often an assignment succeeds, and then guess whether
> the reduce tasks will ever be scheduled.
> So I would like some feedback on this.
> 1) How should we guess? Someone who found the bug here suggested P1 + (P2 *
> S), where S is the number of successful assignments, possibly with P1 = 20
> and P2 = 2.0. I am not really sure.
> 2) What should we do when we guess that it will never get a slot? Should we
> fail the job, or should we say that even though it might fail, let's just
> schedule it and see whether it really does fail?
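
For discussion, the suggested heuristic could be sketched roughly as follows. This is only an illustration under stated assumptions, not Hadoop code: the class ReduceStarvationGuess and all of its members are hypothetical, the counters are assumed to be cumulative over the life of the job, and treating "rejections greater than P1 + (P2 * S)" as the starvation signal is one possible reading of the formula above.

```java
// Hypothetical sketch; none of these names exist in Hadoop.
final class ReduceStarvationGuess {
    private final int p1;      // P1: base number of rejections tolerated
    private final double p2;   // P2: extra rejections tolerated per success
    private long rejected;     // reduce slots refused for lack of disk space
    private long succeeded;    // S: reduce assignments that went through

    ReduceStarvationGuess(int p1, double p2) {
        this.p1 = p1;
        this.p2 = p2;
    }

    /** Outcome of offering one reduce slot to the job. */
    enum Offer { ASSIGNED, REJECTED_NO_SPACE }

    /** Mirrors the described check: skip the node if the estimate does not fit. */
    Offer offerSlot(long estimatedReduceInputBytes, long freeBytesOnNode) {
        if (estimatedReduceInputBytes > freeBytesOnNode) {
            rejected++;
            return Offer.REJECTED_NO_SPACE;
        }
        succeeded++;
        return Offer.ASSIGNED;
    }

    /** Rejection budget from the suggested formula: P1 + (P2 * S). */
    double threshold() {
        return p1 + p2 * succeeded;
    }

    /** Assumption: the job looks starved once rejections exceed the budget. */
    boolean looksStarved() {
        return rejected > threshold();
    }
}
```

With P1 = 20 and P2 = 2.0, a job that has never had a reduce assigned would be flagged on its 21st space-related rejection, and each successful assignment raises the budget by two. What to do once looksStarved() returns true is exactly question 2 above and is deliberately left open here.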