[jira] [Updated] (MAPREDUCE-4299) Terasort hangs with MR2 FifoScheduler

Tom White (JIRA) Tue, 10 Jul 2012 18:13:37 -0700

     [ https://issues.apache.org/jira/browse/MAPREDUCE-4299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Tom White updated MAPREDUCE-4299:
---------------------------------

    Attachment: MAPREDUCE-4299.patch

The problem is that FifoScheduler always sets the application headroom to be 
the entire set of cluster resources, without subtracting the containers that 
have already been assigned. In some cases, like the terasort case described in 
this JIRA, the reducer tasks then take all the cluster resources before the map 
tasks have finished. The reducers wait for map output while the remaining maps 
cannot get containers, so the job deadlocks.
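
As a rough illustration of the headroom calculation described above, the 
sketch below compares a headroom that ignores assigned containers with one 
that subtracts them. It is not the attached patch or the actual FifoScheduler 
code; the class name, method names, and memory figures are hypothetical.

```java
// Hypothetical sketch of the headroom arithmetic; not Hadoop source code.
final class HeadroomSketch {

    /** Behaviour described in the comment: assigned containers are ignored. */
    static int headroomIgnoringUsage(int clusterMemoryMb, int usedMemoryMb) {
        return clusterMemoryMb;
    }

    /** Headroom with assigned memory subtracted, clamped at zero. */
    static int headroomWithUsage(int clusterMemoryMb, int usedMemoryMb) {
        return Math.max(0, clusterMemoryMb - usedMemoryMb);
    }

    public static void main(String[] args) {
        int clusterMemoryMb = 8192; // assumed cluster size
        int usedMemoryMb = 6144;    // assumed memory held by running containers

        System.out.println("ignoring usage: "
                + headroomIgnoringUsage(clusterMemoryMb, usedMemoryMb) + " MB");
        System.out.println("with usage:     "
                + headroomWithUsage(clusterMemoryMb, usedMemoryMb) + " MB");
    }
}
```

With the first calculation an application is always told the whole cluster is 
free, so nothing signals it to stop requesting reducers while maps are still 
waiting for containers.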

Attached is a fix with a unit test.
                
> Terasort hangs with MR2 FifoScheduler
> -------------------------------------
>
>                 Key: MAPREDUCE-4299
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4299
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: mrv2
>    Affects Versions: 2.0.0-alpha
>            Reporter: Tom White
>         Attachments: MAPREDUCE-4299.patch
>
>
> What happens is that the number of reducers ramps up until they occupy all of 
> the job's containers, at which point the maps no longer make any progress and 
> the job hangs.
> When the same job is run with the CapacityScheduler it succeeds, so this 
> looks like a FifoScheduler bug.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
