[ 
https://issues.apache.org/jira/browse/MAPREDUCE-268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jothi Padmanabhan resolved MAPREDUCE-268.
-----------------------------------------

    Resolution: Duplicate

This has been committed as part of MAPREDUCE-318.

> Implement memory-to-memory merge in the reduce
> ----------------------------------------------
>
>                 Key: MAPREDUCE-268
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-268
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>            Reporter: Arun C Murthy
>            Assignee: Arun C Murthy
>
> HADOOP-3446 fixed the reduce so that it does not flush the in-memory shuffled
> map-outputs before feeding them to the reduce. However, for latency-sensitive
> applications with lots of memory, such as the terasort, this hurts performance:
> the fan-in for the final in-memory merge is too large (all 8000 map-outputs
> were in memory).
> When I put in an intermediate memory-to-memory merge for the terasort's
> reduce (thereby avoiding disk i/o) to cut the fan-in from 8000 to <100, the
> 'reduce' phase (including the local datanode-write) ran 2.5x faster (from 10s
> to 4s).
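
The idea in the description can be illustrated with a small sketch. This is
not the code committed under MAPREDUCE-318; it is a minimal in-memory model
in Python, and the names intermediate_merge and final_merge are hypothetical.
It assumes each map-output is already a sorted in-memory list, and it groups
those lists so that the final merge sees at most max_fan_in inputs.

```python
import heapq


def intermediate_merge(runs, max_fan_in):
    """Merge sorted in-memory runs in groups so that at most max_fan_in remain.

    Hypothetical helper: every merge happens in memory, nothing is written out.
    """
    if max_fan_in < 1:
        raise ValueError("max_fan_in must be at least 1")
    if len(runs) <= max_fan_in:
        return [list(run) for run in runs]
    # Ceiling division: the smallest group size that yields <= max_fan_in groups.
    group_size = -(-len(runs) // max_fan_in)
    return [
        list(heapq.merge(*runs[start:start + group_size]))
        for start in range(0, len(runs), group_size)
    ]


def final_merge(runs):
    """Lazily merge the remaining runs into the stream fed to the reduce."""
    return heapq.merge(*runs)


# 8000 small sorted runs standing in for the 8000 map-outputs.
map_outputs = [[i, i + 8000, i + 16000] for i in range(8000)]
merged_runs = intermediate_merge(map_outputs, max_fan_in=99)
reduce_input = list(final_merge(merged_runs))
```

With 8000 inputs and max_fan_in=99, each intermediate merge combines 81 runs
and the final merge sees 99 runs instead of 8000. The sketch only shows the
fan-in reduction; it makes no claim about the timings quoted above.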

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
