[jira] Updated: (MAPREDUCE-318) Refactor reduce shuffle code

Jothi Padmanabhan (JIRA) Thu, 20 Aug 2009 09:36:38 -0700

     [ 
https://issues.apache.org/jira/browse/MAPREDUCE-318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Jothi Padmanabhan updated MAPREDUCE-318:
----------------------------------------

    Attachment: mapred-318-20Aug.patch

Some more modifications to the previous patch:

# The shuffle status on the web UI -- the number of maps being copied and the 
bandwidth -- is now present. This makes the status display similar to the one 
in current trunk.
# Modified the condition for triggering an on-disk merge to numfiles > 
(2*iosortfactor - 1), similar to the current trunk code. This ensures we do 
merges a little less aggressively.
# Modified to trigger an in-memory merge on stall. We cannot pull the trigger 
only when the memory threshold is crossed: that can hang when several 
fetchers just return because there is not sufficient memory for the current map 
output, but the total memory used has not crossed the threshold.
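
The two merge triggers described above could be sketched roughly as follows. This is only an illustration, not code from the patch: the class, method, and parameter names are hypothetical, and the stall test (every fetcher waiting on memory while at least one map output is already buffered) is an assumption about what "stall" means here.

```java
// Hypothetical sketch; names are illustrative and not taken from the patch.
final class MergeTriggerSketch {

    // On-disk merge: fire only once the number of on-disk files exceeds
    // (2 * ioSortFactor - 1), the condition stated in the comment above.
    static boolean shouldMergeOnDisk(int numFiles, int ioSortFactor) {
        return numFiles > 2 * ioSortFactor - 1;
    }

    // In-memory merge: fire when the memory threshold is crossed, or when
    // the shuffle has stalled below the threshold.
    // Assumption: "stalled" means every fetcher is waiting because its
    // current map output does not fit, and there is something to merge.
    static boolean shouldMergeInMemory(long usedBytes, long thresholdBytes,
                                       int waitingFetchers, int totalFetchers,
                                       int bufferedOutputs) {
        boolean crossedThreshold = usedBytes >= thresholdBytes;
        boolean stalled = totalFetchers > 0
                && waitingFetchers == totalFetchers
                && bufferedOutputs > 0;
        return crossedThreshold || stalled;
    }
}
```

With an iosortfactor of 10, for example, the on-disk check stays false at 19 files and becomes true at 20. The stall branch is what avoids the hang: without it, fetchers that cannot reserve memory would wait forever for a merge that the threshold check alone never starts.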

> Refactor reduce shuffle code
> ----------------------------
>
>                 Key: MAPREDUCE-318
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-318
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>            Reporter: Owen O'Malley
>            Assignee: Owen O'Malley
>         Attachments: HADOOP-5233_api.patch, HADOOP-5233_part0.patch, 
> mapred-318-14Aug.patch, mapred-318-20Aug.patch, mapred-318-common.patch
>
>
> The reduce shuffle code has become very complex and entangled. I think we 
> should move it out of ReduceTask and into a separate package 
> (org.apache.hadoop.mapred.task.reduce). Details to follow.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
