[ https://issues.apache.org/jira/browse/HADOOP-3297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Devaraj Das updated HADOOP-3297:
--------------------------------
Attachment: 3297.patch
I ran a benchmark (loadgen) with the attached patch. Here are the details:
1) Num maps - 10000
2) Size of each map output - 1KB
3) Size of cluster - 80 nodes
4) Num reducers - 1
With the patch, the run took ~7 minutes. On trunk, the same job took ~11
minutes, i.e. roughly 1.6x as long.
> The way in which ReduceTask/TaskTracker gets completion events during shuffle
> can be improved
> ---------------------------------------------------------------------------------------------
>
> Key: HADOOP-3297
> URL: https://issues.apache.org/jira/browse/HADOOP-3297
> Project: Hadoop Core
> Issue Type: Improvement
> Components: mapred
> Reporter: Devaraj Das
> Assignee: Devaraj Das
> Fix For: 0.18.0
>
> Attachments: 3297.patch
>
>
> Certain parameters, such as the poll frequency and the number of events
> fetched per call, can probably be tuned to improve shuffle performance.
> This would affect the shuffle-related task-->tasktracker and
> tasktracker-->jobtracker RPCs.
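
To illustrate why the two knobs named in the description matter, here is a minimal sketch of a polling loop. It is not the code in 3297.patch: EventSource, fixedSource and drain are hypothetical names made up for this example, and the "RPC" is simulated in memory. It only counts how many fetch calls are needed to collect 10000 completion events (the number of maps in the benchmark above) for a given batch size.

```java
import java.util.ArrayList;
import java.util.List;

public class EventPollSketch {

    /** Hypothetical stand-in for the RPC that returns completion events. */
    interface EventSource {
        List<Integer> fetch(int fromIndex, int maxEvents);
    }

    /** Simulated source in which totalEvents map completions are already available. */
    static EventSource fixedSource(int totalEvents) {
        return (from, max) -> {
            List<Integer> batch = new ArrayList<>();
            int end = Math.min(totalEvents, from + max);
            for (int i = from; i < end; i++) {
                batch.add(i);
            }
            return batch;
        };
    }

    /**
     * Fetches events until expectedEvents have been seen and returns the
     * number of fetch calls made. Assumes the source eventually delivers
     * expectedEvents; otherwise the loop keeps polling.
     */
    static int drain(EventSource source, int expectedEvents,
                     int maxEventsPerFetch, long pollIntervalMs) {
        int fetched = 0;
        int calls = 0;
        while (fetched < expectedEvents) {
            List<Integer> batch = source.fetch(fetched, maxEventsPerFetch);
            calls++;
            fetched += batch.size();
            // Back off only when the batch was not full, i.e. the source
            // probably has nothing more to hand out right now.
            if (batch.size() < maxEventsPerFetch && fetched < expectedEvents) {
                try {
                    Thread.sleep(pollIntervalMs);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("interrupted while polling", e);
                }
            }
        }
        return calls;
    }

    public static void main(String[] args) {
        EventSource source = fixedSource(10000);
        System.out.println(drain(source, 10000, 10, 0));   // prints 1000
        System.out.println(drain(source, 10000, 1000, 0)); // prints 10
    }
}
```

In this toy model, a larger batch cuts the number of round trips by the same factor, and sleeping only after a short batch avoids paying the poll interval while events are still queued. Whether the real shuffle benefits to the same degree depends on how quickly maps finish and on RPC cost, which the sketch does not model.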