[ https://issues.apache.org/jira/browse/HADOOP-2771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12616647#action_12616647 ]
Christian Kunz commented on HADOOP-2771:
----------------------------------------
io.sort.mb = 200.
The average map output is 3-4 GB.
Your suspicion about the time difference in the final merge might be correct.
I checked one task: it spent about 12 minutes in the final merge with 9000
reduces, but close to 30 minutes with 18000 reduces (the execution time of the
map itself was essentially the same).
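
To put rough numbers on this, here is a back-of-the-envelope sketch. It assumes
that each spill holds at most io.sort.mb of map output, and that the final merge
handles one segment per spill for every reduce partition. It ignores the spill
threshold and record metadata, so the real spill count is likely somewhat
higher. The function name and the 3.5 GB midpoint are mine, for illustration
only.

```python
import math


def estimate_final_merge(map_output_mb, io_sort_mb, num_reduces):
    """Rough estimate of the work done by a map task's final merge.

    Assumption (not measured): every spill holds at most io_sort_mb of
    output and contains one segment for each reduce partition.
    """
    # Number of spill files written while the map runs.
    num_spills = math.ceil(map_output_mb / io_sort_mb)
    # The final merge visits one segment per (spill, partition) pair.
    num_segments = num_spills * num_reduces
    # Average size of each segment, in KB.
    avg_segment_kb = map_output_mb * 1024 / num_segments
    return num_spills, num_segments, avg_segment_kb


if __name__ == "__main__":
    map_output_mb = 3.5 * 1024  # midpoint of the 3-4 GB reported above
    for reduces in (9000, 18000):
        spills, segments, avg_kb = estimate_final_merge(map_output_mb, 200, reduces)
        print(f"{reduces} reduces: {spills} spills, "
              f"{segments} segments, ~{avg_kb:.1f} KB per segment")
```

Under these assumptions, doubling the number of reduces doubles the number of
segments and halves their average size, which would fit the final merge taking
more than twice as long while the map itself is unchanged.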
> changing the number of reduces dramatically changes the time of the map time
> ----------------------------------------------------------------------------
>
> Key: HADOOP-2771
> URL: https://issues.apache.org/jira/browse/HADOOP-2771
> Project: Hadoop Core
> Issue Type: Bug
> Components: mapred
> Affects Versions: 0.17.1
> Reporter: Owen O'Malley
>
> Changing the number of reduces radically changes the time for an individual
> map. Running the same program and data with different numbers of reduces
> (2500, 7500, 25000) gave map times of 0:50, 1:20, and 5h respectively.