[jira] Commented: (HADOOP-2771) changing the number of reduces dramatically changes the time of the map time

Christian Kunz (JIRA) Sat, 22 Nov 2008 20:04:08 -0800

    [ 
https://issues.apache.org/jira/browse/HADOOP-2771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12649992#action_12649992
 ]


Christian Kunz commented on HADOOP-2771:
----------------------------------------

Checked a similar job in hadoop-0.18.1 with block compression of transient 
data turned on.

Merge-sort of map spills still depends strongly on the number of reduces, but 
less than in earlier releases.

On average a map task took:
1hr 13min with 9000 reduces
1hr 19min with 18000 reduces
Of this time, 51 minutes on average were taken up by the application, i.e. 
the merge-sort of the map spills increased from 22min to 28min when doubling 
the number of reduces.
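
For anyone re-deriving the 22min and 28min figures, a minimal sketch of the 
arithmetic follows. It assumes, as the numbers above imply, that a map task's 
time is just application time plus merge-sort of the spills, with no other 
overhead counted. The names APP_MIN, runs and merge_sort_minutes are 
hypothetical and are not part of Hadoop.

```python
# Figures taken from the measurements above, in minutes.
APP_MIN = 51  # average time spent in application code per map task


def merge_sort_minutes(total_min, app_min=APP_MIN):
    """Assumed model: map task time = application time + merge-sort time."""
    return total_min - app_min


# number of reduces -> average map task time in minutes
runs = {9000: 60 + 13, 18000: 60 + 19}

merge = {reduces: merge_sort_minutes(total) for reduces, total in runs.items()}
growth = merge[18000] / merge[9000] - 1  # relative increase for 2x reduces

for reduces, minutes in merge.items():
    print(f"{reduces} reduces: {minutes} min in merge-sort")
print(f"merge-sort growth when doubling reduces: {growth:.0%}")
```

Under that assumption, doubling the reduces raises merge-sort time by roughly 
27%, which is well short of the doubling a linear dependence would give.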

Overall, the time spent in merge-sort of the map spills increased because of 
compression (before 0.18, block compression of transient data could not be 
used at that scale), but the dependence on the number of reduces decreased.

> changing the number of reduces dramatically changes the time of the map time
> ----------------------------------------------------------------------------
>
>                 Key: HADOOP-2771
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2771
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.17.1
>            Reporter: Owen O'Malley
>             Fix For: 0.20.0
>
>
> By changing the number of reduces, the time for an individual map changes 
> radically. By running the same program and data with different numbers of 
> reduces (2500, 7500, 25000) the times for each map changed radically (0:50, 
> 1:20, 5h).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
