[
https://issues.apache.org/jira/browse/MAPREDUCE-7081?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Zhizhen Hou resolved MAPREDUCE-7081.
------------------------------------
Resolution: Invalid
> Default speculator won't speculate the last several submitted reduced task if
> the total task num is large
> ---------------------------------------------------------------------------------------------------------
>
> Key: MAPREDUCE-7081
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7081
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Components: mrv2
> Affects Versions: 2.9.0, 2.7.5
> Reporter: Zhizhen Hou
> Priority: Major
>
> DefaultSpeculator speculates a task at most once. By default, the number of
> tasks that may be speculated is
> max(max(10, 0.01 * tasks.size), 0.1 * running tasks).
> I set mapreduce.job.reduce.slowstart.completedmaps = 1 so that reduces start
> only after all the map tasks have finished. The cluster has 1000 vcores, and
> the job has 5000 reduce tasks. At first, 1000 reduce tasks can run
> simultaneously, so at most 0.1 * 1000 = 100 tasks can be speculated. Reduce
> tasks with little data finish quickly, and by default the speculator
> speculates one task per second. A task may be chosen for speculation merely
> because it has more data to process. The speculator can therefore speculate
> 100 tasks within 100 seconds. Once 4900 reduces have finished, suppose a
> reduce with a lot of data to process has been placed on a slow machine. The
> speculation opportunities have run out, so it will not be speculated. This
> can increase the execution time of the job significantly.
> In short, speculation opportunities may be wasted early on, only because the
> execution time of reduces with less data to process is taken as the average
> time. At the end of the job no speculation opportunities are left, especially
> for the last several running tasks, because the limit is judged by the number
> of running tasks.
> In my opinion, the number of running tasks should not determine the number of
> speculation opportunities. The number of tasks that may be speculated could
> instead be judged by the square of the fraction of finished tasks. For
> example, if ninety percent of the tasks are finished, only 0.9 * 0.9 = 0.81
> of the speculation opportunities can be used. This leaves enough
> opportunities for later tasks.
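
The arithmetic in the description can be checked with a small sketch. This is
not Hadoop code: the class and method names are hypothetical, the constants
are the ones quoted in the description, and the fixed "budget" passed to
proposedCap is an assumption, since the description does not say what total
the squared fraction is applied to.

```java
// Illustrative sketch only; names are hypothetical and not part of Hadoop.
final class SpeculationBudget {
    // Constants as quoted in the issue description.
    static final int MIN_ALLOWED = 10;
    static final double TOTAL_FRACTION = 0.01;
    static final double RUNNING_FRACTION = 0.1;

    private SpeculationBudget() {}

    /** Cap as described: max(max(10, 0.01 * total), 0.1 * running). */
    static int defaultCap(int totalTasks, int runningTasks) {
        int byTotal = (int) Math.max(MIN_ALLOWED, TOTAL_FRACTION * totalTasks);
        return (int) Math.max(byTotal, RUNNING_FRACTION * runningTasks);
    }

    /**
     * Proposed rule: release an assumed fixed budget in proportion to the
     * square of the fraction of finished tasks. Integer arithmetic avoids
     * floating-point rounding; the result is rounded down.
     */
    static int proposedCap(int budget, int finishedTasks, int totalTasks) {
        long numerator = (long) budget * finishedTasks * finishedTasks;
        long denominator = (long) totalTasks * totalTasks;
        return (int) (numerator / denominator);
    }
}
```

With the numbers from the description, defaultCap(5000, 1000) gives 100 while
1000 reduces are running, and defaultCap(5000, 100) gives 50 once only 100 are
left, which is already below the 100 speculations spent early on. Under the
assumed budget of 100, proposedCap(100, 4500, 5000) gives 81, so 19
opportunities would still be held back for the last tenth of the tasks.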