[ 
https://issues.apache.org/jira/browse/MAPREDUCE-7080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe resolved MAPREDUCE-7080.
-----------------------------------
    Resolution: Duplicate

Closing as a duplicate of MAPREDUCE-7081.

> Default speculator won't speculate the last several submitted reduce tasks if 
> the total number of tasks is large
> ---------------------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-7080
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7080
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: mrv2
>    Affects Versions: 2.7.5
>            Reporter: Zhizhen Hou
>            Priority: Major
>
> DefaultSpeculator speculates each task at most once. 
> By default, the number of tasks that may be speculated is 
> max(max(10, 0.01 * tasks.size), 0.1 * running tasks).
> I set mapreduce.job.reduce.slowstart.completedmaps = 1 so that reduces start 
> only after all the map tasks have finished.
> The cluster has 1000 vcores, and the job has 5000 reduce tasks.
> At first, 1000 reduce tasks can run simultaneously, so the speculator can 
> speculate at most 0.1 * 1000 = 100 tasks. Reduce tasks with less data finish 
> quickly, and by default the speculator speculates one task per second. A 
> task may be chosen for speculation simply because it has more data to 
> process. The speculator will therefore speculate 100 tasks within 100 seconds.
> When 4900 reduces have finished, suppose a reduce that has a lot of data to 
> process is placed on a slow machine. The speculation opportunities have 
> already run out, so it will not be speculated. This can increase the 
> execution time of the job significantly.
> In short, the speculation opportunities may be wasted early on, only because 
> the execution time of reduces with less data is taken as the average time. 
> At the end of the job no opportunity remains, especially for the last 
> several running tasks, because the limit is judged by the number of running 
> tasks.
>  
> In my opinion, the number of tasks that may be speculated could be scaled by 
> the square of the fraction of finished tasks. For example, if ninety percent 
> of the tasks have finished, only 0.9 * 0.9 = 0.81 of the speculation 
> opportunities can be used. This leaves enough opportunity for later tasks.
>  
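
The arithmetic in the description can be checked with a small sketch. It is not
the DefaultSpeculator source: the class and method names below are hypothetical,
the constants 10, 0.01 and 0.1 are the defaults quoted in the description, and
treating the proposal as "a fixed budget scaled by the squared finished
fraction" is one possible reading of the suggestion.

```java
// Hypothetical sketch of the two limits discussed in the issue description.
// Not taken from the Hadoop code base.
final class SpeculationCapSketch {

    // Defaults as quoted in the description.
    static final int MIN_ALLOWED = 10;
    static final double TOTAL_FRACTION = 0.01;
    static final double RUNNING_FRACTION = 0.1;

    // max(max(10, 0.01 * total tasks), 0.1 * running tasks), truncated to int.
    static int defaultCap(int totalTasks, int runningTasks) {
        double byTotal = Math.max(MIN_ALLOWED, TOTAL_FRACTION * totalTasks);
        return (int) Math.max(byTotal, RUNNING_FRACTION * runningTasks);
    }

    // Assumed reading of the proposal: only budget * (finished / total)^2
    // speculations may have been used so far. Integer arithmetic avoids
    // floating-point rounding; the result is rounded down.
    static int squaredCap(int budget, int finishedTasks, int totalTasks) {
        long f = finishedTasks;
        long t = totalTasks;
        return (int) (budget * f * f / (t * t));
    }

    public static void main(String[] args) {
        // 5000 reduces, 1000 running at the start of the reduce phase.
        System.out.println(defaultCap(5000, 1000));   // prints 100
        // Near the end only a few tasks run, so the cap falls back to 50,
        // which is below the 100 speculations already used.
        System.out.println(defaultCap(5000, 1));      // prints 50
        // Proposal: with 90% of the tasks finished, 81 of 100 may be used.
        System.out.println(squaredCap(100, 4500, 5000)); // prints 81
    }
}
```

Under these assumptions, a job that has already used 100 speculations cannot
speculate a straggler at the end, while the squared rule would still hold back
19 of the 100 until the last ten percent of the tasks.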



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

