[ https://issues.apache.org/jira/browse/MAPREDUCE-7080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jason Lowe resolved MAPREDUCE-7080.
-----------------------------------
    Resolution: Duplicate

Closing as a duplicate of MAPREDUCE-7081.

> Default speculator won't speculate the last several submitted reduce tasks if the total task number is large
> -------------------------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-7080
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7080
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: mrv2
>    Affects Versions: 2.7.5
>            Reporter: Zhizhen Hou
>            Priority: Major
>
> DefaultSpeculator speculates a task only once.
> By default, the number of tasks that may be speculated is capped at max(max(10, 0.01 * tasks.size), 0.1 * running tasks).
> I set mapreduce.job.reduce.slowstart.completedmaps = 1 so that the reduces start only after all the map tasks have finished.
> The cluster has 1000 vcores, and the job has 5000 reduce tasks.
> At first, 1000 reduce tasks can run simultaneously, so the speculator can speculate at most 0.1 * 1000 = 100 tasks. Reduce tasks with less data finish quickly, and by default the speculator speculates one task per second. A task may be chosen for speculation simply because it has more data to process. The speculator will therefore speculate 100 tasks within 100 seconds.
> When 4900 reduces have finished, suppose a reduce with a lot of data to process is placed on a slow machine. The speculation opportunities have already been used up, so this task will not be speculated. That can increase the execution time of the job significantly.
> In short, the speculation opportunities may be wasted early in the job, only because the execution time of the reduces with less data is taken as the average time. At the end of the job no speculation opportunity is left, especially for the last several running tasks, since the limit is judged by the number of running tasks.
>
> In my opinion, the number of tasks that may be speculated should be scaled by the square of the fraction of finished tasks.
> For example, if ninety percent of the tasks are finished, only 0.9 * 0.9 = 0.81 of the speculation opportunities can be used. This leaves enough opportunities for later tasks.
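
As a rough illustration of the arithmetic in the report, the sketch below compares the default cap quoted there with one possible reading of the proposed square rule. The function names, the parameter names, and the choice to multiply the existing cap by the squared finished fraction are assumptions made for this example; they are not taken from DefaultSpeculator or from any Hadoop patch.

```python
def current_cap(total_tasks: int, running_tasks: int) -> int:
    """Cap as quoted in the report: max(max(10, 0.01 * total), 0.1 * running)."""
    return int(max(max(10, 0.01 * total_tasks), 0.1 * running_tasks))


def proposed_cap(total_tasks: int, running_tasks: int, finished_tasks: int) -> int:
    """Hypothetical reading of the proposal: scale the cap by (finished fraction)^2."""
    if total_tasks <= 0:
        return 0  # no tasks, nothing to speculate
    finished_fraction = finished_tasks / total_tasks
    return int(current_cap(total_tasks, running_tasks) * finished_fraction ** 2)


if __name__ == "__main__":
    total = 5000
    # (finished, running) snapshots for the 5000-reduce job on 1000 vcores.
    for finished, running in [(0, 1000), (2500, 1000), (4500, 500), (4900, 100)]:
        print(finished, running,
              current_cap(total, running),
              proposed_cap(total, running, finished))
```

With the report's numbers, current_cap(5000, 1000) is 100 at the start of the reduce phase, and current_cap(5000, 100) is 50 once 4900 reduces have finished, which is below the 100 speculations the report describes as already spent. Under the assumed square rule almost nothing may be speculated early on, so more of the allowance would remain for the last running tasks.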