[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12909059#action_12909059
 ] 

Joydeep Sen Sarma commented on MAPREDUCE-2062:
----------------------------------------------

Another thing we have noticed is that the progress rate (especially the 
reducer's) is usually quite low compared to the mean when a task first starts, 
which causes many false speculations. However, the absolute progress rate of 
the speculated tasks is not bad at all: most of them had a progress rate that 
would have taken them to 100% within 3-4 minutes. 

One heuristic that seemed obvious after looking at this is to have an upper 
bound on the progress rate: above that rate, speculation does not make sense 
(regardless of mean/stddev). The proposal is to make this configurable as a 
'minimum_duration' setting on mappers/reducers. If the mapper/reducer is 
projected to finish within this duration, no speculation will be done. Setting 
the duration to a small number like 3-4 minutes would weed out a lot of 
excessive speculation.
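
A minimal sketch of the check described above, not code from the Hadoop tree. 
The class and method names are hypothetical, and it assumes "projected to 
finish within this duration" means the time the task would need to go from 0% 
to 100% at its current progress rate (that is, 1 / rate). Another reading, 
projecting only the remaining time, would differ slightly.

```java
// Hypothetical sketch; none of these names come from the Hadoop source.
final class MinDurationGuard {

    /**
     * Time in ms the task would need to go from 0% to 100% at its current
     * average rate. progress is a fraction in [0, 1].
     */
    static double projectedFullDurationMillis(double progress, long elapsedMillis) {
        if (progress <= 0.0 || elapsedMillis <= 0) {
            // No progress reported yet, so no rate can be estimated.
            return Double.POSITIVE_INFINITY;
        }
        // rate = progress / elapsed, so 1 / rate = elapsed / progress.
        return elapsedMillis / progress;
    }

    /**
     * True when the progress rate is at or above the bound implied by
     * minDurationMillis, in which case speculation would be skipped
     * regardless of mean/stddev.
     */
    static boolean skipSpeculation(double progress, long elapsedMillis,
                                   long minDurationMillis) {
        return projectedFullDurationMillis(progress, elapsedMillis) <= minDurationMillis;
    }

    public static void main(String[] args) {
        long threeMinutes = 3 * 60 * 1000L;
        // A task at 50% after one minute versus a task at 10% after one minute.
        System.out.println(skipSpeculation(0.5, 60_000L, threeMinutes));
        System.out.println(skipSpeculation(0.1, 60_000L, threeMinutes));
    }
}
```

In this sketch a task with no reported progress is never exempted, so the 
existing mean/stddev test would still decide whether it gets speculated.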

> speculative execution is too aggressive under certain conditions
> ----------------------------------------------------------------
>
>                 Key: MAPREDUCE-2062
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2062
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: jobtracker
>         Environment: hadoop-20 with HADOOP-2141
>            Reporter: Joydeep Sen Sarma
>
> The function canBeSpeculated has subtle bugs that cause too much speculation 
> in certain cases:
> - It compares the current progress of the task with the last observed mean of 
> all the tasks. If only one task is in question, then the progress rate 
> decays as time progresses (in the absence of updates) and the std-dev is 
> zero. So a job with a single reducer or mapper is almost always speculated.
> - If only a single task has reported progress, then the stddev is zero, so 
> other tasks may be speculated aggressively.
> - Several tasks take a while to report progress initially. They seem to get 
> speculated as soon as speculative-lag is over. The lag should be configurable 
> at a minimum.
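
The single-task case in the description can be illustrated with a small 
sketch. It assumes the slow-task test has the form 
"mean - rate > threshold * stddev"; that form, and every name below, is an 
assumption for illustration rather than the actual canBeSpeculated code.

```java
// Hypothetical illustration; the comparison form is an assumption.
final class SingleTaskSpeculationDemo {

    /** Progress rate as progress (fraction in [0, 1]) per millisecond of run time. */
    static double rate(double progress, long elapsedMillis) {
        return progress / elapsedMillis;
    }

    /** Assumed slow-task test: rate is more than threshold std-devs below the mean. */
    static boolean looksSlow(double currentRate, double mean, double stdDev,
                             double threshold) {
        return mean - currentRate > stdDev * threshold;
    }

    public static void main(String[] args) {
        double progress = 0.4;
        // With one task, the last observed mean is that task's own rate
        // at the time of its last progress update.
        double meanAtLastUpdate = rate(progress, 60_000L);
        double stdDev = 0.0; // a single sample has zero spread
        // No new update arrives: same progress, more elapsed time, lower rate.
        double rateNow = rate(progress, 90_000L);
        System.out.println(looksSlow(rateNow, meanAtLastUpdate, stdDev, 1.0));
    }
}
```

Under these assumptions any delay between progress updates pushes the current 
rate below the stale mean, and with a zero std-dev there is no margin to 
absorb the gap, whatever the threshold.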

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
