[ http://issues.apache.org/jira/browse/HADOOP-152?page=comments#action_12383306 ]

Bryan Pendleton commented on HADOOP-152:
----------------------------------------

*bump*

Is anyone else seeing this problem? My cluster is pretty unevenly loaded, and 
because speculative tasks never get started, I wait a very long time for tasks 
to time out on short jobs. Speculative execution is enabled, so there's no 
reason that, say, two maps out of ~1900 should be holding up execution. I 
suspect the "progress" accounting in the Job isn't being done correctly.

But even then, perhaps we need more metrics. With the current metrics, a job 
unit that runs really slowly on one node, but might run faster elsewhere, may 
never be executed on another node: the progress reported by the slow node can 
be close enough to done that it never trips speculative execution.
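
To make the concern concrete, here is a minimal sketch of the check as the 
issue description quoted below states it ("average progress" minus "progress" 
greater than the speculative gap of 0.2). The class and method names are 
hypothetical and are not taken from the Hadoop source; this only illustrates 
the arithmetic, not the actual scheduler code.

```java
public class SpeculativeGapSketch {
    // Gap value quoted in the issue description.
    static final double SPECULATIVE_GAP = 0.2;

    // Hypothetical helper: true when a task lags the average progress
    // by more than the speculative gap.
    static boolean shouldSpeculate(double averageProgress, double taskProgress) {
        return averageProgress - taskProgress > SPECULATIVE_GAP;
    }

    public static void main(String[] args) {
        // A task far behind the rest of the job trips the check.
        System.out.println(shouldSpeculate(0.95, 0.50));

        // A task reporting 0.85 on a slow node does not trip it, even when
        // nearly every other task has finished.
        System.out.println(shouldSpeculate(0.999, 0.85));
    }
}
```

Under this rule, average progress can never exceed 1.0, so a task that reports 
progress of 0.8 or more cannot trip the check, however slowly it is actually 
running on its node.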

> Speculative tasks not being scheduled
> -------------------------------------
>
>          Key: HADOOP-152
>          URL: http://issues.apache.org/jira/browse/HADOOP-152
>      Project: Hadoop
>         Type: Bug

>   Components: mapred
>     Versions: 0.2
>  Environment: ~30 node Opteron cluster
>     Reporter: Bryan Pendleton
>     Priority: Minor

>
> The criteria for starting up a speculative task include a check that 
> "average progress" - "progress" > the speculative gap, currently 0.2.
> I don't know if this is the right metric, but it doesn't seem to be correctly 
> calculated. I've regularly seen the "average progress" with values of less 
> than 0.01, while the "progress" value is showing in the range .90-1.0, and 
> yet, still no speculative tasks are started up. This has caused at least one 
> long-running task to run about 10% longer while overloaded hosts catch up.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira
