Hi Gregory,

From what I recall, there was a discussion about enabling speculative 
execution for reducers only, since having it on the map side raised 
concerns.
There may now be two config params if you want to control each side 
separately:
mapred.reduce.tasks.speculative.execution
mapred.map.tasks.speculative.execution
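
As a rough sketch of how those two properties could be passed to a streaming job, the Python helper below builds the -D generic options for them. The function names are made up for illustration, and whether your Hadoop version accepts -D for streaming (older releases used -jobconf) is an assumption you should check.

```python
# Property names as given above; everything else here is illustrative.
MAP_SPECULATIVE = "mapred.map.tasks.speculative.execution"
REDUCE_SPECULATIVE = "mapred.reduce.tasks.speculative.execution"


def speculative_flags(map_enabled, reduce_enabled):
    """Return -D generic options that set both speculative execution properties."""
    flags = []
    for name, enabled in ((MAP_SPECULATIVE, map_enabled),
                          (REDUCE_SPECULATIVE, reduce_enabled)):
        # Hadoop boolean properties are written as lowercase true/false.
        flags += ["-D", "%s=%s" % (name, "true" if enabled else "false")]
    return flags


def streaming_command(jar, flags, input_path, output_path, mapper):
    """Assemble an argument list; generic options go before streaming options."""
    return (["hadoop", "jar", jar] + flags +
            ["-input", input_path, "-output", output_path, "-mapper", mapper])
```

For example, speculative_flags(False, False) gives the flags for a run with speculative execution off on both sides. The jar path, HDFS paths and mapper script passed to streaming_command would be your own.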

On the reporting side, when you say it's inaccurate, by what margin? For 
example, how many map tasks are still in the running state when 100% is 
reported?
I think there was a marginal rounding error in the progress reporting at some 
point. Can you verify whether it's a side effect of speculative execution by 
running the job with speculative execution turned off?
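
To help measure that margin, one option is to have the streaming mapper report its own per-line status. Below is a minimal sketch in Python, assuming the streaming convention of writing reporter:status: and reporter:counter: lines to stderr. The counter group and name ("Mapper", "LinesDone") and the expensive_work function are hypothetical placeholders. This updates the task's status string and counters; I would not assume it changes the percent-complete figure the framework computes.

```python
import sys


def expensive_work(line):
    # Hypothetical placeholder for the real per-line computation.
    return line.upper()


def run_mapper(lines, out=sys.stdout, err=sys.stderr):
    """Process lines one at a time, reporting status and a counter after each."""
    done = 0
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank input lines
        out.write(expensive_work(line) + "\n")
        done += 1
        # Streaming reads these specially formatted stderr lines.
        err.write("reporter:counter:Mapper,LinesDone,1\n")
        err.write("reporter:status:processed %d lines\n" % done)
        err.flush()
    return done
```

To use this as the actual mapper script, call run_mapper(sys.stdin) at the bottom of the file. The counter then shows how many lines each task has really finished, which you can compare against the reported progress.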

Cheers,
/R

On 5/28/10 2:07 AM, "Gregory Lawrence" <[email protected]> wrote:

Hi,

Does anybody know whether or not speculative execution works with Hadoop 
streaming?

If so, I have a script that does not appear to ever launch redundant mappers 
for the slow performers. This may be due to the fact that each mapper quickly 
reports (inaccurately) that it is 100% complete. I am using the 
NLineInputFormat and each mapper gets 17 lines of input. Each line requires a 
lot of computation. It appears that all 17 lines immediately get counted as 
being processed early on. Is there any way to report or force accurate 
completion stats? Could this explain why speculative execution never gets 
triggered?

Thanks,
Greg Lawrence
