You can turn off speculative execution, but your map tasks should be idempotent. If they are not, rethink the design. Speculative execution is a good thing (and so is preemption, its evil twin).
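To make the idempotence point concrete, here is a minimal sketch in plain Python (not Hadoop or Pig code). The function name `run_attempt`, the `part-<task_id>` file naming, and the upper-casing "work" are all made up for illustration. The idea is that each attempt writes to its own private temp file and then commits with an atomic rename, so a duplicate attempt of the same task leaves the same final result instead of doubling it.

```python
import os
import tempfile


def run_attempt(task_id, attempt_id, records, out_dir):
    """Hypothetical task attempt: write privately, then commit atomically."""
    final_path = os.path.join(out_dir, f"part-{task_id}")
    # Each attempt gets its own temp file, so concurrent attempts of the
    # same task never write into each other's output.
    fd, tmp_path = tempfile.mkstemp(
        prefix=f"_attempt_{task_id}_{attempt_id}_", dir=out_dir
    )
    with os.fdopen(fd, "w") as tmp:
        for record in records:
            tmp.write(record.upper() + "\n")  # stand-in for the real work
    # os.replace is atomic within one filesystem: a later attempt simply
    # overwrites the committed file with identical content.
    os.replace(tmp_path, final_path)
    return final_path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as out_dir:
        # Two attempts of the same task, as speculative execution would run.
        run_attempt("000001", 0, ["a", "b"], out_dir)
        path = run_attempt("000001", 1, ["a", "b"], out_dir)
        with open(path) as f:
            print(f.read(), end="")
        print(os.listdir(out_dir))
```

Hadoop applies a similar commit step to a task's regular output, promoting only the winning attempt. Side effects made outside that path (for example by a script invoked through DEFINE) are not covered, which is where duplicates bite. If you disable speculation instead, the property names in this Hadoop generation are, as far as I recall, mapred.map.tasks.speculative.execution and mapred.reduce.tasks.speculative.execution (set both to false).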
-D

On Tue, Feb 9, 2010 at 6:52 PM, prasenjit mukherjee <[email protected]> wrote:
> Any thoughts on this problem ? I am using a DEFINE command ( in PIG )
> and hence the actions are not idempotent. Because of which duplicate
> execution does have an affect on my results. Any way to overcome that
> ?
>
> On Tue, Feb 9, 2010 at 9:26 PM, prasenjit mukherjee
> <[email protected]> wrote:
>> But the second attempted job got killed even before the first one was
>> completed. How can we explain that.
>>
>> On Tue, Feb 9, 2010 at 7:38 PM, Eric Sammer <[email protected]> wrote:
>>> Prasen:
>>>
>>> This is most likely speculative execution. Hadoop fires up multiple
>>> attempts for the same task and lets them "race" to see which finishes
>>> first and then kills the others. This is meant to speed things along.
>>>
>>> Speculative execution is on by default, but can be disabled. See the
>>> configuration reference for mapred-*.xml.
>>>
>>> On 2/9/10 9:03 AM, prasenjit mukherjee wrote:
>>>> Sometimes for the same task I see that a duplicate task gets run on a
>>>> different machine and gets killed later. Not always but sometimes. Any
>>>> reason why duplicate tasks get run. I thought tasks are duplicated
>>>> only if either the first attempt exits( exceptions etc ) or exceeds
>>>> mapred.task.timeout. In this case none of them happens. As can be seen
>>>> from timestamp, the second attempt starts even though the first
>>>> attempt is still running ( only for 1 minute ).
>>>>
>>>> Any explanation ?
>>>>
>>>> attempt_201002090552_0009_m_000001_0
>>>> /default-rack/ip-10-242-142-193.ec2.internal
>>>> SUCCEEDED
>>>> 100.00%
>>>> 9-Feb-2010 07:04:37
>>>> 9-Feb-2010 07:07:00 (2mins, 23sec)
>>>>
>>>> attempt_201002090552_0009_m_000001_1
>>>> Task attempt: /default-rack/ip-10-212-147-129.ec2.internal
>>>> Cleanup Attempt: /default-rack/ip-10-212-147-129.ec2.internal
>>>> KILLED
>>>> 100.00%
>>>> 9-Feb-2010 07:05:34
>>>> 9-Feb-2010 07:07:10 (1mins, 36sec)
>>>>
>>>> -Prasen
>>>>
>>>
>>>
>>> --
>>> Eric Sammer
>>> [email protected]
>>> http://esammer.blogspot.com
>>>
>>
>
