Can you confirm that duplication is happening in the case where one attempt
gets underway but is killed before the other completes?
I believe that by default (though I'm not sure about Pig), each attempt's
output is first isolated to a path keyed to its attempt ID, and is committed
only when one and only one attempt has completed.
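
To make that concrete, here is a minimal local simulation of the idea, not
Hadoop's actual code. The function names, the "_temporary" directory name,
and the "part-00001" file name are assumptions for illustration; only the
attempt IDs come from the report below.

```python
import os
import shutil
import tempfile


def run_attempt(job_dir, attempt_id, records):
    """Write one attempt's output under a private path keyed by its attempt ID."""
    attempt_dir = os.path.join(job_dir, "_temporary", attempt_id)
    os.makedirs(attempt_dir)
    with open(os.path.join(attempt_dir, "part-00001"), "w") as f:
        for record in records:
            f.write(record + "\n")
    return attempt_dir


def commit_attempt(job_dir, attempt_dir):
    """Promote an attempt's file into the job output; refuse a second commit."""
    target = os.path.join(job_dir, "part-00001")
    if os.path.exists(target):
        return False  # another attempt has already been committed
    os.rename(os.path.join(attempt_dir, "part-00001"), target)
    return True


def abort_attempt(attempt_dir):
    """Discard everything a killed attempt wrote to its private path."""
    shutil.rmtree(attempt_dir, ignore_errors=True)


def demo():
    """Run two attempts of the same task, commit the first, kill the second."""
    job_dir = tempfile.mkdtemp()
    first = run_attempt(job_dir, "attempt_201002090552_0009_m_000001_0", ["a", "b"])
    second = run_attempt(job_dir, "attempt_201002090552_0009_m_000001_1", ["a", "b"])
    first_committed = commit_attempt(job_dir, first)
    second_committed = commit_attempt(job_dir, second)
    abort_attempt(second)
    with open(os.path.join(job_dir, "part-00001")) as f:
        lines = f.read().splitlines()
    shutil.rmtree(job_dir)
    return first_committed, second_committed, lines
```

In this sketch the final output holds each record once, even though both
attempts wrote it. Note that this kind of isolation only covers files written
under the attempt's own path; side effects that an external command performs
elsewhere would not be rolled back when an attempt is killed.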

On Tue, Feb 9, 2010 at 9:52 PM, prasenjit mukherjee <
[email protected]> wrote:

> Any thoughts on this problem? I am using a DEFINE command (in Pig),
> and hence the actions are not idempotent. Because of that, duplicate
> execution does have an effect on my results. Is there any way to
> overcome that?
>
> On Tue, Feb 9, 2010 at 9:26 PM, prasenjit mukherjee
> <[email protected]> wrote:
> > But the second attempt got killed even before the first one had
> > completed. How can we explain that?
> >
> > On Tue, Feb 9, 2010 at 7:38 PM, Eric Sammer <[email protected]> wrote:
> >> Prasen:
> >>
> >> This is most likely speculative execution. Hadoop fires up multiple
> >> attempts for the same task and lets them "race" to see which finishes
> >> first and then kills the others. This is meant to speed things along.
> >>
> >> Speculative execution is on by default, but can be disabled. See the
> >> configuration reference for mapred-*.xml.
> >>
> >> On 2/9/10 9:03 AM, prasenjit mukherjee wrote:
> >>> Sometimes, for the same task, I see that a duplicate attempt gets run
> >>> on a different machine and gets killed later. Not always, but sometimes.
> >>> Is there any reason why duplicate tasks get run? I thought tasks were
> >>> duplicated only if the first attempt either exits (exceptions, etc.) or
> >>> exceeds mapred.task.timeout. In this case neither happens. As can be seen
> >>> from the timestamps, the second attempt starts even though the first
> >>> attempt is still running (and has been for only 1 minute).
> >>>
> >>> Any explanation?
> >>>
> >>> attempt_201002090552_0009_m_000001_0
> >>>     /default-rack/ip-10-242-142-193.ec2.internal
> >>>     SUCCEEDED
> >>>     100.00%
> >>>     9-Feb-2010 07:04:37
> >>>     9-Feb-2010 07:07:00 (2mins, 23sec)
> >>>
> >>> attempt_201002090552_0009_m_000001_1
> >>>     Task attempt: /default-rack/ip-10-212-147-129.ec2.internal
> >>>     Cleanup Attempt: /default-rack/ip-10-212-147-129.ec2.internal
> >>>     KILLED
> >>>     100.00%
> >>>     9-Feb-2010 07:05:34
> >>>     9-Feb-2010 07:07:10 (1mins, 36sec)
> >>>
> >>>  -Prasen
> >>>
> >>
> >>
> >> --
> >> Eric Sammer
> >> [email protected]
> >> http://esammer.blogspot.com
> >>
> >
>
