Re: Understanding accumulator during transformations

2015-06-24 Thread Wei Zhou
Hi Burak,

That makes sense; it boils down to whichever action happens after the
transformations, then. Thanks for your answers.

Best,
Wei

2015-06-24 15:06 GMT-07:00 Burak Yavuz:

> Hi Wei,
>
> During the action, all the transformations before it will occur in order
> leading up to the action. If you have an accumulator in any of these
> transformations, then you won't get exactly-once semantics, because the
> transformation may be restarted elsewhere.
>
> Best,
> Burak
>
> On Wed, Jun 24, 2015 at 2:25 PM, Wei Zhou wrote:
>
>> Hi Burak,
>>
>> Thanks for your quick reply. I guess what confuses me is that, because of
>> laziness, the accumulator won't be updated until an action is called, so a
>> transformation such as a map won't update the accumulator on its own. How,
>> then, would restarting the transformation end up updating the accumulator
>> more than once?
>>
>> Best,
>> Wei
>>
>> 2015-06-24 13:23 GMT-07:00 Burak Yavuz:
>>
>>> Hi Wei,
>>>
>>> For example, when a straggler executor gets killed in the middle of a
>>> map operation and its task is restarted at a different instance, the
>>> accumulator will be updated more than once.
>>>
>>> Best,
>>> Burak
>>>
>>> On Wed, Jun 24, 2015 at 1:08 PM, Wei Zhou wrote:
>>>
>>>> Quoting from the Spark Programming Guide:
>>>>
>>>> "For accumulator updates performed inside *actions only*, Spark
>>>> guarantees that each task’s update to the accumulator will only be applied
>>>> once, i.e. restarted tasks will not update the value. In transformations,
>>>> users should be aware of that each task’s update may be applied more than
>>>> once if tasks or job stages are re-executed."
>>>>
>>>> Can anyone give me a possible scenario in which an accumulator might be
>>>> updated more than once during a transformation? Thanks.
>>>>
>>>> Regards,
>>>> Wei
>>>>
>>>
>>>
>>
>


Re: Understanding accumulator during transformations

2015-06-24 Thread Burak Yavuz
Hi Wei,

During the action, all the transformations before it will occur in order
leading up to the action. If you have an accumulator in any of these
transformations, then you won't get exactly-once semantics, because the
transformation may be restarted elsewhere.

Best,
Burak

On Wed, Jun 24, 2015 at 2:25 PM, Wei Zhou wrote:

> Hi Burak,
>
> Thanks for your quick reply. I guess what confuses me is that, because of
> laziness, the accumulator won't be updated until an action is called, so a
> transformation such as a map won't update the accumulator on its own. How,
> then, would restarting the transformation end up updating the accumulator
> more than once?
>
> Best,
> Wei
>
> 2015-06-24 13:23 GMT-07:00 Burak Yavuz:
>
>> Hi Wei,
>>
>> For example, when a straggler executor gets killed in the middle of a map
>> operation and its task is restarted at a different instance, the
>> accumulator will be updated more than once.
>>
>> Best,
>> Burak
>>
>> On Wed, Jun 24, 2015 at 1:08 PM, Wei Zhou wrote:
>>
>>> Quoting from the Spark Programming Guide:
>>>
>>> "For accumulator updates performed inside *actions only*, Spark
>>> guarantees that each task’s update to the accumulator will only be applied
>>> once, i.e. restarted tasks will not update the value. In transformations,
>>> users should be aware of that each task’s update may be applied more than
>>> once if tasks or job stages are re-executed."
>>>
>>> Can anyone give me a possible scenario in which an accumulator might be
>>> updated more than once during a transformation? Thanks.
>>>
>>> Regards,
>>> Wei
>>>
>>
>>
>
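
The point about laziness can be sketched without Spark at all. The snippet
below is only a plain-Python analogy, not Spark code: `counter` stands in for
an accumulator, `tag` for the function passed to a map, and Python's lazy
`map()` for a transformation. All of these names are made up for
illustration. Nothing is counted until something consumes the iterator (the
"action"), and evaluating the same lineage a second time counts every record
again.

```python
# Plain-Python analogy; none of these names are Spark APIs.
counter = {"updates": 0}          # stand-in for an accumulator
data = [10, 20, 30]


def tag(record):
    """Side-effecting map function, like acc += 1 inside rdd.map(...)."""
    counter["updates"] += 1
    return record + 1


def build_lineage():
    # map() is lazy: this only describes the work, it does not run tag().
    return map(tag, data)


pending = build_lineage()
updates_before_action = counter["updates"]   # nothing has run yet

first_result = sum(pending)                  # the "action" forces evaluation
updates_after_first = counter["updates"]

# Evaluating the same lineage again (think: a re-executed task, or a second
# action on an uncached dataset) applies the side effect a second time.
second_result = sum(build_lineage())
updates_after_second = counter["updates"]

print(updates_before_action, updates_after_first, updates_after_second)
```

The results of the two evaluations are identical; only the side-effect count
differs, which is why the guarantee is weaker for updates made inside
transformations.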


Re: Understanding accumulator during transformations

2015-06-24 Thread Wei Zhou
Hi Burak,

Thanks for your quick reply. I guess what confuses me is that, because of
laziness, the accumulator won't be updated until an action is called, so a
transformation such as a map won't update the accumulator on its own. How,
then, would restarting the transformation end up updating the accumulator
more than once?

Best,
Wei

2015-06-24 13:23 GMT-07:00 Burak Yavuz:

> Hi Wei,
>
> For example, when a straggler executor gets killed in the middle of a map
> operation and its task is restarted at a different instance, the
> accumulator will be updated more than once.
>
> Best,
> Burak
>
> On Wed, Jun 24, 2015 at 1:08 PM, Wei Zhou wrote:
>
>> Quoting from the Spark Programming Guide:
>>
>> "For accumulator updates performed inside *actions only*, Spark
>> guarantees that each task’s update to the accumulator will only be applied
>> once, i.e. restarted tasks will not update the value. In transformations,
>> users should be aware of that each task’s update may be applied more than
>> once if tasks or job stages are re-executed."
>>
>> Can anyone give me a possible scenario in which an accumulator might be
>> updated more than once during a transformation? Thanks.
>>
>> Regards,
>> Wei
>>
>
>


Re: Understanding accumulator during transformations

2015-06-24 Thread Burak Yavuz
Hi Wei,

For example, when a straggler executor gets killed in the middle of a map
operation and its task is restarted at a different instance, the
accumulator will be updated more than once.

Best,
Burak

On Wed, Jun 24, 2015 at 1:08 PM, Wei Zhou wrote:

> Quoting from the Spark Programming Guide:
>
> "For accumulator updates performed inside *actions only*, Spark
> guarantees that each task’s update to the accumulator will only be applied
> once, i.e. restarted tasks will not update the value. In transformations,
> users should be aware of that each task’s update may be applied more than
> once if tasks or job stages are re-executed."
>
> Can anyone give me a possible scenario in which an accumulator might be
> updated more than once during a transformation? Thanks.
>
> Regards,
> Wei
>
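
For readers who want to see the double counting concretely, here is a
minimal sketch of the re-execution scenario as a toy model in plain Python.
It is not Spark code and does not claim to mirror Spark's scheduler:
`ToyAccumulator`, `run_map_task`, `run_stage` and the `reexecuted` parameter
are hypothetical names invented for this illustration. The only assumption
it encodes is the one from the quoted guide text, namely that an update made
inside a transformation is merged again whenever the task runs again.

```python
# Toy model only: none of these names are Spark APIs.

class ToyAccumulator:
    """Driver-side counter that naively merges every update it receives."""

    def __init__(self):
        self.value = 0

    def merge(self, delta):
        self.value += delta


def run_map_task(partition):
    """One attempt of a map task: returns (output, local accumulator delta)."""
    local_count = 0
    output = []
    for record in partition:
        local_count += 1              # the "acc += 1" inside the map function
        output.append(record * 2)
    return output, local_count


def run_stage(partitions, acc, reexecuted=()):
    """Run every partition once, then re-run those listed in `reexecuted`.

    A re-run stands in for a restarted task or a retried stage; its update
    is merged again because nothing in this model deduplicates it.
    """
    results = {}
    for index, partition in enumerate(partitions):
        results[index], delta = run_map_task(partition)
        acc.merge(delta)
    for index in reexecuted:
        results[index], delta = run_map_task(partitions[index])
        acc.merge(delta)
    return [results[i] for i in range(len(partitions))]


partitions = [[1, 2, 3], [4, 5]]

clean = ToyAccumulator()
run_stage(partitions, clean)

retried = ToyAccumulator()
output = run_stage(partitions, retried, reexecuted=[0])

print(clean.value, retried.value)
```

The mapped output is the same in both runs; only the accumulator differs,
overcounting by the size of the partition that ran twice. That is the reason
accumulators updated in transformations are best treated as approximate
debugging counters.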