Hi Burak,

Thanks for your quick reply. I guess what confuses me is that, because of lazy evaluation, an accumulator won't be updated until an action is called, so a transformation such as a map won't update the accumulator on its own. How, then, would a restarted transformation end up updating the accumulator more than once?

Best,
Wei

2015-06-24 13:23 GMT-07:00 Burak Yavuz <brk...@gmail.com>:

> Hi Wei,
>
> For example, when a straggler executor gets killed in the middle of a map
> operation and its task is restarted at a different instance, the
> accumulator will be updated more than once.
>
> Best,
> Burak
>
> On Wed, Jun 24, 2015 at 1:08 PM, Wei Zhou <zhweisop...@gmail.com> wrote:
>
>> Quoting from Spark Program guide:
>>
>> "For accumulator updates performed inside *actions only*, Spark
>> guarantees that each task’s update to the accumulator will only be applied
>> once, i.e. restarted tasks will not update the value. In transformations,
>> users should be aware of that each task’s update may be applied more than
>> once if tasks or job stages are re-executed."
>>
>> Can anyone give me a possible scenario of when accumulator might be
>> updated more than once during transformation? Thanks.
>>
>> Regards,
>> Wei
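
For reference, the interaction between laziness and re-execution discussed in this thread can be sketched in plain Python. This is only a toy simulation under stated assumptions, not Spark code: `LazyMap`, `run_task`, `collect`, and the `lost_output_of` parameter are hypothetical names, and the "re-executed task" from the guide is modeled as a map task that completes, loses its output, and is run again. The map function runs only when the action runs, but it runs once per execution of the task, so the accumulator sees the same records twice.

```python
# Toy simulation only: no Spark APIs are used and all names are hypothetical.

class LazyMap:
    """Records a map function over partitions; nothing runs until an action is called."""
    def __init__(self, partitions, fn):
        self.partitions = partitions
        self.fn = fn


def run_task(lazy, pid, driver_acc):
    """Run the map over one partition and merge its accumulator updates on completion."""
    local = [0]  # per-task update buffer
    out = [lazy.fn(x, local) for x in lazy.partitions[pid]]
    driver_acc["value"] += local[0]  # merged every time the task completes
    return out


def count_and_double(x, local):
    """Map function with an accumulator side effect."""
    local[0] += 1
    return x * 2


def collect(lazy, driver_acc, lost_output_of=None):
    """The 'action': only here do the map tasks actually execute."""
    outputs = {pid: run_task(lazy, pid, driver_acc)
               for pid in range(len(lazy.partitions))}
    if lost_output_of is not None:
        # Assumed scenario: this task's output was lost, so the task is
        # re-executed, and the map function runs again on the same records.
        outputs[lost_output_of] = run_task(lazy, lost_output_of, driver_acc)
    return [y for pid in sorted(outputs) for y in outputs[pid]]


acc = {"value": 0}
doubled = LazyMap([[1, 2, 3], [4, 5, 6]], count_and_double)
value_before_action = acc["value"]   # still 0: defining the map ran nothing
result = collect(doubled, acc, lost_output_of=1)
value_after_action = acc["value"]    # 9 updates for 6 records
```

The result still contains six elements, but the accumulator ends at 9 because partition 1 was processed twice. The laziness point holds (the value is 0 until the action runs); the over-count comes from the map function being executed again inside that action.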