Re: Understanding accumulator during transformations
Hi Burak,

That makes sense. So it boils down to whichever action happens after the transformations. Thanks for your answers.

Best,
Wei

2015-06-24 15:06 GMT-07:00 Burak Yavuz:

> Hi Wei,
>
> During the action, all the transformations before it will occur in order
> leading up to the action. If you have an accumulator in any of these
> transformations, then you won't get exactly-once semantics, because the
> transformation may be restarted elsewhere.
>
> Best,
> Burak
Re: Understanding accumulator during transformations
Hi Wei,

When an action runs, all the transformations before it are executed, in order, leading up to the action. If you have an accumulator in any of these transformations, you won't get exactly-once semantics, because the transformation may be restarted elsewhere.

Best,
Burak

On Wed, Jun 24, 2015 at 2:25 PM, Wei Zhou wrote:

> Hi Burak,
>
> Thanks for your quick reply. I guess what confuses me is that the
> accumulator won't be updated until an action is used, due to laziness, so
> a transformation such as a map won't even update the accumulator. How,
> then, would restarting the transformation end up updating the accumulator
> more than once?
>
> Best,
> Wei
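The point in the reply above (a transformation does nothing until an action runs, and it runs again whenever it is replayed) can be sketched in plain Python. This is a toy model, not Spark code: `Accumulator`, `LazyDataset`, and `tag` are hypothetical names made up for illustration, and the second `count()` call merely stands in for any re-execution of the same transformation.

```python
class Accumulator:
    """Hypothetical stand-in for an accumulator: a shared, add-only counter."""

    def __init__(self):
        self.value = 0

    def add(self, n):
        self.value += n


class LazyDataset:
    """Toy lazy collection: map() only records the function, count() runs it."""

    def __init__(self, records, fns=()):
        self.records = list(records)
        self.fns = tuple(fns)

    def map(self, fn):
        # Transformation: nothing is executed here, the function is only stored.
        return LazyDataset(self.records, self.fns + (fn,))

    def count(self):
        # Action: replays every recorded transformation over every record.
        n = 0
        for record in self.records:
            for fn in self.fns:
                record = fn(record)
            n += 1
        return n


seen = Accumulator()


def tag(x):
    seen.add(1)  # side effect inside a transformation
    return x * 2


mapped = LazyDataset(range(5)).map(tag)
after_map = seen.value            # still 0: map() was lazy

first_count = mapped.count()
after_first_action = seen.value   # 5: the action ran the map once per record

mapped.count()                    # the same transformation is replayed
after_replay = seen.value         # 10: every record was counted a second time
```

In this model the accumulator stays at zero until the first action, and then grows by one full pass each time the transformation is executed again.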
Re: Understanding accumulator during transformations
Hi Burak,

Thanks for your quick reply. I guess what confuses me is that the accumulator won't be updated until an action is used, due to laziness, so a transformation such as a map won't even update the accumulator. How, then, would restarting the transformation end up updating the accumulator more than once?

Best,
Wei

2015-06-24 13:23 GMT-07:00 Burak Yavuz:

> Hi Wei,
>
> For example, when a straggler executor gets killed in the middle of a map
> operation and its task is restarted at a different instance, the
> accumulator will be updated more than once.
>
> Best,
> Burak
Re: Understanding accumulator during transformations
Hi Wei,

For example, when a straggler executor gets killed in the middle of a map operation and its task is restarted at a different instance, the accumulator will be updated more than once.

Best,
Burak

On Wed, Jun 24, 2015 at 1:08 PM, Wei Zhou wrote:

> Quoting from the Spark Programming Guide:
>
> "For accumulator updates performed inside *actions only*, Spark
> guarantees that each task’s update to the accumulator will only be applied
> once, i.e. restarted tasks will not update the value. In transformations,
> users should be aware of that each task’s update may be applied more than
> once if tasks or job stages are re-executed."
>
> Can anyone give me a possible scenario in which an accumulator might be
> updated more than once during a transformation? Thanks.
>
> Regards,
> Wei
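The straggler scenario described above can be modelled with a small plain-Python sketch. It is not Spark code and does not use any Spark API: `Accumulator`, `ExecutorKilled`, `run_task`, and `run_with_retry` are hypothetical names, and the model assumes that the updates made by the killed attempt are still counted, which is the situation the reply describes. Whether a real cluster merges those updates may depend on the Spark version and on how the task failed.

```python
class Accumulator:
    """Hypothetical stand-in for an accumulator: a shared, add-only counter."""

    def __init__(self):
        self.value = 0

    def add(self, n):
        self.value += n


class ExecutorKilled(Exception):
    """Signals that the toy executor died part-way through a partition."""


def run_task(partition, acc, die_after=None):
    """Apply the map function to one partition, optionally dying after N records."""
    out = []
    for i, record in enumerate(partition):
        if die_after is not None and i == die_after:
            raise ExecutorKilled()
        acc.add(1)              # accumulator update inside the map function
        out.append(record * 2)
    return out


def run_with_retry(partition, acc, die_after):
    """First attempt is killed mid-partition; the retry redoes the whole partition."""
    try:
        return run_task(partition, acc, die_after)
    except ExecutorKilled:
        # Assumption of this model: the first attempt's updates are not rolled back.
        return run_task(partition, acc)


acc = Accumulator()
partition = [1, 2, 3, 4]
result = run_with_retry(partition, acc, die_after=3)
# result is [2, 4, 6, 8], but acc.value is 7: three records from the killed
# attempt plus four from the retry, although the partition has four records.
```

The output of the map is correct because only the retry's result is kept, while the side effect on the counter is applied by both attempts.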