[ 
https://issues.apache.org/jira/browse/SPARK-21425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16092064#comment-16092064
 ] 

Shixiong Zhu commented on SPARK-21425:
--------------------------------------

[~srowen]

1. Long/DoubleAccumulator assumes that there is only one writing thread: the 
thread running the task. Anyone who spawns a thread to update a 
Long/DoubleAccumulator must add the necessary synchronization.
2. It also assumes there are only two reading threads: the thread running the 
task and the executor heartbeat thread. When reporting a heartbeat, the 
executor collects the current values of all Long/DoubleAccumulators (usually 
they are just metrics) and reports them to the driver, where they are used only 
in the UI. User code never sees them. Without synchronization, the worst case 
is that the UI does not show the latest values, which is acceptable.
3. When a task finishes, the thread running the task collects the values of all 
accumulators and sends them back to the driver. Because that thread is also the 
only writer, it sees the latest and correct values.
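
To illustrate point 1, here is a minimal sketch of the caller-side 
synchronization needed when a task spawns its own threads. SingleWriterSum is a 
hypothetical stand-in with a plain, unsynchronized field; it is not Spark's 
LongAccumulator, and the external lock object is just one possible way to 
serialize the writers.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a single-writer accumulator (NOT Spark's class):
// a plain, non-volatile field with no internal locking.
public class SingleWriterSum {
    private long sum = 0L;

    // Correct only if a single thread calls it, or if callers synchronize.
    public void add(long v) { sum += v; }

    public long value() { return sum; }

    // A "task" that spawns helper threads and therefore adds its own lock
    // around every update, as point 1 requires.
    public static long addFromThreads(int threads, int addsPerThread) {
        SingleWriterSum acc = new SingleWriterSum();
        Object lock = new Object();
        List<Thread> workers = new ArrayList<>();
        for (int t = 0; t < threads; t++) {
            Thread w = new Thread(() -> {
                for (int i = 0; i < addsPerThread; i++) {
                    synchronized (lock) { acc.add(1L); }
                }
            });
            workers.add(w);
            w.start();
        }
        try {
            // join() gives a happens-before edge, so the read below
            // observes every update made by the workers.
            for (Thread w : workers) w.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return acc.value();
    }
}
```

Without the synchronized block, concurrent `sum += v` calls can lose updates, 
so the total would not be reliable.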

Changing to AtomicLong will introduce a volatile variable and make updating 
accumulators much slower.
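
As a rough sketch of that trade-off, the two update styles can be put side by 
side. The class and method names are hypothetical, and the timing in main is a 
naive loop rather than a proper benchmark (the JIT may optimize either loop, 
and a harness such as JMH would be needed for real numbers), so treat any 
output only as an indication.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical comparison of a plain long update with an AtomicLong update.
public class AccumulatorCost {

    // Plain field update: an ordinary add with no memory-ordering guarantees.
    public static long sumPlain(int n) {
        long sum = 0L;
        for (int i = 0; i < n; i++) {
            sum += i;
        }
        return sum;
    }

    // AtomicLong update: each addAndGet is an atomic read-modify-write
    // on a volatile value.
    public static long sumAtomic(int n) {
        AtomicLong sum = new AtomicLong();
        for (int i = 0; i < n; i++) {
            sum.addAndGet(i);
        }
        return sum.get();
    }

    public static void main(String[] args) {
        int n = 50_000_000;

        long t0 = System.nanoTime();
        long plain = sumPlain(n);
        long t1 = System.nanoTime();
        long atomic = sumAtomic(n);
        long t2 = System.nanoTime();

        // Both variants compute the same total; only the cost per update differs.
        System.out.println("plain:  " + plain + " in " + (t1 - t0) + " ns");
        System.out.println("atomic: " + atomic + " in " + (t2 - t1) + " ns");
    }
}
```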


> LongAccumulator, DoubleAccumulator not threadsafe
> -------------------------------------------------
>
>                 Key: SPARK-21425
>                 URL: https://issues.apache.org/jira/browse/SPARK-21425
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 2.2.0
>            Reporter: Ryan Williams
>            Priority: Minor
>
> [AccumulatorV2 
> docs|https://github.com/apache/spark/blob/v2.2.0/core/src/main/scala/org/apache/spark/util/AccumulatorV2.scala#L42-L43]
>  acknowledge that accumulators must be concurrent-read-safe, but afaict they 
> must also be concurrent-write-safe.
> The same docs imply that {{Int}} and {{Long}} meet either/both of these 
> criteria, when afaict they do not.
> Relatedly, the provided 
> [LongAccumulator|https://github.com/apache/spark/blob/v2.2.0/core/src/main/scala/org/apache/spark/util/AccumulatorV2.scala#L291]
>  and 
> [DoubleAccumulator|https://github.com/apache/spark/blob/v2.2.0/core/src/main/scala/org/apache/spark/util/AccumulatorV2.scala#L370]
>  are not thread-safe, and should be expected to behave undefinedly when 
> multiple concurrent tasks on the same executor write to them.
> [Here is a repro repo|https://github.com/ryan-williams/spark-bugs/tree/accum] 
> with some simple applications that demonstrate incorrect results from 
> {{LongAccumulator}}'s.



