[
https://issues.apache.org/jira/browse/SPARK-21425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16092064#comment-16092064
]
Shixiong Zhu commented on SPARK-21425:
--------------------------------------
[~srowen]
1. Long/DoubleAccumulator assumes that there is only one writing thread: the
thread running the task. If someone spawns a thread to update
Long/DoubleAccumulator, he/she should add necessary synchronization.
2. It also assumes there are only two reading threads: the thread running the
task, and the executor heartbeat thread. When reporting the heartbeat, executor
will collect all current values in Long/DoubleAccumulator (usually they are
just metrics) and report to driver. And we use them only on UI. The user codes
won't see them. Without the synchronization, the worst case is the user cannot
see the latest values on UI but that's acceptable.
3. When a task finishes, the thread running the task will collect all values of
accumulators and see it back to driver, and it will see all latest and correct
values.
Changing to AtomicLong will introduce a volatile variable and make updating
accumulators much slower.
> LongAccumulator, DoubleAccumulator not threadsafe
> -------------------------------------------------
>
> Key: SPARK-21425
> URL: https://issues.apache.org/jira/browse/SPARK-21425
> Project: Spark
> Issue Type: Bug
> Components: Spark Core
> Affects Versions: 2.2.0
> Reporter: Ryan Williams
> Priority: Minor
>
> [AccumulatorV2
> docs|https://github.com/apache/spark/blob/v2.2.0/core/src/main/scala/org/apache/spark/util/AccumulatorV2.scala#L42-L43]
> acknowledge that accumulators must be concurrent-read-safe, but afaict they
> must also be concurrent-write-safe.
> The same docs imply that {{Int}} and {{Long}} meet either/both of these
> criteria, when afaict they do not.
> Relatedly, the provided
> [LongAccumulator|https://github.com/apache/spark/blob/v2.2.0/core/src/main/scala/org/apache/spark/util/AccumulatorV2.scala#L291]
> and
> [DoubleAccumulator|https://github.com/apache/spark/blob/v2.2.0/core/src/main/scala/org/apache/spark/util/AccumulatorV2.scala#L370]
> are not thread-safe, and should be expected to behave undefinedly when
> multiple concurrent tasks on the same executor write to them.
> [Here is a repro repo|https://github.com/ryan-williams/spark-bugs/tree/accum]
> with some simple applications that demonstrate incorrect results from
> {{LongAccumulator}}'s.
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]