[ 
https://issues.apache.org/jira/browse/SPARK-21425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16092751#comment-16092751
 ] 

Sean Owen commented on SPARK-21425:
-----------------------------------

Interesting [~zsxwing], so you're saying this scenario shouldn't happen in real 
usage. What do you make of the Main.scala file here, though? It seems to show 
that it can happen.

If only one thread is writing, what's the overhead? The writes aren't 
contended. Is it really reads contending with each other? If so, we can 
structure this to allow concurrent reads.
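
One way to allow concurrent reads is a read-write lock: the single writer takes the exclusive lock, and any number of readers share the read lock. The following is only a sketch in Java; `RwLongAccumulator` is a hypothetical class made up for illustration, not Spark's `LongAccumulator`, and whether it beats plain synchronization in practice would need measuring.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical illustration only; not the Spark implementation.
final class RwLongAccumulator {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private long sum = 0L;
    private long count = 0L;

    // Writer path: exclusive lock, uncontended when only one thread writes.
    void add(long v) {
        lock.writeLock().lock();
        try {
            sum += v;
            count += 1;
        } finally {
            lock.writeLock().unlock();
        }
    }

    // Reader paths: shared lock, so readers do not block each other.
    long sum() {
        lock.readLock().lock();
        try {
            return sum;
        } finally {
            lock.readLock().unlock();
        }
    }

    long count() {
        lock.readLock().lock();
        try {
            return count;
        } finally {
            lock.readLock().unlock();
        }
    }

    // Reads sum and count under one lock acquisition, so they always match.
    // Yields NaN when nothing has been added yet.
    double avg() {
        lock.readLock().lock();
        try {
            return (double) sum / count;
        } finally {
            lock.readLock().unlock();
        }
    }

    public static void main(String[] args) {
        RwLongAccumulator acc = new RwLongAccumulator();
        acc.add(10);
        acc.add(20);
        System.out.println(acc.sum() + " / " + acc.count() + " = " + acc.avg());
    }
}
```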

I'm still slightly concerned that you can read, for example, a bogus value of 
avg() if you see the sum update but not the count update, though that may well 
be a very rare, cosmetic-only problem. Still a little funny.
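
To make the avg() concern concrete, here is a small Java sketch. It replays, on a single thread, the interleaving in which a reader runs between the sum update and the count update, and then shows one possible fix: publishing sum and count together as one immutable pair. All class and method names here (`AvgSnapshotSketch`, `TwoFieldAcc`, `SnapshotAcc`, `Pair`) are hypothetical and are not taken from Spark.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical illustration only; not the Spark implementation.
final class AvgSnapshotSketch {
    // Unsafe shape: sum and count are two independently updated fields.
    static final class TwoFieldAcc {
        long sum = 0L;
        long count = 0L;
        double avg() { return (double) sum / count; }
    }

    // Immutable (sum, count) pair, always replaced as a whole.
    static final class Pair {
        final long sum;
        final long count;
        Pair(long sum, long count) { this.sum = sum; this.count = count; }
    }

    // Safer shape: a reader gets both values from a single reference read.
    static final class SnapshotAcc {
        private final AtomicReference<Pair> state =
            new AtomicReference<>(new Pair(0L, 0L));

        void add(long v) {
            state.updateAndGet(p -> new Pair(p.sum + v, p.count + 1));
        }

        double avg() {
            Pair p = state.get(); // one read, so sum and count always match
            return (double) p.sum / p.count;
        }
    }

    // Replays the bad interleaving deterministically on one thread.
    static double tornAvg() {
        TwoFieldAcc acc = new TwoFieldAcc();
        acc.sum += 10; acc.count += 1; // a completed add(10)
        acc.sum += 30;                 // add(30) begins: sum is updated...
        double seen = acc.avg();       // ...a reader sees sum=40, count=1
        acc.count += 1;                // ...and count catches up too late
        return seen;
    }

    static double snapshotAvg() {
        SnapshotAcc acc = new SnapshotAcc();
        acc.add(10);
        acc.add(30);
        return acc.avg();
    }

    public static void main(String[] args) {
        System.out.println("torn avg:     " + tornAvg());     // 40.0
        System.out.println("snapshot avg: " + snapshotAvg()); // 20.0
    }
}
```

The torn read reports 40.0 where the true average of 10 and 30 is 20.0; the snapshot version cannot observe that intermediate state, at the cost of allocating a new pair on every add.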

> LongAccumulator, DoubleAccumulator not threadsafe
> -------------------------------------------------
>
>                 Key: SPARK-21425
>                 URL: https://issues.apache.org/jira/browse/SPARK-21425
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 2.2.0
>            Reporter: Ryan Williams
>            Priority: Minor
>
> [AccumulatorV2 
> docs|https://github.com/apache/spark/blob/v2.2.0/core/src/main/scala/org/apache/spark/util/AccumulatorV2.scala#L42-L43]
>  acknowledge that accumulators must be concurrent-read-safe, but afaict they 
> must also be concurrent-write-safe.
> The same docs imply that {{Int}} and {{Long}} meet either/both of these 
> criteria, when afaict they do not.
> Relatedly, the provided 
> [LongAccumulator|https://github.com/apache/spark/blob/v2.2.0/core/src/main/scala/org/apache/spark/util/AccumulatorV2.scala#L291]
>  and 
> [DoubleAccumulator|https://github.com/apache/spark/blob/v2.2.0/core/src/main/scala/org/apache/spark/util/AccumulatorV2.scala#L370]
>  are not thread-safe, and should be expected to exhibit undefined behavior 
> when multiple concurrent tasks on the same executor write to them.
> [Here is a repro repo|https://github.com/ryan-williams/spark-bugs/tree/accum] 
> with some simple applications that demonstrate incorrect results from 
> {{LongAccumulator}}s.
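
For reference, the write-side hazard described in the report can be sketched in plain Java, independent of Spark: `sum += v` on an ordinary `long` field is a read-modify-write, so concurrent callers can lose updates, while `java.util.concurrent.atomic.LongAdder` does not. The names `ConcurrentAddSketch`, `PlainCounter`, and `hammer` are hypothetical, and this is not the code from the repro repo.

```java
import java.util.concurrent.atomic.LongAdder;

// Hypothetical illustration only; not the repro from the linked repository.
final class ConcurrentAddSketch {
    // Unsynchronized counter: concurrent add() calls can lose updates.
    static final class PlainCounter {
        long sum = 0L;
        void add(long v) { sum += v; }
    }

    // Starts `threads` threads that each run `body` `n` times, then joins them.
    static void hammer(int threads, int n, Runnable body) {
        Thread[] ts = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            ts[i] = new Thread(() -> {
                for (int j = 0; j < n; j++) body.run();
            });
            ts[i].start();
        }
        for (Thread t : ts) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new RuntimeException(e);
            }
        }
    }

    // Result is not guaranteed; it is often less than threads * n.
    static long plainTotal(int threads, int n) {
        PlainCounter c = new PlainCounter();
        hammer(threads, n, () -> c.add(1));
        return c.sum;
    }

    // Result is exactly threads * n once all writer threads have joined.
    static long adderTotal(int threads, int n) {
        LongAdder a = new LongAdder();
        hammer(threads, n, () -> a.add(1));
        return a.sum();
    }

    public static void main(String[] args) {
        System.out.println("plain long: " + plainTotal(8, 100_000));
        System.out.println("LongAdder:  " + adderTotal(8, 100_000));
    }
}
```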


