GitHub user JoshRosen commented on a diff in the pull request:
https://github.com/apache/spark/pull/10835#discussion_r50806915
--- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala ---
@@ -1074,39 +1074,43 @@ class DAGScheduler(
}
}
- /** Merge updates from a task to our local accumulator values */
+ /**
+ * Merge local values from a task into the corresponding accumulators previously registered
+ * here on the driver.
+ *
+ * Although accumulators themselves are not thread-safe, this method is called only from one
+ * thread, the one that runs the scheduling loop. This means we only handle one task
+ * completion event at a time so we don't need to worry about locking the accumulators.
+ * This still doesn't stop the caller from updating the accumulator outside the scheduler,
+ * but that's not our problem since there's nothing we can do about that.
--- End diff --
I can't imagine that users would have a legitimate use-case for updating
accumulators on the driver unless they're trying to use them as a substitute
for AtomicLong or something like that.
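
To illustrate the distinction: a counter that only ever lives on the driver
does not need an accumulator at all, since a plain
java.util.concurrent.atomic.AtomicLong is already safe to update from several
threads. The following is only a minimal sketch under that assumption; the
object name DriverSideCounter and the helper countWith are hypothetical and
are not part of Spark or of this pull request.

```scala
import java.util.concurrent.atomic.AtomicLong

// Hypothetical illustration, not Spark code: a driver-local counter that is
// updated concurrently from several threads without any explicit locking.
object DriverSideCounter {

  // Starts `numThreads` threads, each incrementing a shared AtomicLong
  // `incrementsPerThread` times, and returns the final count.
  def countWith(numThreads: Int, incrementsPerThread: Int): Long = {
    val counter = new AtomicLong(0L)
    val threads = (1 to numThreads).map { _ =>
      new Thread(new Runnable {
        override def run(): Unit = {
          var i = 0
          while (i < incrementsPerThread) {
            counter.incrementAndGet() // atomic read-modify-write
            i += 1
          }
        }
      })
    }
    threads.foreach(_.start())
    threads.foreach(_.join()) // wait for all increments to finish
    counter.get()
  }

  def main(args: Array[String]): Unit = {
    println(countWith(4, 1000)) // prints 4000
  }
}
```

Because every increment is atomic, no update is lost regardless of how the
threads interleave, which is the guarantee a non-thread-safe accumulator
updated outside the scheduling loop would not give.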