Github user ilganeli commented on a diff in the pull request:
https://github.com/apache/spark/pull/4021#discussion_r23018036
--- Diff: core/src/main/scala/org/apache/spark/Accumulators.scala ---
@@ -280,10 +281,12 @@ object AccumulatorParam {
// TODO: The multi-thread support in accumulators is kind of lame; check
// if there's a more intuitive way of doing it right
private[spark] object Accumulators {
- // TODO: Use soft references? => need to make readObject work properly then
- val originals = Map[Long, Accumulable[_, _]]()
- val localAccums = new ThreadLocal[Map[Long, Accumulable[_, _]]]() {
- override protected def initialValue() = Map[Long, Accumulable[_, _]]()
+ // Store a WeakReference instead of a StrongReference because this way accumulators can be
+ // appropriately garbage collected during long-running jobs and release memory
+ type WeakAcc = WeakReference[Accumulable[_, _]]
+ val originals = Map[Long, WeakAcc]()
+ val localAccums = new ThreadLocal[Map[Long, WeakAcc]]() {
--- End diff --
Hi Josh - are you suggesting replacing this snippet with a MapMaker just to
simplify the initialization code? I believe the usage of either object would
be the same - do you see a specific advantage to using MapMaker here?
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [email protected] or file a JIRA ticket
with INFRA.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]