Github user JoshRosen commented on a diff in the pull request:
https://github.com/apache/spark/pull/4021#discussion_r22889328
--- Diff: core/src/main/scala/org/apache/spark/Accumulators.scala ---
@@ -280,10 +281,12 @@ object AccumulatorParam {
// TODO: The multi-thread support in accumulators is kind of lame; check
// if there's a more intuitive way of doing it right
private[spark] object Accumulators {
- // TODO: Use soft references? => need to make readObject work properly
then
- val originals = Map[Long, Accumulable[_, _]]()
- val localAccums = new ThreadLocal[Map[Long, Accumulable[_, _]]]() {
- override protected def initialValue() = Map[Long, Accumulable[_, _]]()
+ // Store a WeakReference instead of a StrongReference because this way
accumulators can be
+ // appropriately garbage collected during long-running jobs and release
memory
+ type WeakAcc = WeakReference[Accumulable[_, _]]
+ val originals = Map[Long, WeakAcc]()
+ val localAccums = new ThreadLocal[Map[Long, WeakAcc]]() {
--- End diff --
Guava MapMaper supports `weakValues`; not sure if we want to use that here,
since it's not super Scala-friendly (e.g. returns nulls, etc):
http://docs.guava-libraries.googlecode.com/git-history/release/javadoc/com/google/common/collect/MapMaker.html
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [email protected] or file a JIRA ticket
with INFRA.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]