GitHub user ilganeli opened a pull request:

    https://github.com/apache/spark/pull/4021

    [SPARK-3885] Provide mechanism to remove accumulators once they are no 
longer used

    Instead of storing a strong reference to each accumulator, I've replaced it 
with a weak reference and updated the code that uses these accumulators to 
check that the reference still resolves before using the accumulator. A weak 
reference is cleared once the accumulator is no longer strongly reachable 
anywhere else, whereas a soft reference would be cleared only when the JVM 
runs low on memory.
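
The approach described above can be sketched roughly as follows. This is an illustrative Java sketch only, not the code in this pull request (Spark's implementation is in Scala); the class name `WeakAccumulatorRegistry` and its methods `register`, `lookup`, and `size` are hypothetical names chosen for the example. It shows the two ideas from the description: store a `WeakReference` rather than the object itself, and check that the reference still resolves before using it.

```java
import java.lang.ref.WeakReference;
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

/**
 * Hypothetical registry that tracks accumulators by id without keeping
 * them alive. Not Spark's actual Accumulators object.
 */
final class WeakAccumulatorRegistry<A> {
    // Values are weak references, so the map alone never prevents
    // an accumulator from being garbage collected.
    private final Map<Long, WeakReference<A>> originals = new ConcurrentHashMap<>();

    /** Registers an accumulator and returns the weak reference that was stored. */
    WeakReference<A> register(long id, A accumulator) {
        WeakReference<A> ref = new WeakReference<>(accumulator);
        originals.put(id, ref);
        return ref;
    }

    /**
     * Looks up an accumulator, checking that the reference still resolves.
     * Returns empty if the id is unknown or the accumulator was collected.
     */
    Optional<A> lookup(long id) {
        WeakReference<A> ref = originals.get(id);
        if (ref == null) {
            return Optional.empty();
        }
        A accumulator = ref.get();
        if (accumulator == null) {
            // The referent is gone; drop the stale entry (only if unchanged).
            originals.remove(id, ref);
            return Optional.empty();
        }
        return Optional.of(accumulator);
    }

    /** Number of entries currently held, including any not yet pruned. */
    int size() {
        return originals.size();
    }
}
```

In this sketch a caller must handle the empty case on every lookup, which mirrors the check the description says was added wherever accumulators are used. Swapping `WeakReference` for `SoftReference` would keep entries alive until the JVM is under memory pressure, which is the behaviour the description argues against.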

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/ilganeli/spark SPARK-3885

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/4021.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #4021
    
----
commit a77d11b46c015bb5657496e971a52a853127c7cc
Author: Ilya Ganelin <[email protected]>
Date:   2015-01-07T19:57:16Z

    Updated Accumulators class to store weak references instead of strong 
references to allow garbage collection of old accumulators

commit cbb9023a84c1a06ff5b7c918cdb616f10e66d4d1
Author: Ilya Ganelin <[email protected]>
Date:   2015-01-07T19:57:24Z

    Merge remote-tracking branch 'upstream/master' into SPARK-3885

commit c49066a63d32f77476a41f71a37bd74c7315f099
Author: Ilya Ganelin <[email protected]>
Date:   2015-01-09T18:32:20Z

    Merge remote-tracking branch 'upstream/master' into SPARK-3885

commit 33508522eb58a5159a4318bb435f6c6f3b8ccecc
Author: Ilya Ganelin <[email protected]>
Date:   2015-01-09T18:44:07Z

    Updated DAGScheduler and Suite to correctly use new implementation of 
WeakRef Accumulator storage

commit 0746e615bf3cffa2d39923ebfa31b30df075c633
Author: Ilya Ganelin <[email protected]>
Date:   2015-01-09T19:15:13Z

    Updated DAGSchedulerSuite to fix bug

commit d78f4bf11465819053482b319f10cacb6e447650
Author: Ilya Ganelin <[email protected]>
Date:   2015-01-10T00:55:18Z

    Removed obsolete comment

commit b820ab4b71f12edf5d626cbdbb672a2b26c56182
Author: Ilya Ganelin <[email protected]>
Date:   2015-01-12T15:43:01Z

    reset

commit 28f705c81b7300de0162ddb85387a58a3e075410
Author: Ilya Ganelin <[email protected]>
Date:   2015-01-13T17:45:55Z

    Merge remote-tracking branch 'upstream/master' into SPARK-3885

----


