GitHub user JoshRosen opened a pull request:
https://github.com/apache/spark/pull/3372
[SPARK-4495] Fix memory leaks in JobProgressListener
This commit fixes a memory leak in JobProgressListener that I introduced in
SPARK-2321 and adds a testing framework to ensure that it's very difficult to
inadvertently introduce new memory leaks.
This solution might be overkill, but the main idea is to partition
JobProgressListener's state into three buckets: collections that should be
empty once Spark is idle, collections that _always_ must obey some hard size
limit, and collections that have a soft size limit (they can grow arbitrarily
large when Spark is active but must shrink to fit within some bound after Spark
becomes idle).
Based on this, we can write fairly generic tests that run workloads which
submit more than `spark.ui.retainedStages` stages and `spark.ui.retainedJobs`
jobs, then check that the sizes of these collections obey their contracts.
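
As a rough illustration of the three-bucket idea, here is a minimal Python sketch. It is not the Scala test code from this pull request: `ToyListener`, `contract_violations`, and `run_workload` are hypothetical names, and `retained_stages` merely stands in for `spark.ui.retainedStages`. It only shows how a generic check can assert that one collection is empty when idle, one always respects a hard limit, and one may grow while work is active but must shrink once idle.

```python
from collections import deque


class ToyListener:
    """Hypothetical stand-in for a progress listener (not Spark's class)."""

    def __init__(self, retained_stages):
        self.retained_stages = retained_stages
        # Bucket 1: must be empty once the system is idle.
        self.active_stages = set()
        # Bucket 2: hard limit, never holds more than retained_stages entries.
        self.completed_stages = deque(maxlen=retained_stages)
        # Bucket 3: soft limit, may grow while stages are active.
        self.stage_data = {}

    def on_stage_submitted(self, stage_id):
        self.active_stages.add(stage_id)
        self.stage_data[stage_id] = {"tasks": 0}

    def on_stage_completed(self, stage_id):
        self.active_stages.discard(stage_id)
        self.completed_stages.append(stage_id)
        # Drop per-stage data that is neither active nor still retained.
        keep = self.active_stages | set(self.completed_stages)
        for sid in list(self.stage_data):
            if sid not in keep:
                del self.stage_data[sid]


def contract_violations(listener, idle):
    """Return a list of human-readable contract violations (empty if none)."""
    problems = []
    if len(listener.completed_stages) > listener.retained_stages:
        problems.append("hard limit exceeded: completed_stages")
    if idle:
        if listener.active_stages:
            problems.append("not empty when idle: active_stages")
        if len(listener.stage_data) > listener.retained_stages:
            problems.append("soft limit exceeded when idle: stage_data")
    return problems


def run_workload(listener, num_stages):
    """Submit more stages than the retention limit, then finish them all."""
    for stage_id in range(num_stages):
        listener.on_stage_submitted(stage_id)
    mid_run = contract_violations(listener, idle=False)
    for stage_id in range(num_stages):
        listener.on_stage_completed(stage_id)
    return mid_run, contract_violations(listener, idle=True)
```

Calling `run_workload(ToyListener(retained_stages=3), num_stages=10)` exercises all three contracts. A leak (for example, forgetting to remove a stage from `active_stages`) would surface as a non-empty list from the idle check.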
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/JoshRosen/spark SPARK-4495
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/3372.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #3372
----
commit be72e81a6726289f8dd9ba99271ff0ba47f67cdb
Author: Josh Rosen
Date: 2014-11-19T22:41:58Z
[SPARK-4495] Fix memory leaks in JobProgressListener
This commit fixes a memory leak in JobProgressListener that I introduced
in SPARK-2321 and adds a testing framework to ensure that it's very
difficult to inadvertently introduce new memory leaks.
commit c73fab5b1419a3870f8b84407b3e29ab87f238a8
Author: Josh Rosen
Date: 2014-11-19T22:50:37Z
"data structures" -> collections
----