GitHub user andrewor14 opened a pull request:
https://github.com/apache/spark/pull/4869
[SPARK-6132] ContextCleaner may live across SparkContexts
The problem is that `ContextCleaner` may clean variables that belong to a
different `SparkContext`. This can happen if the `SparkContext` to which the
cleaner belongs stops and a new one is started immediately afterwards in the
same JVM. If the cleaner is in the middle of cleaning a broadcast at that
point, for instance, it resolves the block manager through
`SparkEnv.get.blockManager`, which may now belong to the new, unrelated
`SparkContext`.
@JoshRosen and I suspect that this is the cause of many flaky tests, most
notably the `JavaAPISuite`. We were able to reproduce this locally (though it
is not deterministic).
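To make the race concrete, here is a minimal, self-contained JVM sketch in plain Java. The names (`Env`, `globalEnv`, `CleanerSketch`) are invented for illustration and do not use Spark's actual classes; `globalEnv` stands in for the `SparkEnv.get` singleton that the cleaner reads lazily:

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical sketch of the failure mode: a cleaner that resolves its
// environment lazily through a process-wide singleton (analogous to
// SparkEnv.get) can observe the environment of a *newer* context after
// its own context has stopped.
public class CleanerSketch {
    static class Env {
        final String owner;
        Env(String owner) { this.owner = owner; }
    }

    // Stand-in for SparkEnv.get: one mutable slot per JVM.
    static final AtomicReference<Env> globalEnv = new AtomicReference<>();

    public static void main(String[] args) {
        Env first = new Env("sc1");
        globalEnv.set(first);

        // A cleaner created for sc1. Pinning the env at construction time
        // is the safe alternative to re-reading the singleton later.
        Env pinned = first;

        // sc1 stops; sc2 starts immediately afterwards in the same JVM.
        globalEnv.set(new Env("sc2"));

        // The lazy lookup now targets the wrong context's resources, while
        // the pinned reference still targets the cleaner's own context.
        System.out.println("lazy=" + globalEnv.get().owner
                + " pinned=" + pinned.owner);
    }
}
```

Pinning the environment when the cleaner is constructed, combined with synchronizing the cleaner's stop path, keeps a stopped context's cleaner from acting on a successor context's block manager.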
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/andrewor14/spark cleaner-masquerade
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/4869.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #4869
----
commit 29168c07ea855028bf1ddc72a6357ae3197ac87d
Author: Andrew Or <[email protected]>
Date: 2015-03-03T06:43:32Z
Synchronize ContextCleaner stop
----