GitHub user vanzin opened a pull request:
https://github.com/apache/spark/pull/5335
[SPARK-4194] [core] Make SparkContext initialization exception-safe.
SparkContext has a very long constructor, where multiple things are
initialized, multiple threads are spawned, and multiple opportunities
for exceptions to be thrown exist. If one of these happens at an
inopportune time, lots of garbage tends to stick around.
This patch re-organizes SparkContext so that its internal state is
initialized in a big "try" block. The fields keeping state are now
completely private to SparkContext, and are "vars", because Scala
doesn't allow you to initialize a val later. The existing API is kept
by turning vals into defs (which works because Scala guarantees the
same binary interface for those).
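The shape described above can be sketched roughly as follows. This is a
minimal illustration and not the actual SparkContext code: Service,
Context, Registry and the failDuringInit flag are hypothetical names
made up for the example.

```scala
import scala.util.control.NonFatal

// Hypothetical stand-in for a component that owns a thread or other resource.
final class Service(val name: String) {
  @volatile var running = true
  def stop(): Unit = { running = false }
}

// Hypothetical registry, only here so the example can observe cleanup.
object Registry {
  var started: List[Service] = Nil
}

// Hypothetical context class; not the real SparkContext.
class Context(failDuringInit: Boolean) {
  // State is private and mutable so it can be assigned inside the try block.
  private var _scheduler: Service = null
  private var _ui: Service = null

  // Public accessors are defs, so callers still write ctx.scheduler.
  def scheduler: Service = _scheduler
  def ui: Service = _ui

  try {
    _scheduler = new Service("scheduler")
    Registry.started = _scheduler :: Registry.started
    if (failDuringInit) throw new IllegalStateException("init failed")
    _ui = new Service("ui")
    Registry.started = _ui :: Registry.started
  } catch {
    case NonFatal(e) =>
      stop() // release whatever was already created
      throw e
  }

  // stop() has to tolerate partially initialized state (null fields).
  def stop(): Unit = {
    Option(_ui).foreach(_.stop())
    Option(_scheduler).foreach(_.stop())
  }
}
```

The key point in this sketch is that stop() is written to cope with
fields that were never assigned, so the same method serves both normal
shutdown and cleanup after a failed constructor.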
On top of that, a few things in other areas were changed to avoid more
things leaking:
- Executor was changed to explicitly wait for the heartbeat thread to
stop. LocalBackend was changed to wait for the "StopExecutor"
message to be received, since otherwise there could be a race
between that message arriving and the actor system being shut down.
- ConnectionManager could possibly hang during shutdown, because an
interrupt at the wrong moment could cause the selector thread to
still call select and then wait forever. The shutdown path now also
wakes up the selector to avoid this situation.
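To illustrate the selector wakeup idea in isolation, here is a small
sketch using java.nio directly. It is not Spark's ConnectionManager;
SelectorLoop and its members are hypothetical names, and the sketch
uses a stop flag instead of an interrupt.

```scala
import java.nio.channels.Selector

// Hypothetical event loop; not Spark's ConnectionManager.
final class SelectorLoop {
  private val selector = Selector.open()
  @volatile private var stopped = false

  private val thread = new Thread("selector-loop") {
    override def run(): Unit = {
      while (!stopped) {
        // Blocks until a channel is ready or wakeup() is called.
        selector.select()
      }
    }
  }
  thread.setDaemon(true)

  def start(): Unit = thread.start()

  def stop(): Unit = {
    stopped = true
    // wakeup() makes the current select() return, or the next one if the
    // loop has not blocked yet, so the flag is always re-checked.
    selector.wakeup()
    thread.join()
    selector.close()
  }

  def isAlive: Boolean = thread.isAlive
}
```

Because Selector.wakeup() also affects the next select() call when none
is in progress, the order "set flag, then wake up" closes the window in
which the loop could check the flag and then block indefinitely.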
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/vanzin/spark SPARK-4194
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/5335.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #5335
----
commit 5545d837429f371a411c0e40c5eebeb2fd402ed7
Author: Marcelo Vanzin <[email protected]>
Date: 2015-04-01T16:49:47Z
[SPARK-6650] [core] Stop ExecutorAllocationManager when context stops.
This fixes the thread leak. I also changed the unit test to keep track
of allocated contexts and make sure they're closed after tests are
run; this is needed since some tests use the following pattern, which
never reaches sc.stop() if the middle call throws:
    val sc = createContext()
    doSomethingThatMayThrow()
    sc.stop()
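
One way to track contexts in a test suite could look like the sketch
below. It is only an illustration under assumed names: FakeContext and
ContextTracker are hypothetical and are not the classes used in the
actual test.

```scala
import scala.collection.mutable.ArrayBuffer

// Hypothetical stand-in for SparkContext.
final class FakeContext {
  @volatile var stopped = false
  def stop(): Unit = { stopped = true }
}

// Hypothetical test helper: remembers every context it hands out.
final class ContextTracker {
  private val contexts = ArrayBuffer.empty[FakeContext]

  def createContext(): FakeContext = {
    val sc = new FakeContext
    contexts += sc
    sc
  }

  // Meant to be called from the suite's after-each hook, so contexts are
  // stopped even when the test body threw before reaching sc.stop().
  def stopAll(): Unit = {
    contexts.foreach(_.stop())
    contexts.clear()
  }
}
```

With a helper like this, a test body can keep the simple pattern above,
and the cleanup hook takes care of any context the body left running.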
commit a0b0881b451376249ed2d3ea5f1f586f21ad468e
Author: Marcelo Vanzin <[email protected]>
Date: 2015-04-01T17:39:17Z
Stop alloc manager before scheduler.
commit 27456b9b5380c034e13d24c0e3e341f77b9397bf
Author: Marcelo Vanzin <[email protected]>
Date: 2015-04-01T19:55:13Z
More exception safety.
commit 071f16eb9794665178d203d245604598fa1552e4
Author: Marcelo Vanzin <[email protected]>
Date: 2015-04-01T21:17:27Z
Nits.
commit 8caa8b3a9e9a2e1cda125c1d8bff77598c15abfa
Author: Marcelo Vanzin <[email protected]>
Date: 2015-04-01T21:35:30Z
[SPARK-4194] [core] Make SparkContext initialization exception-safe.
SparkContext has a very long constructor, where multiple things are
initialized, multiple threads are spawned, and multiple opportunities
for exceptions to be thrown exist. If one of these happens at an
inopportune time, lots of garbage tends to stick around.
This patch re-organizes SparkContext so that its internal state is
initialized in a big "try" block. The fields keeping state are now
completely private to SparkContext, and are "vars", because Scala
doesn't allow you to initialize a val later. The existing API is kept
by turning vals into defs (which works because Scala guarantees the
same binary interface for those).
On top of that, a few things in other areas were changed to avoid more
things leaking:
- Executor was changed to explicitly wait for the heartbeat thread to
stop. LocalBackend was changed to wait for the "StopExecutor"
message to be received, since otherwise there could be a race
between that message arriving and the actor system being shut down.
- ConnectionManager could possibly hang during shutdown, because an
interrupt at the wrong moment could cause the selector thread to
still call select and then wait forever. The shutdown path now also
wakes up the selector to avoid this situation.
----