GitHub user aarondav opened a pull request:
https://github.com/apache/spark/pull/715
[RFC] SPARK-1772 Stop catching Throwable, let Executors die
The main issue this patch fixes is
[SPARK-1772](https://issues.apache.org/jira/browse/SPARK-1772), in which
Executors may not die when fatal exceptions (e.g., OOM) are thrown. This patch
causes Executors to delegate to the ExecutorUncaughtExceptionHandler when a
fatal exception is thrown.
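
As a rough illustration of the idea (a minimal sketch, not the code in this patch), the snippet below separates recoverable failures from fatal ones using `scala.util.control.NonFatal`. The names `FatalDelegationSketch`, `runTask`, and `onFatal` are hypothetical; `onFatal` merely stands in for a handler such as the ExecutorUncaughtExceptionHandler.

```scala
import scala.util.control.NonFatal

object FatalDelegationSketch {
  // Hypothetical stand-in for a task runner; not Spark's actual Executor code.
  // `onFatal` is an assumed callback playing the role of an uncaught exception handler.
  def runTask(body: () => Unit, onFatal: Throwable => Unit): String =
    try {
      body()
      "ok"
    } catch {
      case NonFatal(e) =>
        // Ordinary failure: report it and keep the process alive.
        s"failed: ${e.getMessage}"
      case t: Throwable =>
        // Fatal error (e.g., OutOfMemoryError): hand it to the handler, then
        // rethrow so it is not silently swallowed.
        onFatal(t)
        throw t
    }
}
```

In a real executor the handler would typically terminate the JVM; here it is a plain callback so the behavior can be exercised without exiting.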
This patch also continues the fight in the never-ending war against `case t:
Throwable =>` by catching only Exceptions in many places and by adding a wrapper
for Threads and Runnables so that any uncaught exceptions are at least
printed to the logs.
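
A wrapper of this kind might look roughly like the following sketch. It is an assumption-laden illustration rather than the helper added by this patch: `LoggingRunnableSketch`, `logUncaught`, and the `log` callback are hypothetical names, and the logger is passed in as a plain function instead of Spark's logging trait.

```scala
object LoggingRunnableSketch {
  // Hypothetical helper: wraps a block in a Runnable that logs anything thrown
  // and then rethrows it, so the failure is visible but never swallowed.
  def logUncaught(log: String => Unit)(body: => Unit): Runnable = new Runnable {
    override def run(): Unit =
      try body
      catch {
        case t: Throwable =>
          log(s"Uncaught exception in thread ${Thread.currentThread().getName}: $t")
          throw t
      }
  }
}
```

Catching Throwable here is deliberate: the wrapper only records the error and rethrows it, unlike a handler that catches Throwable and carries on.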
Testing ([here](https://gist.github.com/aarondav/ca1f0cdcd50727f89c0d)) also
suggests that the IndestructibleActorSystem is unlikely to actually work: the
uncaughtExceptionHandler is not called from the places where we expected it
to be.
[SPARK-1620](https://issues.apache.org/jira/browse/SPARK-1620) deals with
part of this issue, but refactoring our Actor Systems to ensure that exceptions
are dealt with properly is a much bigger change, outside the scope of this PR.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/aarondav/spark throwable
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/715.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #715
----
commit 1867867a00241ff1bd20d2ac3ac610ed126a9280
Author: Aaron Davidson <[email protected]>
Date: 2014-05-09T20:28:26Z
[RFC] SPARK-1772 Stop catching Throwable, let Executors die
----