GitHub user aarondav opened a pull request:

    https://github.com/apache/spark/pull/715

    [RFC] SPARK-1772 Stop catching Throwable, let Executors die

    The main issue this patch fixes is 
[SPARK-1772](https://issues.apache.org/jira/browse/SPARK-1772), in which 
Executors may not die when fatal exceptions (e.g., OOM) are thrown. This patch 
causes Executors to delegate to the ExecutorUncaughtExceptionHandler when a 
fatal exception is thrown.
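
The delegation described above can be pictured roughly as follows. This is only a minimal sketch, not the code in this patch: `FatalErrorSketch`, `RecordingHandler`, and `runTask` are hypothetical names, and the handler merely records the error, whereas a real handler would presumably terminate the JVM.

```scala
object FatalErrorSketch {
  // Hypothetical stand-in for ExecutorUncaughtExceptionHandler (assumption):
  // it records the Throwable instead of shutting the process down.
  final class RecordingHandler extends Thread.UncaughtExceptionHandler {
    @volatile var last: Option[Throwable] = None
    override def uncaughtException(t: Thread, e: Throwable): Unit = {
      last = Some(e)
    }
  }

  // Runs a task body. Ordinary Exceptions are reported as task failures;
  // anything else (e.g. OutOfMemoryError) is handed to the handler.
  def runTask(body: () => Unit, handler: Thread.UncaughtExceptionHandler): String =
    try {
      body()
      "ok"
    } catch {
      case e: Exception =>
        s"task failed: ${e.getMessage}"
      case t: Throwable =>
        handler.uncaughtException(Thread.currentThread(), t)
        "fatal"
    }
}
```

The point of the split is that a failed task can be reported and retried, while a fatal error leaves the process in an unknown state and is better escalated than swallowed.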
    
    This patch also continues the fight in the never-ending war against
`case t: Throwable =>` by catching only Exceptions in many places and by adding
a wrapper for Threads and Runnables so that any uncaught exceptions are at
least printed to the logs.
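
One way such a wrapper might look is sketched below. It is an illustration under stated assumptions rather than the helper added by this patch: `LoggingRunnableSketch`, `logUncaught`, and the `log` callback are hypothetical names. The wrapper logs whatever escapes the body and then rethrows it, so nothing is swallowed.

```scala
object LoggingRunnableSketch {
  // Hypothetical wrapper (assumption): `log` stands in for whatever logging
  // facility the caller uses, e.g. msg => System.err.println(msg).
  def logUncaught(log: String => Unit)(body: => Unit): Runnable = new Runnable {
    override def run(): Unit =
      try body
      catch {
        case t: Throwable =>
          // Log first, then rethrow so the thread's own uncaught-exception
          // handling still sees the original Throwable.
          log(s"Uncaught exception in thread ${Thread.currentThread().getName}: $t")
          throw t
      }
  }
}
```

A Runnable built this way can be handed to a `Thread` or an `ExecutorService` like any other; the only difference is that a failure leaves a trace in the logs before it propagates.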
    
    Testing ([here](https://gist.github.com/aarondav/ca1f0cdcd50727f89c0d)) also
suggests that the IndestructibleActorSystem is unlikely to actually work: the
uncaughtExceptionHandler is not called from the places we expected it to be.
    [SPARK-1620](https://issues.apache.org/jira/browse/SPARK-1620) deals with
part of this issue, but refactoring our Actor Systems to ensure that exceptions
are dealt with properly is a much bigger change, outside the scope of this PR.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/aarondav/spark throwable

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/715.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #715
    
----
commit 1867867a00241ff1bd20d2ac3ac610ed126a9280
Author: Aaron Davidson <[email protected]>
Date:   2014-05-09T20:28:26Z

    [RFC] SPARK-1772 Stop catching Throwable, let Executors die
    

----

