Github user JoshRosen commented on the pull request:

    https://github.com/apache/spark/pull/1482#issuecomment-54673163
  
    Here's my understanding of the flow of control that produced the original 
problem:
    
    A task throws an uncaught exception (let's say OutOfMemoryError).  This is 
caught by the `case t: Throwable` block.  If this is the first task to fail 
with a Throwable, then `Utils.inShutdown()` should return false because no 
shutdown hook will be running.  If this Throwable indicates a fatal error, then 
that thread runs `ExecutorUncaughtExceptionHandler.uncaughtException(t)`.  
[That 
handler](https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/executor/ExecutorUncaughtExceptionHandler.scala)
 calls `System.exit()`, which begins running the shutdown hooks.
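
    To make that concrete, here's a rough, self-contained sketch of the two pieces (hypothetical names and exit code; not the actual Executor source):

    ```scala
    // Simplified sketch of the flow described above (hypothetical names).
    object ExecutorFlowSketch {

      // Process-wide handler: log, then exit, which starts the JVM shutdown hooks.
      object UncaughtHandler extends Thread.UncaughtExceptionHandler {
        override def uncaughtException(thread: Thread, exception: Throwable): Unit = {
          System.err.println(s"Uncaught exception in thread $thread: $exception")
          System.exit(50) // stand-in for the real executor exit code
        }
        def uncaughtException(exception: Throwable): Unit =
          uncaughtException(Thread.currentThread(), exception)
      }

      // Task runner's catch block: every Throwable lands in `case t: Throwable`,
      // and fatal ones are escalated to the handler above.
      def runTask(body: => Unit): Unit = {
        try {
          body
        } catch {
          case t: Throwable =>
            // ... send a failure status update / partial task metrics to the driver ...
            if (t.isInstanceOf[VirtualMachineError]) { // crude stand-in for the fatal-error check
              UncaughtHandler.uncaughtException(t)
            }
        }
      }
    }
    ```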
     
    While the shutdown hook is running, it might delete files or other 
resources needed by other tasks that are still running.  When these secondary 
errors occur, they are caught by the same `case t: Throwable` block, except this time `Utils.inShutdown()` might return `true`, causing the exception to be ignored.
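
    In other words, I'm picturing the guard as something like this (just a sketch; `runTaskBody` and `reportFailureToDriver` are placeholders, not real methods):

    ```scala
    try {
      runTaskBody() // placeholder for the actual task execution
    } catch {
      case t: Throwable =>
        if (!Utils.inShutdown()) {
          // Normal path: report the failure / partial metrics back to the driver,
          // escalating fatal errors as described above.
          reportFailureToDriver(t)
        }
        // During shutdown, the secondary failure is currently dropped without a trace.
    }
    ```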
    
    I suppose we could add an `else` clause that logs a note when we're 
suppressing exceptions.
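
    For example (hypothetical log message; assumes the `Logging` trait's `logInfo(msg, throwable)` overload):

    ```scala
    if (!Utils.inShutdown()) {
      reportFailureToDriver(t)
    } else {
      // Leave a breadcrumb instead of swallowing the secondary failure silently.
      logInfo("Not reporting error because the JVM is shutting down", t)
    }
    ```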
    
    Anyhow, I think we'll still report the partial task metrics, etc., but only as of the time the first task fails.  Prior to this patch, we might attempt to send multiple metrics and status updates from the secondary failures caused by the shutdown.

