Mark Hamstra created SPARK-1620:
-----------------------------------

             Summary: Uncaught exception from Akka scheduler
                 Key: SPARK-1620
                 URL: https://issues.apache.org/jira/browse/SPARK-1620
             Project: Spark
          Issue Type: Bug
          Components: Spark Core
    Affects Versions: 0.9.0, 1.0.0
            Reporter: Mark Hamstra
            Priority: Blocker

I've been looking at this one in the context of a BlockManagerMaster that OOMs and doesn't respond to heartBeat(), but I suspect that there may be problems in other places where we use Akka's scheduler.

The basic problem is that we expect exceptions thrown from a scheduled function to be caught in the thread where _ActorSystem_.scheduler.schedule() or scheduleOnce() was called. In fact, the scheduled function runs on its own thread, so any exception it throws is not caught in the thread that called schedule(). For example, unanswered BlockManager heartBeats (scheduled in BlockManager#initialize) that end up throwing exceptions in BlockManagerMaster#askDriverWithReply do not cause those exceptions to be handled by the Executor thread's UncaughtExceptionHandler.
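
To illustrate the threading behavior described above, here is a minimal sketch. It is not Spark or Akka code: a plain java.lang.Thread stands in for the thread on which the scheduler runs the scheduled function, and the names SchedulerExceptionDemo, Outcome, run and the "heartbeat failed" message are hypothetical. It only shows that a try/catch around the scheduling call never sees the exception, while a handler attached to the thread that actually runs the function does.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical demo class; a plain Thread stands in for the scheduler's thread.
class SchedulerExceptionDemo {

    /** Records which code path, if any, observed the exception. */
    static final class Outcome {
        final boolean caughtByCaller;
        final String handlerThreadName; // null if no handler ran
        final String handledMessage;    // null if no handler ran

        Outcome(boolean caughtByCaller, String handlerThreadName, String handledMessage) {
            this.caughtByCaller = caughtByCaller;
            this.handlerThreadName = handlerThreadName;
            this.handledMessage = handledMessage;
        }
    }

    static Outcome run() {
        AtomicReference<String> handlerThread = new AtomicReference<>();
        AtomicReference<String> message = new AtomicReference<>();
        CountDownLatch done = new CountDownLatch(1);
        boolean caughtByCaller = false;

        try {
            // Assumed stand-in for scheduleOnce(...): the body runs on another thread.
            Thread worker = new Thread(() -> {
                throw new IllegalStateException("heartbeat failed");
            }, "scheduler-thread");

            // Only a handler attached to the worker thread sees the exception.
            worker.setUncaughtExceptionHandler((t, e) -> {
                handlerThread.set(t.getName());
                message.set(e.getMessage());
                done.countDown();
            });

            worker.start();
            try {
                // Wait until the worker's handler has run (bounded wait).
                done.await(5, TimeUnit.SECONDS);
            } catch (InterruptedException ie) {
                Thread.currentThread().interrupt();
            }
        } catch (IllegalStateException e) {
            // Never reached: the exception was thrown on a different thread.
            caughtByCaller = true;
        }
        return new Outcome(caughtByCaller, handlerThread.get(), message.get());
    }

    public static void main(String[] args) {
        Outcome o = run();
        System.out.println("caught by caller: " + o.caughtByCaller);
        System.out.println("handled on thread: " + o.handlerThreadName);
    }
}
```

In this sketch the caller's catch block is never entered and the handler runs on "scheduler-thread". By analogy, any error handling for a scheduled heartBeat would presumably need to live inside the scheduled function itself, or in a handler on the thread that executes it, rather than around the call to schedule().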