Ok, I found the reason. In my QueueStream example, I have a while(true) loop that keeps adding RDDs, and my awaitTermination call is after the loop. Since the loop never exits, awaitTermination is never reached, so the exceptions are never reported.
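A minimal sketch of the broken pattern described above (app name, batch interval, and RDD contents are illustrative):

```scala
import org.apache.spark.SparkConf
import org.apache.spark.rdd.RDD
import org.apache.spark.streaming.{Seconds, StreamingContext}
import scala.collection.mutable

val conf = new SparkConf().setMaster("local[2]").setAppName("QueueStreamExample")
val ssc = new StreamingContext(conf, Seconds(1))

val queue = new mutable.Queue[RDD[Int]]()
val stream = ssc.queueStream(queue)
stream.foreachRDD(rdd => println(rdd.count()))

ssc.start()
while (true) {                 // never exits...
  queue += ssc.sparkContext.makeRDD(1 to 100)
  Thread.sleep(1000)
}
ssc.awaitTermination()         // ...so this line is unreachable, and exceptions
                               // reported by the pipeline are never rethrown here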
The above was just a problem with the code I used to demonstrate my issue. My real problem is due to the shutdown behavior of Spark. Spark Streaming does the following: context.start() triggers the pipeline, context.awaitTermination() blocks the current thread, and whenever an exception is reported, awaitTermination() rethrows it. Since we generally have no code after awaitTermination(), the shutdown hooks get called, which stop the Spark context.

I am using spark-jobserver. When an exception is thrown from awaitTermination(), jobserver catches it and updates the status of the job in its database, but the driver process keeps running, because the main thread in the driver is waiting on an Akka actor (owned by jobserver) to shut down. Since that actor never shuts down, the driver keeps running and no one ever executes context.stop(). And because context.stop() is never executed, the job scheduler and generator keep running, so the job keeps going as well.

This implicit behavior of Spark, relying on shutdown hooks to close the context, is a bit strange. I believe Spark should execute context.stop() as soon as an exception is reported. The current behavior can have serious consequences, e.g. data loss. I will work around it, though. What is your opinion on stopping the context as soon as an exception is raised?
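One way to make the shutdown explicit, so it does not depend on JVM shutdown hooks ever running, is to catch the exception rethrown by awaitTermination() and stop the context yourself. A sketch, assuming a StreamingContext named ssc:

```scala
try {
  ssc.start()
  ssc.awaitTermination()   // rethrows any exception reported by the pipeline
} catch {
  case e: Exception =>
    // Stop the streaming context and the underlying SparkContext explicitly,
    // instead of relying on shutdown hooks that may never fire while another
    // non-daemon thread (e.g. an embedded server's actor system) keeps the
    // JVM alive.
    ssc.stop(stopSparkContext = true, stopGracefully = false)
    throw e
}
```

With this pattern the job scheduler and generator are torn down as soon as the failure surfaces, rather than continuing to run in a half-failed driver.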