Shutdown Spark application with failed state
Hello all, I am using Spark 2.x streaming with kafka. I noticed that spark streaming is processing subsequent micro-batches in case of failure as it takes a while to notify the driver about the error and interrupt streaming-executor thread. This is creating a problem as we are checkpointing the offsets internally. To avoid the problem, we wanted to catch the exception in the RDD process and stop the spark streaming immediately. streamRDD.foreachRDD { (rdd, microBatchTime) => { try { // business logic }catch (Exception ex) { case ex: Exception => // stop spark streaming streamingContext.stop(stopSparkContext = true, stopGracefully = false) } } } But the spark application state is set to Completed. So, the application is not restarted automatically by spark (with max attempts config). I checked if there is a way to notify the error during the shutdown which sets the spark application status to Failed. ContextWaiter#notiftError is steaming package scoped and couldn’t find any other interfaces to propagate the error/exception to stop the process. How to tell spark streaming to stop processing subsequent micro batches if a micro-batch throws an exception ? Is it possible to configure spark to create one micro batch RDD at a time ? How to stop the spark streaming context with error ? Any help would be appreciated. Thanks in advance. Regards.
Shutdown Spark application with failed state
Hi Team, I am using Spark 2.x streaming with kafka. I noticed that spark streaming is processing subsequent micro-batches in case of failure as it takes a while to notify the driver about the error and interrupt streaming-executor thread. This is creating problem as we are checkpointing the offsets internally. To avoid the problem, we wanted to catch the exception in in RDD process and stop the spark streaming immediately. streamRDD.foreachRDD { (rdd, microBatchTime) => { try { // business logi }catch (Exception ex) { case ex: Exception => // stop spark streaming streamingContext.stop(stopSparkContext = true, stopGracefully = false) } } } But the spark application state is set to Completed. So, application is not restarted automatically by spark (with max attempts config). I checked if there is a way to notify the error during the shutdown which sets the spark application status to Failed. ContextWaiter#notiftError is steaming package scoped and couldn’t find any other interfaces to propagate the error/exception to stop process. How to tell spark streaming to stop processing subsequent micro batches if a micro-batch throws an exception ? Is it possible to configure spark to create one micro batch RDD at a time ? How to stop the spark streaming context with error ? Any help would be appreciated. Thanks in advance. Regards.