Colin Woodbury created SPARK-21022:
--------------------------------------

             Summary: foreach swallows exceptions
                 Key: SPARK-21022
                 URL: https://issues.apache.org/jira/browse/SPARK-21022
             Project: Spark
          Issue Type: Bug
          Components: Spark Core
    Affects Versions: 2.1.1
            Reporter: Colin Woodbury
            Priority: Minor
An `RDD.foreach` or `RDD.foreachPartition` call will swallow exceptions thrown inside its closure, but not if the exception was thrown earlier in the call chain. An example:

{code:none}
package examples

import org.apache.spark._

object Shpark {
  def main(args: Array[String]) {
    implicit val sc: SparkContext = new SparkContext(
      new SparkConf().setMaster("local[*]").setAppName("blahfoobar")
    )

    /* DOESN'T THROW
    sc.parallelize(0 until 10000000)
      .foreachPartition { _.map { i =>
        println("BEFORE THROW")
        throw new Exception("Testing exception handling")
        println(i)
      }}
     */

    /* DOESN'T THROW, nor does anything print.
     * Commenting out the exception runs the prints.
     * (i.e. `foreach` is sufficient to "run" an RDD)
    sc.parallelize(0 until 100000)
      .foreach({ i =>
        println("BEFORE THROW")
        throw new Exception("Testing exception handling")
        println(i)
      })
     */

    /* Throws! */
    sc.parallelize(0 until 100000)
      .map({ i =>
        println("BEFORE THROW")
        throw new Exception("Testing exception handling")
        i
      })
      .foreach(i => println(i))

    println("JOB DONE!")
    System.in.read
    sc.stop()
  }
}
{code}

When exceptions are swallowed, the jobs don't seem to fail, and the driver exits normally. When one _is_ thrown, as in the last example, the exception successfully rises up to the driver and can be caught with `catch`. The expected behaviour is for exceptions in `foreach` to throw and crash the driver, as they would with `map`.
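For reference, a minimal sketch of the driver-side `catch` described above, for the `map` case that does throw. This is illustrative only (the object name `CatchDemo` is made up); it relies on Spark's standard behaviour of wrapping a failed job's task exception in an `org.apache.spark.SparkException` on the driver:

{code:none}
import org.apache.spark.{SparkConf, SparkContext, SparkException}

object CatchDemo {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setMaster("local[*]").setAppName("catch-demo"))

    try {
      // The exception is thrown in `map`, earlier in the call chain,
      // so it propagates to the driver when the `foreach` action runs.
      sc.parallelize(0 until 100000)
        .map { i =>
          throw new Exception("Testing exception handling"); i
        }
        .foreach(println)
    } catch {
      // Spark surfaces the task failure as a SparkException on the driver.
      case e: SparkException => println(s"Caught on driver: ${e.getMessage}")
    } finally {
      sc.stop()
    }
  }
}
{code}

Per the report above, the swallowed `foreach`/`foreachPartition` cases never reach such a `catch` block; the job appears to succeed and the driver exits normally.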