Github user JoshRosen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21342#discussion_r189754010

    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/BroadcastExchangeExec.scala ---
    @@ -111,12 +112,18 @@ case class BroadcastExchangeExec(
             SQLMetrics.postDriverMetricUpdates(sparkContext, executionId, metrics.values.toSeq)
             broadcasted
           } catch {
    +        // SPARK-24294: To bypass scala bug: https://github.com/scala/bug/issues/9554, we throw
    +        // SparkFatalException, which is a subclass of Exception. ThreadUtils.awaitResult
    +        // will catch this exception and re-throw the wrapped fatal throwable.
             case oe: OutOfMemoryError =>
    -          throw new OutOfMemoryError(s"Not enough memory to build and broadcast the table to " +
    +          throw new SparkFatalException(
    +            new OutOfMemoryError(s"Not enough memory to build and broadcast the table to " +
    --- End diff --

I agree that we're likely to have reclaimable space at this point, so the chance of a second OOM / failure here seems small. I'm pretty sure that the `OutOfMemoryError` being caught here often originates from Spark itself, where we explicitly throw another `OutOfMemoryError` at a lower layer of the system; in that case we still actually have heap available for allocating strings. We should investigate and clean up that practice, but let's do that in a separate PR.
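For anyone following along, here's a minimal, self-contained sketch of the pattern the diff relies on. This is not Spark's actual code: `FatalWrapper` and `awaitUnwrapping` are hypothetical stand-ins for `org.apache.spark.util.SparkFatalException` and `ThreadUtils.awaitResult`. The point is that a fatal throwable like `OutOfMemoryError` thrown inside a `Future` body trips scala/bug#9554 (the promise is never completed with it), so we wrap it in a plain `Exception` for the trip across the future boundary and unwrap it on the awaiting side:

```scala
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

// Hypothetical stand-in for org.apache.spark.util.SparkFatalException: a plain
// Exception (hence non-fatal to scala.util.control.NonFatal) that carries a
// fatal Throwable across a Future boundary.
final class FatalWrapper(val throwable: Throwable) extends Exception(throwable)

object BroadcastOomSketch {
  implicit val ec: ExecutionContext = ExecutionContext.global

  // Sketch of what ThreadUtils.awaitResult does: await the future, and if it
  // failed with the wrapper, re-throw the original fatal throwable on the
  // awaiting thread.
  def awaitUnwrapping[T](f: Future[T], timeout: Duration): T =
    try Await.result(f, timeout) catch {
      case fw: FatalWrapper => throw fw.throwable
    }

  def main(args: Array[String]): Unit = {
    val relationFuture = Future[Long] {
      try {
        // Pretend building the broadcast relation blew the memory budget.
        throw new OutOfMemoryError("Not enough memory to build and broadcast the table")
      } catch {
        // OutOfMemoryError is fatal, so if we re-threw it directly the Future's
        // NonFatal handler would not complete the promise (scala/bug#9554);
        // wrapping it in a subclass of Exception makes the Future fail normally.
        case oe: OutOfMemoryError => throw new FatalWrapper(oe)
      }
    }
    awaitUnwrapping(relationFuture, 10.seconds) // re-throws the OutOfMemoryError
  }
}
```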