Github user JoshRosen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21342#discussion_r189754010
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/BroadcastExchangeExec.scala ---
    @@ -111,12 +112,18 @@ case class BroadcastExchangeExec(
               SQLMetrics.postDriverMetricUpdates(sparkContext, executionId, metrics.values.toSeq)
               broadcasted
             } catch {
    +          // SPARK-24294: To bypass scala bug: https://github.com/scala/bug/issues/9554, we throw
    +          // SparkFatalException, which is a subclass of Exception. ThreadUtils.awaitResult
    +          // will catch this exception and re-throw the wrapped fatal throwable.
               case oe: OutOfMemoryError =>
    -            throw new OutOfMemoryError(s"Not enough memory to build and broadcast the table to " +
    +            throw new SparkFatalException(
    +              new OutOfMemoryError(s"Not enough memory to build and broadcast the table to " +
    --- End diff --
    
    I agree that we're likely to have reclaimable space at this point, so the 
chance of a second OOM / failure here seems small. I'm pretty sure that the 
OutOfMemoryError being caught here often originates from Spark itself where we 
explicitly throw another `OutOfMemoryError` at a lower layer of the system, in 
which case we still actually have heap to allocate strings. We should 
investigate and clean up that practice, but let's do that in a separate PR.
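    
    For readers following along, the wrap/unwrap pattern under discussion looks roughly like the sketch below. This is a minimal, self-contained analogue, not Spark's actual code: the real `SparkFatalException` and `ThreadUtils.awaitResult` live in `org.apache.spark.util`, and the simulated OOM here stands in for the broadcast build failing.
    
    ```scala
    import scala.concurrent.{Await, Future}
    import scala.concurrent.ExecutionContext.Implicits.global
    import scala.concurrent.duration._
    
    // Stand-in for org.apache.spark.util.SparkFatalException: a plain Exception
    // that carries a fatal Throwable across a Future boundary, because
    // scala.concurrent handles fatal errors specially (scala/bug#9554).
    final class SparkFatalException(val throwable: Throwable) extends Exception(throwable)
    
    object AwaitResultSketch {
      // Simplified analogue of ThreadUtils.awaitResult: unwrap the carried
      // fatal throwable on the waiting side and re-throw it.
      def awaitResult[T](future: Future[T], timeout: Duration): T =
        try {
          Await.result(future, timeout)
        } catch {
          case e: SparkFatalException => throw e.throwable
        }
    
      def main(args: Array[String]): Unit = {
        val f = Future[Int] {
          try {
            throw new OutOfMemoryError("simulated OOM while building broadcast table")
          } catch {
            // Wrap the fatal error so the Future completes with an ordinary
            // Failure instead of the error escaping onto the executor thread.
            case oom: OutOfMemoryError => throw new SparkFatalException(oom)
          }
        }
        try awaitResult(f, 10.seconds)
        catch {
          case e: OutOfMemoryError => println(s"caught on driver: ${e.getMessage}")
        }
      }
    }
    ```
    
    Because the wrapper is a subclass of `Exception`, the `Future` machinery propagates it as a normal failure, and the waiting side can rethrow the original `OutOfMemoryError` to the caller.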

