GitHub user squito commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21899#discussion_r210647464
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/BroadcastExchangeExec.scala ---
    @@ -118,12 +119,19 @@ case class BroadcastExchangeExec(
               // SparkFatalException, which is a subclass of Exception. ThreadUtils.awaitResult
               // will catch this exception and re-throw the wrapped fatal throwable.
               case oe: OutOfMemoryError =>
    -            throw new SparkFatalException(
    +            val sizeMessage = if (dataSize != -1) {
    +              s"; Size of table is $dataSize"
    +            } else {
    +              ""
    +            }
    +            val oome =
                   new OutOfMemoryError(s"Not enough memory to build and broadcast the table to " +
                   s"all worker nodes. As a workaround, you can either disable broadcast by setting " +
                   s"${SQLConf.AUTO_BROADCASTJOIN_THRESHOLD.key} to -1 or increase the spark driver " +
    -              s"memory by setting ${SparkLauncher.DRIVER_MEMORY} to a higher value")
    -              .initCause(oe.getCause))
    +              s"memory by setting ${SparkLauncher.DRIVER_MEMORY} to a higher value$sizeMessage")
    --- End diff --
    
    I'm a bit hesitant to suggest an exact value here, as we don't actually know how close we were to having it all fit. I tried to come up with something useful to say that would be both accurate and succinct, but everything I tried either got too convoluted or was so simplistic that it might only lead to more confusion.

