GitHub user squito commented on a diff in the pull request:
https://github.com/apache/spark/pull/21899#discussion_r210647464
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/BroadcastExchangeExec.scala
---
@@ -118,12 +119,19 @@ case class BroadcastExchangeExec(
// SparkFatalException, which is a subclass of Exception. ThreadUtils.awaitResult
// will catch this exception and re-throw the wrapped fatal throwable.
case oe: OutOfMemoryError =>
- throw new SparkFatalException(
+ val sizeMessage = if (dataSize != -1) {
+ s"; Size of table is $dataSize"
+ } else {
+ ""
+ }
+ val oome =
new OutOfMemoryError(s"Not enough memory to build and broadcast the table to " +
s"all worker nodes. As a workaround, you can either disable broadcast by setting " +
s"${SQLConf.AUTO_BROADCASTJOIN_THRESHOLD.key} to -1 or increase the spark driver " +
- s"memory by setting ${SparkLauncher.DRIVER_MEMORY} to a higher value")
- .initCause(oe.getCause))
+ s"memory by setting ${SparkLauncher.DRIVER_MEMORY} to a higher value$sizeMessage")
--- End diff --
I'm a bit hesitant to suggest an exact value to set here, as we don't
actually know how close we were to having it all fit. I tried to come up with
something useful to say that would be both accurate and succinct, but every
version either got too convoluted or was so simplistic that it might only
lead to more confusion.
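
For reference, here is a minimal standalone sketch of how the message in the diff reads with and without a size. The object `BroadcastOomMessage` and its `build` method are hypothetical names used only for illustration, and the two config keys are inlined as string literals (the values of `SQLConf.AUTO_BROADCASTJOIN_THRESHOLD.key` and `SparkLauncher.DRIVER_MEMORY`) so it runs without Spark on the classpath. It also assumes that `-1` means the size was not known when the error was hit, which is what the `dataSize != -1` check in the diff suggests.

```scala
object BroadcastOomMessage {
  // Inlined values of SQLConf.AUTO_BROADCASTJOIN_THRESHOLD.key and
  // SparkLauncher.DRIVER_MEMORY, so this sketch does not depend on Spark.
  val AutoBroadcastJoinThresholdKey = "spark.sql.autoBroadcastJoinThreshold"
  val DriverMemoryKey = "spark.driver.memory"

  // Hypothetical helper, not part of Spark: mirrors the message built in the diff.
  def build(dataSize: Long): String = {
    // Assumption: -1 is the sentinel for "size not known", as the diff's check implies.
    val sizeMessage = if (dataSize != -1) s"; Size of table is $dataSize" else ""
    "Not enough memory to build and broadcast the table to all worker nodes. " +
      "As a workaround, you can either disable broadcast by setting " +
      s"$AutoBroadcastJoinThresholdKey to -1 or increase the spark driver " +
      s"memory by setting $DriverMemoryKey to a higher value$sizeMessage"
  }

  def main(args: Array[String]): Unit = {
    println(build(-1L))        // size unknown: no suffix is appended
    println(build(123456789L)) // size known: the raw value is appended as-is
  }
}
```

Note that the suffix only reports the size that was observed; it deliberately stops short of recommending a particular driver memory setting.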