GitHub user bersprockets commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21899#discussion_r212756302
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/BroadcastExchangeExec.scala ---
    @@ -118,12 +119,20 @@ case class BroadcastExchangeExec(
               // SparkFatalException, which is a subclass of Exception. ThreadUtils.awaitResult
               // will catch this exception and re-throw the wrapped fatal throwable.
               case oe: OutOfMemoryError =>
    -            throw new SparkFatalException(
    +            val sizeMessage = if (dataSize != -1) {
    +              s"${SparkLauncher.DRIVER_MEMORY} by at least the estimated size of the " +
    +                s"relation ($dataSize bytes)"
    --- End diff --
    
    @rezasafi The dataSize appears to be inflated by a factor of 2-3, at
    least relative to the size of the actual data in the table. That may be
    because these relations are backed by map-like objects that hold keys and
    (likely) other internal structures.

