[
https://issues.apache.org/jira/browse/SPARK-32470?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Wenchen Fan resolved SPARK-32470.
---------------------------------
Fix Version/s: 3.1.0
Resolution: Fixed
Issue resolved by pull request 29276
[https://github.com/apache/spark/pull/29276]
> Remove task result size check for shuffle map stage
> ---------------------------------------------------
>
> Key: SPARK-32470
> URL: https://issues.apache.org/jira/browse/SPARK-32470
> Project: Spark
> Issue Type: Bug
> Components: Spark Core
> Affects Versions: 2.4.6, 3.0.0, 3.1.0
> Reporter: Wei Xue
> Assignee: Wei Xue
> Priority: Minor
> Fix For: 3.1.0
>
>
> The task result of a shuffle map stage is not the query result; it consists
> only of the map status and metrics accumulator updates. Apart from the
> metrics, which can vary in size, the total task result size depends solely
> on the number of tasks, and the number of tasks can grow large regardless of
> the stage's output size. For example, the number of tasks generated by
> `CartesianProduct` is the square of "spark.sql.shuffle.partitions": setting
> it to 200 yields 40,000 tasks, and setting it to 500 yields 250,000 tasks.
> Task counts this large can easily exceed the limit set by
> `spark.driver.maxResultSize`, even when it is raised above the default:
>
> {code:java}
> org.apache.spark.SparkException: Job aborted due to stage failure: Total size
> of serialized results of 66496 tasks (4.0 GiB) is bigger than
> spark.driver.maxResultSize (4.0 GiB)
> {code}
>
> However, the driver uses map status and accumulator updates to update the
> overall map stats and metrics of the query and does not cache them, so they
> won't cause catastrophic memory issues on the driver. We should therefore
> remove this check for shuffle map stage tasks.
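
The arithmetic in the description can be checked with a short sketch. It is only an illustration, not code from Spark or from the pull request: the function names are hypothetical, the per-task figure is derived from the rounded "4.0 GiB" in the error message and is therefore approximate, and the 1 GiB value assumes the documented default of `spark.driver.maxResultSize` (1g).

```python
GIB = 1024 ** 3  # bytes in one GiB


def cartesian_task_count(shuffle_partitions: int) -> int:
    """Task count when both sides of the product have `shuffle_partitions`
    partitions, as described in the issue (hypothetical helper)."""
    return shuffle_partitions ** 2


def avg_result_bytes_per_task(total_bytes: int, num_tasks: int) -> float:
    """Average serialized task result size implied by a total and a task count."""
    return total_bytes / num_tasks


def tasks_until_limit(limit_bytes: int, bytes_per_task: float) -> int:
    """Roughly how many task results of the given size fit under the limit."""
    return int(limit_bytes // bytes_per_task)


if __name__ == "__main__":
    # Task counts quoted in the issue description.
    print(cartesian_task_count(200))
    print(cartesian_task_count(500))

    # Figures from the error message: 66496 tasks, about 4.0 GiB in total.
    per_task = avg_result_bytes_per_task(4 * GIB, 66496)
    print(f"about {per_task / 1024:.1f} KiB per task result")

    # Assumption: a 1 GiB limit (the documented default) and the same average size.
    print(tasks_until_limit(1 * GIB, per_task))
```

Under these assumptions each task result averages a few tens of KiB, so it is the task count, not the size of the stage's output, that pushes the total past the limit.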