[
https://issues.apache.org/jira/browse/SPARK-50257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Dongjoon Hyun updated SPARK-50257:
----------------------------------
Target Version/s: (was: 4.0.0)
> [Core]If a stage contains ExpandExec, the CoalesceShufflePartitions rule
> will not be adjusted during the AQE phase
> -------------------------------------------------------------------------------------------------------------------
>
> Key: SPARK-50257
> URL: https://issues.apache.org/jira/browse/SPARK-50257
> Project: Spark
> Issue Type: Improvement
> Components: Spark Core
> Affects Versions: 4.0.0
> Reporter: guihuawen
> Priority: Major
> Fix For: 4.0.0
>
> Attachments: 截屏2024-11-07 13.52.45.png
>
>
> [SQL]
> {code:sql}
> SELECT
>   /*+ SHUFFLE_MERGE(b) */
>   s_date,
>   SUM(s_quantity * i_price) AS total_sales
> FROM
>   sales a
>   JOIN items b ON s_item_id = i_item_id
> WHERE
>   i_price < 10
> GROUP BY
>   s_date WITH ROLLUP;
> {code}
> Set spark.sql.shuffle.partitions=1000.
> After AQE:
> !截屏2024-11-07 13.52.45.png|width=444,height=431!
> The number of parallel reads in the ExpandExec stage has been reduced to 71,
> lowering parallelism. ExpandExec can expand the data, so the lower
> parallelism can result in longer task execution times.
> If AQE is turned off entirely, the other stages lose the benefit of AQE
> optimization. This issue proposes that if the current stage contains
> ExpandExec, partition coalescing is not performed for that stage.
>
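The proposal in the description can be illustrated with a small toy model. This is only a sketch under stated assumptions, not Spark's CoalesceShufflePartitions implementation: PlanNode, contains_operator, coalesce_partitions and plan_stage_partitions are hypothetical names, the greedy merge is a simplified stand-in for partition coalescing, and the sizes and the target of 64 are arbitrary illustration values.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PlanNode:
    """Toy stand-in for a physical plan node (hypothetical, not Spark's SparkPlan)."""
    name: str
    children: List["PlanNode"] = field(default_factory=list)


def contains_operator(plan: PlanNode, operator_name: str) -> bool:
    """Return True if the node or any descendant has the given operator name."""
    if plan.name == operator_name:
        return True
    return any(contains_operator(child, operator_name) for child in plan.children)


def coalesce_partitions(sizes: List[int], target_size: int) -> List[int]:
    """Greedily merge adjacent partitions while the merged size stays within target_size."""
    groups: List[int] = []
    current: List[int] = []
    for size in sizes:
        if current and sum(current) + size > target_size:
            groups.append(sum(current))
            current = []
        current.append(size)
    if current:
        groups.append(sum(current))
    return groups


def plan_stage_partitions(plan: PlanNode, sizes: List[int], target_size: int) -> List[int]:
    """Skip coalescing when the stage contains ExpandExec (the behavior proposed above)."""
    if contains_operator(plan, "ExpandExec"):
        return list(sizes)  # keep the original shuffle partitioning
    return coalesce_partitions(sizes, target_size)


# Two hypothetical stages reading the same shuffle output.
expand_stage = PlanNode(
    "HashAggregateExec",
    [PlanNode("ExpandExec", [PlanNode("ShuffleQueryStageExec")])],
)
plain_stage = PlanNode("HashAggregateExec", [PlanNode("ShuffleQueryStageExec")])
sizes = [10] * 12

print(len(plan_stage_partitions(plain_stage, sizes, 64)))   # coalesced
print(len(plan_stage_partitions(expand_stage, sizes, 64)))  # left unchanged
```

In this sketch the stage without ExpandExec is merged into fewer, larger partitions, while the stage containing ExpandExec keeps all of its original partitions, so its parallelism is preserved and the other stage still benefits from coalescing.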
--
This message was sent by Atlassian Jira
(v8.20.10#820010)