[ 
https://issues.apache.org/jira/browse/SPARK-35264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17339958#comment-17339958
 ] 

ulysses you commented on SPARK-35264:
-------------------------------------

[~dongjoon] Thank you for moving this to the AQE umbrella; I missed that before.

> Support AQE side broadcastJoin threshold
> ----------------------------------------
>
>                 Key: SPARK-35264
>                 URL: https://issues.apache.org/jira/browse/SPARK-35264
>             Project: Spark
>          Issue Type: Sub-task
>          Components: SQL
>    Affects Versions: 3.2.0
>            Reporter: ulysses you
>            Assignee: ulysses you
>            Priority: Major
>             Fix For: 3.2.0
>
>
> The main idea here is to isolate the join configs of the normal planner 
> and the AQE planner, which share the same code path.
> We do not fully trust static statistics when deciding whether a join can 
> be planned as a broadcast hash join. In our experience it is very common for 
> Spark to throw a broadcast timeout or a driver-side OOM exception when 
> executing a fairly large plan. A broadcast join is also not reversible: if we 
> convert a join to a broadcast hash join the first time, we (AQE) cannot 
> optimize it again. So it makes sense to decide whether to broadcast on the 
> AQE side, using a different SQL config.
> To achieve this, we insert a specific join hint in advance within the AQE 
> framework, and JoinSelection then picks up and follows the inserted hint.
> For now we only support selecting a strategy for equi-joins, in this order:
>  1. mark the join as a broadcast hash join if possible
>  2. mark the join as a shuffled hash join if possible
> Note that we don't override the join strategy if the user specifies a join hint.
>  
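
The selection order described above can be sketched as a small standalone Python function. This is only an illustration of the decision logic, not Spark's actual code: `JoinInfo`, `aqe_join_hint`, the hint labels, and the `shuffled_hash_ok` flag are hypothetical names, and treating a negative threshold as "broadcast disabled" is an assumption of this sketch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class JoinInfo:
    """Hypothetical summary of a join as seen on the AQE side."""
    is_equi_join: bool
    user_hint: Optional[str]    # hint written by the user, None if absent
    runtime_build_bytes: int    # build-side size measured at runtime
    shuffled_hash_ok: bool      # stand-in for the shuffled hash join check


def aqe_join_hint(join: JoinInfo, aqe_broadcast_threshold: int) -> Optional[str]:
    """Return the hint to insert for JoinSelection, or None to leave the join as-is."""
    # A user-specified hint is never overridden.
    if join.user_hint is not None:
        return None
    # Only equi-joins are handled for now.
    if not join.is_equi_join:
        return None
    # 1. Prefer broadcast hash join, judged against the AQE-side threshold
    #    (assumption: a negative threshold disables broadcast).
    if 0 <= aqe_broadcast_threshold and join.runtime_build_bytes <= aqe_broadcast_threshold:
        return "BROADCAST_HASH"
    # 2. Otherwise fall back to shuffled hash join if it is allowed.
    if join.shuffled_hash_ok:
        return "SHUFFLED_HASH"
    return None
```

Because the threshold is passed in separately, the AQE side can use a different value from the one the normal planner applies to static statistics, which is the isolation the description asks for.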



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
