Zhen Wang created SPARK-51203:
---------------------------------

             Summary: ForceOptimizeSkewedJoin does not take effect for child 
aggregations in skewed join
                 Key: SPARK-51203
                 URL: https://issues.apache.org/jira/browse/SPARK-51203
             Project: Spark
          Issue Type: Improvement
          Components: SQL
    Affects Versions: 4.0.0
            Reporter: Zhen Wang


ForceOptimizeSkewedJoin allows a skewed join to be optimized even if doing so
introduces an extra shuffle. Currently it only takes effect for an aggregation
after the join, not for aggregations in the children of the join. For example:
{code:java}
HashAggregate
     |
  Exchange 
     |
HashAggregate     Exchange (skewed side)
     |               |
   Sort            Sort
     \               /
 SortMergeJoin(isSkewJoin = true) {code}
When ForceOptimizeSkewedJoin is enabled, can we introduce an extra shuffle for the
join child so that the skewed join can still be optimized? For example:
{code:java}
  HashAggregate
       |
    Exchange 
       |
  HashAggregate     
       |
    Exchange
(force extra shuffle)       Exchange (skewed side)
       |                       |
     Sort                     Sort
       \                      /
      SortMergeJoin(isSkewJoin = true) {code}
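
The proposed rewrite can be illustrated with a small toy model. This is only a sketch of the idea in plain Python, not Spark's planner code: the `Node` class and the helpers `chain`, `op_names` and `force_shuffle_under_sort` are hypothetical names made up for this example, and the `force` argument stands in for ForceOptimizeSkewedJoin being enabled. It assumes the join child qualifies for skew handling only when its Sort sits directly on an Exchange, and adds one Exchange when that is not the case.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Node:
    """A toy physical-plan node: an operator name plus its children."""
    op: str
    children: Tuple["Node", ...] = ()


def chain(*names: str) -> Node:
    """Build a single-child chain, listed from the join side to the leaf."""
    if not names:
        raise ValueError("at least one operator name is required")
    node: Optional[Node] = None
    for name in reversed(names):
        node = Node(name, (node,) if node is not None else ())
    return node


def op_names(node: Node) -> List[str]:
    """List operator names along the first-child path, join side first."""
    names = [node.op]
    while node.children:
        node = node.children[0]
        names.append(node.op)
    return names


def force_shuffle_under_sort(join_child: Node, force: bool) -> Node:
    """Return the join child, adding an Exchange under Sort when needed.

    The child is returned unchanged if `force` is off, if it is not a Sort,
    or if the Sort already sits directly on an Exchange.
    """
    if not force or join_child.op != "Sort" or not join_child.children:
        return join_child
    below = join_child.children[0]
    if below.op == "Exchange":
        return join_child
    extra = Node("Exchange", (below,))  # the forced extra shuffle
    return Node("Sort", (extra,) + join_child.children[1:])


# Left join child from the first plan above: Sort over the final
# HashAggregate, which sits over an Exchange and a partial HashAggregate.
left = chain("Sort", "HashAggregate", "Exchange", "HashAggregate")
rewritten = force_shuffle_under_sort(left, force=True)
print(" -> ".join(op_names(rewritten)))
# prints: Sort -> Exchange -> HashAggregate -> Exchange -> HashAggregate
```

With `force=False` the child is returned unchanged, which corresponds to the current behaviour described in this issue. Whether the extra shuffle is worthwhile in a real plan would still depend on Spark's own cost evaluation, which this sketch does not model.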



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
