ulysses-you commented on PR #41609:
URL: https://github.com/apache/spark/pull/41609#issuecomment-1619603140

   I see the issue: when a shuffle-based join is converted to a broadcast hash 
join in AQE, some partitions can be skewed because of the shuffle. For shuffle-based 
joins, we already handle these skewed partitions by splitting them and replicating 
the other side. However, we do not apply this optimization to broadcast hash joins.
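
   To make the "split and replicate" idea concrete, here is a rough, self-contained sketch. It is not Spark's actual code: the function names and the task tuple format are hypothetical. The two constants only mirror the documented defaults of `spark.sql.adaptive.skewJoin.skewedPartitionFactor` and `spark.sql.adaptive.skewJoin.skewedPartitionThresholdInBytes` as I understand them. A partition is treated as skewed when it is larger than both the factor times the median partition size and the byte threshold; a skewed partition is then split by map-output ranges, and each resulting task would read the whole matching partition from the other side.

```python
from statistics import median

MB = 1024 * 1024
# Assumed values, chosen to mirror the documented skew-join defaults.
SKEW_FACTOR = 5
SKEW_THRESHOLD_BYTES = 256 * MB


def is_skewed(size, median_size):
    """A partition counts as skewed only if it exceeds both limits."""
    return size > median_size * SKEW_FACTOR and size > SKEW_THRESHOLD_BYTES


def split_by_map_ranges(map_sizes, target_size):
    """Group consecutive map outputs into [start, end) ranges of about target_size bytes."""
    ranges, start, acc = [], 0, 0
    for i, size in enumerate(map_sizes):
        if acc > 0 and acc + size > target_size:
            ranges.append((start, i))
            start, acc = i, 0
        acc += size
    ranges.append((start, len(map_sizes)))
    return ranges


def plan_tasks(map_sizes_by_partition, target_size):
    """Return (partition_id, map_start, map_end) tasks for the skewed-capable side.

    Every task for a given partition_id is assumed to read the full
    partition from the other side, i.e. the other side is replicated.
    """
    totals = {pid: sum(sizes) for pid, sizes in map_sizes_by_partition.items()}
    median_size = median(totals.values())
    tasks = []
    for pid, sizes in sorted(map_sizes_by_partition.items()):
        if is_skewed(totals[pid], median_size):
            for start, end in split_by_map_ranges(sizes, target_size):
                tasks.append((pid, start, end))
        else:
            tasks.append((pid, 0, len(sizes)))
    return tasks


# Partition 1 holds 1600 MB, well above 5x the 120 MB median and above 256 MB.
example = {
    0: [50 * MB, 50 * MB],
    1: [400 * MB, 400 * MB, 400 * MB, 400 * MB],
    2: [60 * MB, 60 * MB],
}
tasks = plan_tasks(example, target_size=800 * MB)
```

   With this input, only partition 1 is split (into two tasks covering map outputs 0-1 and 2-3). The gap described above is that nothing equivalent happens once the join has been converted to a broadcast hash join.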
   
   I think it makes sense to optimize skew for broadcast hash joins. But the 
optimization still seems useful when localShuffleReaderEnabled is on. Here is an 
attempt to resolve skewed partitions with the local shuffle reader: 
https://github.com/apache/spark/pull/40312.
   Can we handle both of these cases?
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


