maryannxue opened a new pull request #25703: [SPARK-29002][SQL] Avoid changing 
SMJ to BHJ if the build side has a high ratio of empty partitions
URL: https://github.com/apache/spark/pull/25703
 
 
   ### What changes were proposed in this pull request?
   This PR aims to avoid AQE regressions by not converting a sort merge join 
to a broadcast hash join when the expected build plan has a high ratio of empty 
partitions, in which case sort merge join can actually perform faster. It does 
so by adding an internal join hint that tells the planner which side has a high 
ratio of empty partitions, so that the planner avoids choosing that side as the 
build plan of a BHJ. The other side is not affected: it can still be planned as 
the build plan of a BHJ if it qualifies.
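
   The decision described above can be sketched roughly as follows. This is a 
standalone illustration, not the code in this PR: `SideStats`, 
`avoid_as_build_side`, `choose_join`, and the `0.2` default ratio are all 
hypothetical names and values chosen for the example, and the "smaller side 
wins" tie-break is an assumption.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class SideStats:
    """Hypothetical runtime statistics for one join side: bytes per partition."""
    partition_sizes: Sequence[int]

    @property
    def total_size(self) -> int:
        return sum(self.partition_sizes)

    @property
    def non_empty_ratio(self) -> float:
        # With no partition statistics there is no evidence of emptiness.
        if not self.partition_sizes:
            return 1.0
        non_empty = sum(1 for size in self.partition_sizes if size > 0)
        return non_empty / len(self.partition_sizes)


def avoid_as_build_side(side: SideStats, min_non_empty_ratio: float) -> bool:
    """Plays the role of the internal hint: True if this side is mostly empty."""
    return side.non_empty_ratio < min_non_empty_ratio


def choose_join(
    left: SideStats,
    right: SideStats,
    broadcast_threshold: int,
    min_non_empty_ratio: float = 0.2,  # assumed value, for illustration only
) -> Tuple[str, Optional[str]]:
    """Return ("BHJ", build_side) or ("SMJ", None)."""
    # A side may be the build side only if it is small enough to broadcast
    # and is not flagged as having a high ratio of empty partitions.
    candidates = [
        (stats.total_size, name)
        for name, stats in (("left", left), ("right", right))
        if stats.total_size <= broadcast_threshold
        and not avoid_as_build_side(stats, min_non_empty_ratio)
    ]
    if not candidates:
        return ("SMJ", None)
    # Assumption: when both sides qualify, prefer the smaller one.
    _, build_side = min(candidates)
    return ("BHJ", build_side)
```

   In this sketch, a small side with nine empty partitions out of ten is never 
picked as the build side, but the join can still become a BHJ if the other side 
is small and mostly non-empty; otherwise the plan stays a sort merge join.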
   
   ### Why are the changes needed?
   It is a performance improvement for AQE.
   
   ### Does this PR introduce any user-facing change?
   No.
   
   ### How was this patch tested?
   Added unit tests.
   

