manuzhang commented on pull request #30494: URL: https://github.com/apache/spark/pull/30494#issuecomment-734270568
@maryannxue, @cloud-fan, sorry for not raising this earlier, but I'd like to discuss a case that does not seem to be covered here:

1. The final stage *before the write* is a `SortMergeJoin` whose partitioning matches the target table.
2. AQE switches the `SortMergeJoin` to a `BroadcastHashJoin` because one side is smaller than the broadcast threshold.
3. `OptimizeLocalShuffleReader` is then applied to the probe side, which has a different partitioning, and this breaks the user-intended partitioning.
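
One way to check whether the local shuffle reader is what changes the output partitioning in the case above is to run the same job with that rule switched off and compare the written files. The snippet below is only a minimal sketch of such a check, not a proposed fix: the object name, app name and `local[*]` master are placeholders, and it assumes that with `spark.sql.adaptive.localShuffleReader.enabled` set to `false` the probe side is still read through its regular shuffle, even if AQE converts the join to a `BroadcastHashJoin`.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical harness: only shows where the two settings go.
object LocalShuffleReaderCheck {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")                      // placeholder master
      .appName("aqe-local-shuffle-reader-check") // placeholder name
      // AQE enabled, as in the scenario described above.
      .config("spark.sql.adaptive.enabled", "true")
      .getOrCreate()

    // Assumption: disabling the local shuffle reader leaves the probe side's
    // shuffle read (and hence its partitioning) as planned.
    spark.conf.set("spark.sql.adaptive.localShuffleReader.enabled", "false")

    // ... run the join and the write here, then compare the output layout
    // with a run where the setting above is left at its default ...

    spark.stop()
  }
}
```

If the partitioning of the written data only differs when the setting is left enabled, that would point at the local shuffle reader rather than at the join conversion itself.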
