cloud-fan commented on pull request #30829: URL: https://github.com/apache/spark/pull/30829#issuecomment-768047646
Ah, sorry for my bad memory, so the comment in `OptimizeSkewedJoin` is actually the stale one... I get your point that it's like backtracking, but it does make the control flow more complicated. Using exceptions for control flow is also an anti-pattern (see [here](https://softwareengineering.stackexchange.com/questions/189222/are-exceptions-as-control-flow-considered-a-serious-antipattern-if-so-why)), and we would need more effort to refactor the code of the AQE loop. So I'd like to avoid changing the control flow if possible. I don't see any blockers to running `OptimizeSkewedJoin` in the stage preparation. We can either update `CoalesceShufflePartitions` so that it works even when there are skewed partition specs, or do the coalescing in `OptimizeSkewedJoin`.
