[GitHub] [spark] cloud-fan commented on pull request #30829: [SPARK-33832][SQL] Add an option in AQE to mitigate skew even if it c…

GitBox Tue, 19 Jan 2021 21:58:27 -0800


cloud-fan commented on pull request #30829:
URL: https://github.com/apache/spark/pull/30829#issuecomment-763355925



   `OptimizeSkewedJoin` already runs `EnsureRequirements` inside it, so the 
only change we need is to not give up the optimization when extra shuffles 
are added.
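
   A minimal sketch of the decision described above, written as a self-contained Python toy model rather than Spark's actual Scala code. `Plan`, `ensure_requirements`, `count_shuffles`, `optimize_skew_join`, and the `force` flag are hypothetical stand-ins. Only the idea comes from the comment: check how many shuffles `EnsureRequirements` would add, and optionally keep the skew-optimized plan anyway.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Plan:
    """Toy physical plan node; not a Spark class."""
    name: str
    children: List["Plan"] = field(default_factory=list)
    # Hypothetical flag: True if skew handling changed this node's output
    # partitioning, so a shuffle has to be inserted above it.
    needs_shuffle: bool = False


def ensure_requirements(plan: Plan) -> Plan:
    """Toy stand-in for EnsureRequirements: wrap nodes that need a shuffle."""
    children = [ensure_requirements(c) for c in plan.children]
    node = Plan(plan.name, children)
    return Plan("Shuffle", [node]) if plan.needs_shuffle else node


def count_shuffles(plan: Plan) -> int:
    """Count the shuffle nodes in a toy plan tree."""
    own = 1 if plan.name == "Shuffle" else 0
    return own + sum(count_shuffles(c) for c in plan.children)


def optimize_skew_join(original: Plan, skew_optimized: Plan, force: bool) -> Plan:
    """Keep the skew-optimized plan unless it adds shuffles and force is off."""
    before = count_shuffles(ensure_requirements(original))
    fixed = ensure_requirements(skew_optimized)
    extra = count_shuffles(fixed) - before
    if extra > 0 and not force:
        return original  # give up the optimization (behaviour without the option)
    return fixed  # accept the extra shuffles to mitigate skew


# Usage: the skewed join needs one extra shuffle above it.
original = Plan("Agg", [Plan("SortMergeJoin", [Plan("Scan_a"), Plan("Scan_b")])])
optimized = Plan("Agg", [
    Plan("SkewedSortMergeJoin", [Plan("Scan_a"), Plan("Scan_b")], needs_shuffle=True)
])
kept = optimize_skew_join(original, optimized, force=True)
```

   With `force=False` the function returns the original plan unchanged. With `force=True` it returns the optimized plan with the extra `Shuffle` node in place.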
   
   A query stage is quite self-contained. `queryStagePreparationRules` can't 
see the query plan inside a query stage. During re-optimization, a query stage 
becomes a leaf node, `LogicalQueryStage`.
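
   To illustrate why rules that traverse the plan cannot look inside a stage, here is a small Python toy model (not Spark code). `Node`, `wrap_stage`, and `visible_names` are hypothetical names. The only assumption taken from the comment is that a query stage is represented as a leaf during re-optimization.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """Toy plan node; not a Spark class."""
    name: str
    children: List["Node"] = field(default_factory=list)
    # Plan of a wrapped stage, kept outside `children` so traversals skip it.
    hidden_plan: Optional["Node"] = None


def wrap_stage(stage_plan: Node) -> Node:
    """Toy analogue of a LogicalQueryStage leaf: no children, plan hidden."""
    return Node("LogicalQueryStage", children=[], hidden_plan=stage_plan)


def visible_names(plan: Node) -> List[str]:
    """Names a rule would see when walking the tree through `children`."""
    return [plan.name] + [n for c in plan.children for n in visible_names(c)]


stage = Node("Shuffle", [Node("Scan_a")])
plan = Node("Join", [wrap_stage(stage), Node("Scan_b")])
names = visible_names(plan)
```

   The traversal sees the `LogicalQueryStage` leaf but never reaches the `Shuffle` or `Scan_a` nodes inside the stage.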
   
   One problem is that `queryStageOptimizerRules` can't assume that a query 
stage has no shuffles in the middle. We need to revisit these rules.


