cloud-fan commented on PR #44013: URL: https://github.com/apache/spark/pull/44013#issuecomment-2421167393
Hi @ulysses-you, we tried to use this `RuleContext` framework and found some design issues:

- The AQE rules are not always stage-local; they can transform the whole plan, so it doesn't make sense to put stage information in the `RuleContext`. For example, how can the `OptimizeSkewedJoin` rule leverage it? What does `isFinalStage` even mean in that context?
- The protocol for setting plan-fragment-level configs is very hacky. The test uses one custom rule to update the `RuleContext` and expects another rule to read it. This assumes a global `RuleContext` instance shared between all rules, which gets quite messy when a rule transforms the whole plan and needs to deal with multiple stages.

I think a better design is to put the context (plan-fragment-level confs, or something more general) in the query stage itself. Then all rules can either update or consume the query stage context. The only problem is that we don't have a query stage node for the final result stage, but we should add one (@liuzqt is working on it).
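To illustrate the proposed direction, here is a minimal sketch (not the actual Spark API; `StageContext`, `QueryStage`, and `setSkewHandling` are hypothetical names) of attaching the context to the query stage node itself, so a rule that transforms the whole plan can still read or update each stage's confs locally instead of going through one shared `RuleContext`:

```scala
// Hypothetical sketch: per-stage context stored on the stage node itself.
case class StageContext(confs: Map[String, String] = Map.empty) {
  def withConf(key: String, value: String): StageContext =
    copy(confs = confs + (key -> value))
}

sealed trait PlanNode
case class QueryStage(id: Int, context: StageContext, children: Seq[PlanNode])
  extends PlanNode
case class Exchange(child: PlanNode) extends PlanNode

// A rule that walks the whole plan can set stage-local confs as it goes,
// because each context travels with its own stage node; no global state.
def setSkewHandling(plan: PlanNode): PlanNode = plan match {
  case s: QueryStage =>
    s.copy(
      context = s.context.withConf("spark.sql.shuffle.partitions", "200"),
      children = s.children.map(setSkewHandling))
  case Exchange(child) =>
    Exchange(setSkewHandling(child))
}
```

A downstream rule would then consume `stage.context` directly, which also makes multi-stage handling straightforward, since each stage carries its own confs.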
