cloud-fan commented on PR #44013:
URL: https://github.com/apache/spark/pull/44013#issuecomment-2421167393

   Hi @ulysses-you, we tried to use this `RuleContext` framework and found some design issues:
   - The AQE rules are not always stage-local; they can transform the whole plan, so it doesn't make sense to put stage information in the `RuleContext`. For example, how can the `OptimizeSkewedJoin` rule leverage it? What does `isFinalStage` even mean in that context?
   - The protocol for setting these plan-fragment-level configs is very hacky. The test uses one custom rule to update the `RuleContext` and expects another rule to then read from it. This assumes a single global `RuleContext` instance shared between all rules, which gets quite messy when a rule transforms the whole plan and has to deal with multiple stages (see the sketch after this list).
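
   To make the coupling concrete, here is a minimal sketch of that protocol, assuming a hypothetical `RuleContext` with `setConf`/`getConf`; the names are illustrative, not the PR's actual API:

   ```scala
   import scala.collection.mutable

   // Hypothetical shared context; illustrative names, not the PR's exact API.
   class RuleContext(val isFinalStage: Boolean) {
     private val confs = mutable.Map.empty[String, String]
     def setConf(key: String, value: String): Unit = confs(key) = value
     def getConf(key: String): Option[String] = confs.get(key)
   }

   object SharedContextProtocol {
     // Rule A writes a fragment-level conf into the context it was handed...
     def ruleA(ctx: RuleContext): Unit =
       ctx.setConf("spark.sql.shuffle.partitions", "8")

     // ...and rule B only sees the value if it receives the very same instance.
     def ruleB(ctx: RuleContext): Option[String] =
       ctx.getConf("spark.sql.shuffle.partitions")
   }

   // This only works if one global RuleContext is shared by all rules; a rule
   // that transforms the whole plan (e.g. OptimizeSkewedJoin) spans multiple
   // stages at once, so a single isFinalStage flag or conf map is ambiguous.
   ```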
   
   I think a better design is to put the context (plan-fragment-level confs, or something more general) in the query stage itself. Then all the rules can either update or consume the query stage context. The only problem is that we don't have a query stage node for the final result stage, but we should add one (@liuzqt is working on it).
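
   To illustrate the idea, here is a minimal sketch, assuming a hypothetical `StageContext` carried by the stage node itself; `StageContext`, `QueryStage`, `set`, and `get` are illustrative names, not existing Spark APIs:

   ```scala
   // A per-stage key/value bag carried by the stage node.
   case class StageContext(confs: Map[String, String] = Map.empty) {
     def set(key: String, value: String): StageContext =
       copy(confs = confs + (key -> value))
     def get(key: String): Option[String] = confs.get(key)
   }

   // Illustrative stand-in for a query stage node; in practice the context
   // would live on QueryStageExec (and on the proposed final-result stage node).
   case class QueryStage(id: Int, context: StageContext = StageContext())

   object StageLocalConfs {
     // A rule that decides a fragment-level conf attaches it to that stage...
     def setPartitions(stage: QueryStage, n: Int): QueryStage =
       stage.copy(context =
         stage.context.set("spark.sql.shuffle.partitions", n.toString))

     // ...and any later rule reads it back from the same node, even while
     // walking a whole plan with many stages, so no global context is needed.
     def partitions(stage: QueryStage): Option[Int] =
       stage.context.get("spark.sql.shuffle.partitions").map(_.toInt)
   }
   ```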

