JkSelf commented on pull request #31941: URL: https://github.com/apache/spark/pull/31941#issuecomment-807937198
> I think it is better to make use of the AQE framework to reuse the broadcast exchange or newQueryStage.

@cloud-fan I may need to explain a little more about this.

1. In my understanding, the `PlanDynamicPruningFilters` rule only checks whether there is an exchange that can be reused, in order to decide whether to insert DPP. The actual reuse happens in the `ReuseExchange` rule. I think this separation is clearer.
2. When AQE is enabled, we implemented the `ReuseExchange` rule in the AQE framework. When an exchange is created, we look in the `stageCache` for an exchange that can be reused, and if there is one, we reuse it.
3. In the `PlanAdaptiveDynamicPruningFilters` rule, I am more inclined to follow the idea of the `PlanDynamicPruningFilters` rule: only add the DPP filter after checking whether there is an exchange that can be reused. The actual reuse is left to the AQE framework, instead of having the `PlanAdaptiveDynamicPruningFilters` rule look in the `stageCache` to create the reused exchange or call the `newQueryStage` method to create a new query stage.

Of course, we did this in [PR#31258](https://github.com/apache/spark/pull/31258). But I think we may need to make some improvements in subsequent implementations.
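To illustrate the division of responsibilities described in the three points above, here is a minimal toy sketch in Python. It is not Spark code: `Exchange`, `ToyAdaptiveFramework`, `should_insert_dpp`, and the plan strings are hypothetical names made up for this example, and only `stage_cache` and `new_query_stage` loosely echo the `stageCache` and `newQueryStage` mentioned in the discussion. The assumption is that exchanges can be compared by some canonical form of their plan.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class Exchange:
    """Toy stand-in for a broadcast exchange, identified by a canonical plan string."""
    canonical_plan: str


@dataclass
class ToyAdaptiveFramework:
    """Hypothetical model of the framework side: it owns the cache and performs reuse."""
    stage_cache: Dict[str, int] = field(default_factory=dict)
    created_stages: List[int] = field(default_factory=list)

    def new_query_stage(self, exchange: Exchange) -> int:
        # Reuse an existing stage if an equivalent exchange was already registered.
        cached = self.stage_cache.get(exchange.canonical_plan)
        if cached is not None:
            return cached
        # Otherwise create a new stage and remember it for later lookups.
        stage_id = len(self.created_stages)
        self.created_stages.append(stage_id)
        self.stage_cache[exchange.canonical_plan] = stage_id
        return stage_id


def should_insert_dpp(build_side: Exchange, planned: List[Exchange]) -> bool:
    """Toy pruning rule: it only decides, and never touches the stage cache."""
    return any(e.canonical_plan == build_side.canonical_plan for e in planned)


# Usage: the join's broadcast is planned first, then the pruning filter is considered.
framework = ToyAdaptiveFramework()
join_broadcast = Exchange("broadcast(dim_table filtered)")
join_stage = framework.new_query_stage(join_broadcast)

dpp_build = Exchange("broadcast(dim_table filtered)")
dpp_stage = None
if should_insert_dpp(dpp_build, [join_broadcast]):
    # The rule only said "yes"; the framework is what actually reuses the stage.
    dpp_stage = framework.new_query_stage(dpp_build)
```

In this toy model the pruning rule never creates or looks up stages itself, so the reuse logic lives in one place, which is the separation argued for above.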
