adriangb commented on code in PR #15770: URL: https://github.com/apache/datafusion/pull/15770#discussion_r2150949273
########## datafusion/physical-plan/src/execution_plan.rs: ########## @@ -509,8 +510,22 @@ pub trait ExecutionPlan: Debug + DisplayAs + Send + Sync { /// /// The default implementation bars all parent filters from being pushed down and adds no new filters. /// This is the safest option, making filter pushdown opt-in on a per-node pasis. + /// + /// Since this may perform deep modifications to the plan tree it is called early in the optimization phase + /// and is not expected to be called multiple times on the same plan. + /// + /// A quick summary of the phases is below, see [`FilterPushdownPhase`] for more details: + /// - [`FilterPushdownPhase::Pre`]: Filters get pushded down before most other optimizations are applied. + /// At this stage the plan can be modified (e.g. when [`ExecutionPlan::handle_child_pushdown_result`] is called the plan may choose to return an entirely new plan tree) + /// but subsequent optimizations may also rewrite the plan tree drastically, thus it is *not guaranteed* that a [`PhysicalExpr`] can hold on to a reference to the plan tree. + /// During this phase static filters (such as `col = 1`) are pushed down. + /// - [`FilterPushdownPhase::Post`]: Filters get pushed down after most other optimizations are applied. + /// At this stage the plan tree is expected to be stable and not change drastically, and operators that do filter pushdown during this phase should also not change the plan tree. Review Comment: Sort of. I am mincing my words in the comments because the reality is that to push down filters into `DataSourceExec` a new `DataSourceExec` has to be created and the whole tree has to be replaced "in place" to reference the new children. But the structure of the plan does not change, and it's pretty much guaranteed that `ExecutionPlan::new_with_children` does the right thing in terms of preserving internal state that might be pointed to (unlike `EnforceSorting`). I'm not sure how to detail that in a comment, it's somewhat confusing. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For additional commands, e-mail: github-h...@datafusion.apache.org