adriangb commented on code in PR #15770:
URL: https://github.com/apache/datafusion/pull/15770#discussion_r2150949273
##########
datafusion/physical-plan/src/execution_plan.rs:
##########
@@ -509,8 +510,22 @@ pub trait ExecutionPlan: Debug + DisplayAs + Send + Sync {
///
/// The default implementation bars all parent filters from being pushed
down and adds no new filters.
/// This is the safest option, making filter pushdown opt-in on a per-node
pasis.
+ ///
+ /// Since this may perform deep modifications to the plan tree it is
called early in the optimization phase
+ /// and is not expected to be called multiple times on the same plan.
+ ///
+ /// A quick summary of the phases is below, see [`FilterPushdownPhase`]
for more details:
+ /// - [`FilterPushdownPhase::Pre`]: Filters get pushded down before most
other optimizations are applied.
+ /// At this stage the plan can be modified (e.g. when
[`ExecutionPlan::handle_child_pushdown_result`] is called the plan may choose
to return an entirely new plan tree)
+ /// but subsequent optimizations may also rewrite the plan tree
drastically, thus it is *not guaranteed* that a [`PhysicalExpr`] can hold on to
a reference to the plan tree.
+ /// During this phase static filters (such as `col = 1`) are pushed down.
+ /// - [`FilterPushdownPhase::Post`]: Filters get pushed down after most
other optimizations are applied.
+ /// At this stage the plan tree is expected to be stable and not change
drastically, and operators that do filter pushdown during this phase should
also not change the plan tree.
Review Comment:
Sort of. I am mincing my words in the comments because the reality is that
to push down filters into `DataSourceExec` a new `DataSourceExec` has to be
created and the whole tree has to be replaced "in place" to reference the new
children. But the structure of the plan does not change, and it's pretty much
guaranteed that `ExecutionPlan::new_with_children` does the right thing in
terms of preserving internal state that might be pointed to (unlike
`EnforceSorting`).
I'm not sure how to detail that in a comment, it's somewhat confusing.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]