adriangb commented on code in PR #15801: URL: https://github.com/apache/datafusion/pull/15801#discussion_r2070247089
########## datafusion/physical-plan/src/execution_plan.rs: ########## @@ -471,39 +471,53 @@ pub trait ExecutionPlan: Debug + DisplayAs + Send + Sync { Ok(None) } - /// Attempts to recursively push given filters from the top of the tree into leafs. - /// - /// This is used for various optimizations, such as: - /// - /// * Pushing down filters into scans in general to minimize the amount of data that needs to be materialzied. - /// * Pushing down dynamic filters from operators like TopK and Joins into scans. - /// - /// Generally the further down (closer to leaf nodes) that filters can be pushed, the better. - /// - /// Consider the case of a query such as `SELECT * FROM t WHERE a = 1 AND b = 2`. - /// With no filter pushdown the scan needs to read and materialize all the data from `t` and then filter based on `a` and `b`. - /// With filter pushdown into the scan it can first read only `a`, then `b` and keep track of - /// which rows match the filter. - /// Then only for rows that match the filter does it have to materialize the rest of the columns. - /// - /// # Default Implementation - /// - /// The default implementation assumes: - /// * Parent filters can't be passed onto children. - /// * This node has no filters to contribute. - /// - /// # Implementation Notes - /// - /// Most of the actual logic is implemented as a Physical Optimizer rule. - /// See [`PushdownFilter`] for more details. - /// - /// [`PushdownFilter`]: https://docs.rs/datafusion/latest/datafusion/physical_optimizer/filter_pushdown/struct.PushdownFilter.html - fn try_pushdown_filters( Review Comment: Agreed I'd prefer a single method as well - but as per https://github.com/apache/datafusion/pull/15770#discussion_r2052311257 we probably would have had to add a new method to the existing implementation anyway. I don't see a way to have a single method without making it absurdly convoluted just to avoid 2 methods. I'll also point out that there are multiple dimensions of filter pushdown for each node: - does the node allow parent filters to be pushed down to it's children? - does the node want to add any filters to be passed to it's children? - does the node need to handle the result of pushdown? Having two methods makes it easy to have canned implementations for each one of these independently and combine/compose them. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For additional commands, e-mail: github-h...@datafusion.apache.org