Re: [PR] ExecutionPlan: add APIs for filter pushdown & optimizer rule to apply them [datafusion]

via GitHub Wed, 09 Apr 2025 12:44:49 -0700


adriangb commented on PR #15566:
URL: https://github.com/apache/datafusion/pull/15566#issuecomment-2790824023


   For comparison, here is roughly what I had before doing the recursion as 
part of the `OptimizerRule`: 
https://github.com/pydantic/datafusion/blob/fbf93a2bdd0a5c1532336026dfa71ac7305c1655/datafusion/physical-optimizer/src/filter_pushdown.rs
   
   As you can see, it ends up splitting the flow into multiple steps:
   1. Ask the current node for any filters it wants to push down and which 
filters can be pushed down (possibly with modification) into which children.
   2. Recurse into the children.
   3. Carry the result of the pushdown for each filter back up.
   4. Combine results for each child together to get an overall pushed/not 
pushed for each filter.
   5. Ask the current node if it can handle any of the remaining filters.
   6. Return to the caller.
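
A minimal sketch of the six steps above, to make the state tracking concrete. Everything here is hypothetical: `Node`, its fields, `push_down`, and `demo` are toy names invented for illustration and are not DataFusion types or APIs. Filters are plain strings, and the combining rule in step 4 (a filter counts as pushed only if every child reports it handled) is an assumption.

```rust
/// Hypothetical toy plan node; not a DataFusion type.
struct Node {
    /// Filters this node itself wants to push down (step 1).
    wants: Vec<&'static str>,
    /// Filters this node allows through to its children (step 1).
    forwards: Vec<&'static str>,
    /// Filters this node can apply itself (step 5).
    handles: Vec<&'static str>,
    children: Vec<Node>,
}

/// For each filter in `parent_filters`, reports whether it was handled
/// at or below `node`.
fn push_down(node: &Node, parent_filters: &[&'static str]) -> Vec<bool> {
    // Step 1: collect the node's own filters plus the parent's, and keep
    // the ones that may be pushed into the children.
    let mut candidates: Vec<&'static str> = parent_filters.to_vec();
    candidates.extend(node.wants.iter().copied());
    let forwarded: Vec<&'static str> = candidates
        .iter()
        .copied()
        .filter(|f| node.forwards.contains(f))
        .collect();

    // Steps 2 and 3: recurse into each child and carry its per-filter
    // results back up.
    let child_results: Vec<Vec<bool>> = node
        .children
        .iter()
        .map(|child| push_down(child, &forwarded))
        .collect();

    // Step 4: combine the children's answers. Assumption: a filter counts
    // as pushed only if there is at least one child and all children
    // handled it.
    let pushed = |f: &str| -> bool {
        match forwarded.iter().position(|g| *g == f) {
            Some(i) => !child_results.is_empty() && child_results.iter().all(|r| r[i]),
            None => false,
        }
    };

    // Steps 5 and 6: ask the node about whatever remains, then return one
    // pushed / not pushed flag per parent filter to the caller.
    parent_filters
        .iter()
        .map(|f| pushed(*f) || node.handles.contains(f))
        .collect()
}

/// Builds a two-node plan: a root that forwards `a` and `b` and handles `c`,
/// over a leaf that handles only `a`.
fn demo() -> Vec<bool> {
    let leaf = Node {
        wants: vec![],
        forwards: vec![],
        handles: vec!["a"],
        children: vec![],
    };
    let root = Node {
        wants: vec![],
        forwards: vec!["a", "b"],
        handles: vec!["c"],
        children: vec![leaf],
    };
    push_down(&root, &["a", "b", "c"])
}
```

Even in this toy version, one function has to juggle three lists (`candidates`, `forwarded`, `child_results`) and map indices between them. The sketch also silently drops the node's own `wants` filters if no child handles them.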
   
   This all gets pretty complicated, and tracking the state is convoluted. It 
also doesn't handle certain edge cases that I think are fairly easy to handle 
with the new APIs.
   
   


