alamb opened a new pull request, #9780: URL: https://github.com/apache/arrow-datafusion/pull/9780
(THIS IS NOT READY FOR REVIEW -- I am simply putting it up here to save my in-progress work)

## Which issue does this PR close?

Part of https://github.com/apache/arrow-datafusion/issues/9637

## Rationale for this change

See https://github.com/apache/arrow-datafusion/issues/9637

TLDR: the optimizer currently copies logical plans many times unnecessarily. This is both slow and requires many memory allocations.

I am trying to keep this PR relatively small, and I have other improvements in mind for follow-on PRs (see below).

This PR is now finally feasible thanks to the really nice TreeNode cleanup from peter toth and Berka.

## What changes are included in this PR?

1. Rewrite `Optimizer` to use `TreeNode` API
2. Add ability to rewrite `LogicalPlan` in place (aka `&mut LogicalPlan`)

## Are these changes tested?

Functionally covered by existing tests.

Performance results: TODO

## Are there any user-facing changes?

1. Faster planning
2. Ability to rewrite `LogicalPlan` in place using `TreeNode` API

## Planned Follow Ons:

1. Add API to `OptimizerRule` to rewrite in place (and avoid yet another copy)
2. Add an example of how to use the "in place tree write" API

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at: [email protected]
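For readers unfamiliar with the "rewrite in place (aka `&mut LogicalPlan`)" idea in this PR description, the following is a minimal sketch of the general technique on a toy tree. It is not DataFusion code: the `Plan` enum, `children_mut`, and `remove_trivial_filters` are hypothetical names invented for illustration, and the real `LogicalPlan` and `TreeNode` APIs differ. The point it tries to show is that a rule taking `&mut` can restructure a tree by moving nodes rather than cloning them.

```rust
// Toy stand-in for a logical plan (hypothetical; NOT DataFusion's `LogicalPlan`).
#[derive(Debug, Clone, PartialEq)]
enum Plan {
    Scan { table: String },
    Filter { predicate: String, input: Box<Plan> },
    Limit { n: usize, input: Box<Plan> },
}

impl Plan {
    /// Mutable references to this node's direct children.
    fn children_mut(&mut self) -> Vec<&mut Plan> {
        match self {
            Plan::Scan { .. } => vec![],
            Plan::Filter { input, .. } | Plan::Limit { input, .. } => vec![&mut **input],
        }
    }
}

/// Removes `Filter` nodes whose predicate is the literal string "true",
/// rewriting `plan` in place, bottom-up. Returns whether anything changed.
/// No subtree is cloned: children are moved into their new position.
fn remove_trivial_filters(plan: &mut Plan) -> bool {
    let mut changed = false;

    // Rewrite the children first (bottom-up traversal).
    for child in plan.children_mut() {
        changed |= remove_trivial_filters(child);
    }

    let is_trivial =
        matches!(&*plan, Plan::Filter { predicate, .. } if predicate.as_str() == "true");

    if is_trivial {
        // Take ownership of the node by swapping in a cheap placeholder
        // (an empty `String` does not allocate), then move the child up.
        let placeholder = Plan::Scan { table: String::new() };
        if let Plan::Filter { input, .. } = std::mem::replace(plan, placeholder) {
            *plan = *input;
        }
        changed = true;
    }

    changed
}
```

A caller would build a `Plan`, pass `&mut plan` to `remove_trivial_filters`, and use the returned flag to decide whether another pass is needed. A version that instead took `&Plan` and returned a new `Plan` would have to clone every untouched subtree, which is the kind of copying the PR description says it wants to avoid.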
