alamb opened a new pull request, #9946: URL: https://github.com/apache/arrow-datafusion/pull/9946
(WIP, not ready for review)

## Which issue does this PR close?

Part of https://github.com/apache/arrow-datafusion/issues/9637 (based on some ideas from https://github.com/apache/arrow-datafusion/pull/9708). 🙏 @jayzhan211

Partially closes https://github.com/apache/arrow-datafusion/pull/9768

## Rationale for this change

Make planning (and everything else) faster by copying less.

## What changes are included in this PR?

Implement the suggestion by @peter-toth (https://github.com/apache/arrow-datafusion/pull/9780#issuecomment-2031702263) on https://github.com/apache/arrow-datafusion/pull/9780 and make the existing tree node API faster.

Basically, this uses a trick to rewrite `Arc<LogicalPlan>` in place (if possible).

This alone doesn't really impact performance.

## Are these changes tested?

Performance tests: TBD

## Are there any user-facing changes?

Not yet, but I think it is needed for performance.

Once this is done, I think I can eliminate a bunch more copies in the optimizer in another PR (by rewriting it to use the TreeNodeRewriter API):

- [ ] Rewrite optimizer to use TreeNodeRewriter (so this code is used)
- [ ] Add special cases / rewrite other optimizer passes to not copy their nodes
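
For readers unfamiliar with the idea of rewriting an `Arc<LogicalPlan>` without copying, here is a minimal sketch of one way it could work. It is not the code in this PR: `Plan`, `unwrap_arc`, and `rewrite` are hypothetical names for illustration, and the only real API used is `Arc::try_unwrap` from the standard library. The assumption is that when the `Arc` is uniquely owned, the node can be moved out and rewritten with no clone, and that a clone is needed only when the node is shared.

```rust
use std::sync::Arc;

// Hypothetical stand-in for `LogicalPlan`; not DataFusion's real type.
#[derive(Debug, Clone, PartialEq)]
enum Plan {
    Scan(String),
    Filter { predicate: String, input: Arc<Plan> },
}

/// Move the plan out of the `Arc` without copying when this is the only
/// reference; otherwise fall back to cloning the inner value.
fn unwrap_arc(plan: Arc<Plan>) -> Plan {
    Arc::try_unwrap(plan).unwrap_or_else(|shared| (*shared).clone())
}

/// Rewrite the children first, then apply `f` to the node itself.
/// `f` takes the node by value, so it can reuse the node's fields
/// instead of building a copy.
fn rewrite<F>(plan: Arc<Plan>, f: &F) -> Arc<Plan>
where
    F: Fn(Plan) -> Plan,
{
    let owned = match unwrap_arc(plan) {
        Plan::Filter { predicate, input } => Plan::Filter {
            predicate,
            input: rewrite(input, f),
        },
        leaf => leaf,
    };
    Arc::new(f(owned))
}
```

In this sketch a uniquely owned tree is rewritten with no deep copies, while a node that is still referenced elsewhere is cloned and the other reference keeps seeing the original value.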
