matthewgapp commented on code in PR #7581: URL: https://github.com/apache/arrow-datafusion/pull/7581#discussion_r1446401353
##########
datafusion/expr/src/logical_plan/plan.rs:
##########
@@ -112,6 +112,8 @@ pub enum LogicalPlan {
     /// produces 0 or 1 row. This is used to implement SQL `SELECT`
     /// that has no values in the `FROM` clause.
     EmptyRelation(EmptyRelation),
+    /// A named temporary relation with a schema.
+    NamedRelation(NamedRelation),

Review Comment:
   @jonahgao, could you share the rationale for your suggested strategy? I'd like to understand why it might be more effective than the current implementation.

   Performance is critical to our use case, and the recursion implementation is very sensitive to it: the setup for execution and stream management isn't amortized over all input record batches but is incurred again on every iteration. For instance, we've observed a substantial speedup (up to 30 times faster) by eliminating certain intermediate nodes, such as coalesce, from our plan (as evidenced in [this PR](https://github.com/matthewgapp/arrow-datafusion/pull/2)). I've drafted another PR that appears to double execution speed again merely by omitting metric collection in recursive sub-graphs.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@arrow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org