alamb commented on PR #9708:
URL:
https://github.com/apache/arrow-datafusion/pull/9708#issuecomment-2012196911
Update: I think this PR now works well enough to show this is a promising
approach. Specifically, by just fixing SimplifyExprs to not copy plans around,
the planning benchmark goes 25% faster. I am pretty sure that if we fix common
subexpression elimination and a few other passes, we could get the benchmark
running 2x as fast.
```
physical_plan_tpch_all  time:   [54.371 ms 54.512 ms 54.675 ms]
                        change: [-24.152% -23.333% -22.648%] (p = 0.00 < 0.05)
                        Performance has improved.
```
The code in this PR is atrocious, but I think it works well enough to
demonstrate the idea.
The core problem is that when the LogicalPasses run, they copy the
LogicalPlans many times (e.g. each call to `new_with_exprs` or
`new_with_children` results in copying the entire node and its embedded exprs,
though not its children, as they are `Arc`d).
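To make the copying concrete, here is a rough sketch of the difference between
rewriting through a borrow and rewriting an owned node. `Expr`, `Node`,
`simplify`, `rewrite_by_copy` and `rewrite_owned` are hypothetical stand-ins
made up for illustration. They are not DataFusion's types or the code in this
PR, and the constant folding is only a placeholder for a real pass.

```rust
use std::sync::Arc;

// Hypothetical stand-ins for illustration only, not DataFusion's real types.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Column(String),
    Literal(i64),
    Add(Box<Expr>, Box<Expr>),
}

#[derive(Debug, Clone)]
struct Node {
    exprs: Vec<Expr>,         // embedded expressions, owned by the node
    children: Vec<Arc<Node>>, // children are shared behind `Arc`
}

// Placeholder pass: folds `literal + literal`. It takes the expression by
// value, so anything it does not change is moved through, not cloned.
fn simplify(expr: Expr) -> Expr {
    match expr {
        Expr::Add(l, r) => match (simplify(*l), simplify(*r)) {
            (Expr::Literal(a), Expr::Literal(b)) => Expr::Literal(a + b),
            (l, r) => Expr::Add(Box::new(l), Box::new(r)),
        },
        other => other,
    }
}

// Borrowing style: every expression is deep-cloned before it is rewritten.
// The children are cheap, since cloning an `Arc` only bumps a refcount.
fn rewrite_by_copy(node: &Node) -> Node {
    Node {
        exprs: node.exprs.iter().cloned().map(simplify).collect(),
        children: node.children.clone(),
    }
}

// Owned style: the node is consumed and its expressions are moved into the
// rewrite, so no expression is cloned.
fn rewrite_owned(node: Node) -> Node {
    Node {
        exprs: node.exprs.into_iter().map(simplify).collect(),
        children: node.children,
    }
}

fn main() {
    let child = Arc::new(Node {
        exprs: vec![Expr::Column("a".to_string())],
        children: vec![],
    });
    let node = Node {
        exprs: vec![Expr::Add(
            Box::new(Expr::Literal(1)),
            Box::new(Expr::Literal(2)),
        )],
        children: vec![Arc::clone(&child)],
    };

    let copied = rewrite_by_copy(&node);
    // The child is shared, not duplicated, even in the copying version.
    assert!(Arc::ptr_eq(&copied.children[0], &child));

    let owned = rewrite_owned(node);
    // Both styles produce the same rewritten expressions.
    assert_eq!(owned.exprs, copied.exprs);
    println!("{:?}", owned.exprs);
}
```

Both versions give the same result; the only difference is whether the
embedded exprs are cloned along the way. With many passes each rebuilding
every node, those clones are the kind of overhead described above.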
cc @sadboy, who I think has observed this before as well.