alamb commented on PR #9708:
URL: https://github.com/apache/arrow-datafusion/pull/9708#issuecomment-2012196911

   Update: I think this PR now works well enough to show that this is a 
promising approach. Specifically, just fixing SimplifyExprs to not copy plans 
around cuts the planning benchmark time by about 23%. I am pretty sure that if 
we fix common subexpression elimination and a few other passes, we could get 
the benchmark running 2x as fast.
   
   ```
   physical_plan_tpch_all  time:   [54.371 ms 54.512 ms 54.675 ms]
                           change: [-24.152% -23.333% -22.648%] (p = 0.00 < 0.05)
                           Performance has improved.
   ```
   
   The code in this PR is atrocious, but I think it works well enough to 
demonstrate the idea.
   
   The core problem is that when the logical optimizer passes run, they copy 
the LogicalPlans many times: e.g. each call to `new_with_exprs` or 
`new_with_children` copies the entire node and its embedded exprs (though not 
its children, as they are `Arc`d).
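
   To make the copying cost concrete, here is a minimal sketch. It does not 
use the DataFusion API: `PlanNode`, `Expr`, `rewrite_by_copy`, `rewrite_owned` 
and `fold_constants` are all hypothetical stand-ins invented for illustration. 
It contrasts a rewrite that borrows the node (and so must clone every embedded 
expr) with one that takes ownership and moves the exprs through the rewrite.

```rust
use std::sync::Arc;

/// Hypothetical stand-in for an expression tree (not DataFusion's `Expr`).
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Column(String),
    Literal(i64),
    Add(Box<Expr>, Box<Expr>),
}

/// Hypothetical stand-in for a plan node: exprs are owned inline,
/// children are shared behind `Arc`.
#[derive(Debug, Clone)]
struct PlanNode {
    exprs: Vec<Expr>,
    children: Vec<Arc<PlanNode>>,
}

/// Copying style: borrows the node and builds a new one, so every
/// expr is deep-cloned before it is rewritten, even if nothing changes.
fn rewrite_by_copy(node: &PlanNode, f: impl Fn(Expr) -> Expr) -> PlanNode {
    PlanNode {
        exprs: node.exprs.iter().cloned().map(&f).collect(),
        // Cloning a Vec<Arc<_>> only bumps reference counts.
        children: node.children.clone(),
    }
}

/// Owned style: takes the node by value and moves each expr through
/// the rewrite, so no expr is cloned.
fn rewrite_owned(node: PlanNode, f: impl Fn(Expr) -> Expr) -> PlanNode {
    PlanNode {
        exprs: node.exprs.into_iter().map(f).collect(),
        children: node.children,
    }
}

/// Toy simplification used as the rewrite: folds `Literal + Literal`.
fn fold_constants(e: Expr) -> Expr {
    match e {
        Expr::Add(l, r) => match (fold_constants(*l), fold_constants(*r)) {
            (Expr::Literal(a), Expr::Literal(b)) => Expr::Literal(a + b),
            (l, r) => Expr::Add(Box::new(l), Box::new(r)),
        },
        other => other,
    }
}
```

   Both functions produce the same rewritten node; the difference is only in 
how many expr allocations happen along the way. In either style the children 
are never deep-copied, which matches the `Arc` observation above.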
   
   
   cc @sadboy, who I think has observed this before as well


