alamb opened a new issue, #9873:
URL: https://github.com/apache/arrow-datafusion/issues/9873

   ### Is your feature request related to a problem or challenge?
   
   The common subexpression elimination pass copies many `Expr`s around. You 
can see the performance impact of this in the screenshot from 
https://github.com/apache/arrow-datafusion/issues/9637#issue-2189931564
   
   While we will fix the plan-copying problem in 
https://github.com/apache/arrow-datafusion/issues/9637, I think there is more 
work to be done in the common subexpression code itself, which copies a 
significant number of `Expr`s and `String`s around.
   
   
![312892336-1c8214cc-09ec-41b2-aa5d-f52a5dfa4226](https://github.com/apache/arrow-datafusion/assets/490673/2f46700d-396e-4298-b34b-8e354ab87732)
   
   
   ### Describe the solution you'd like
   
   Figure out how to avoid unnecessary copies in 
https://github.com/apache/arrow-datafusion/blob/main/datafusion/optimizer/src/common_subexpr_eliminate.rs
   
   1. Avoid `clone`ing the `Expr`s themselves
   2. Avoid creating `String`s for `Identifier`
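
As a rough illustration of both points, here is a minimal sketch on a toy expression type. Everything in it is an assumption made for illustration: the `Expr` enum is a stand-in rather than DataFusion's real `Expr`, and `identifier`, `count_subexprs`, and `replace_common` are hypothetical names, not functions from `common_subexpr_eliminate.rs`. It identifies subexpressions by a `u64` hash instead of a formatted `String`, and rewrites the tree by value so nodes are moved rather than cloned.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

// Toy stand-in for an expression tree; NOT DataFusion's `Expr`.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
enum Expr {
    Column(String),
    Literal(i64),
    Add(Box<Expr>, Box<Expr>),
}

// Hypothetical identifier: a hash value rather than a heap-allocated String.
// Caveat: distinct expressions can collide, so a real pass would have to
// confirm equality before treating two nodes as the same subexpression.
type Identifier = u64;

fn identifier(expr: &Expr) -> Identifier {
    // Re-hashes the whole subtree on every call, which is quadratic in the
    // worst case; a real pass would likely combine child hashes bottom-up.
    let mut hasher = DefaultHasher::new();
    expr.hash(&mut hasher);
    hasher.finish()
}

// First pass: count every subexpression by borrowing only (no clones).
fn count_subexprs(expr: &Expr, counts: &mut HashMap<Identifier, usize>) {
    *counts.entry(identifier(expr)).or_insert(0) += 1;
    if let Expr::Add(left, right) = expr {
        count_subexprs(left, counts);
        count_subexprs(right, counts);
    }
}

// Second pass: take the tree by value so untouched nodes are moved, not
// cloned. Repeated `Add` nodes become a reference to a synthetic column.
fn replace_common(expr: Expr, counts: &HashMap<Identifier, usize>) -> Expr {
    let id = identifier(&expr);
    match expr {
        Expr::Add(..) if counts.get(&id).copied().unwrap_or(0) > 1 => {
            // A String is built only for nodes that are actually replaced.
            Expr::Column(format!("__common_expr_{id:x}"))
        }
        Expr::Add(left, right) => Expr::Add(
            Box::new(replace_common(*left, counts)),
            Box::new(replace_common(*right, counts)),
        ),
        other => other,
    }
}
```

With this shape, an input such as `(a + 1) + (a + 1)` is counted without allocating any identifier strings, and the rewrite replaces both `a + 1` operands with the same synthetic column while moving the rest of the tree. Whether the same approach carries over to the real pass depends on details (aliases, volatile expressions, projection building) that this sketch ignores.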
   
   We should see a significant improvement in the sql_planner benchmarks:
    
   ```shell
   cargo bench --bench sql_planner
   ```
   
   ### Describe alternatives you've considered
   
   _No response_
   
   ### Additional context
   
   I noticed this while reviewing 
https://github.com/apache/arrow-datafusion/pull/9871

