blaginin opened a new pull request, #14684:
URL: https://github.com/apache/datafusion/pull/14684

   ## Which issue does this PR close?
   
   Related to https://github.com/apache/datafusion/issues/14563  
   
   ## Rationale for this change  
   
   Currently, when `with_column_renamed` is called, a new Projection layer is 
always created. For example, in the `dataframe` benchmark, the logical plan 
looks like this:  
   
   ![Zen Browser 2025-02-15 18 38 03](https://github.com/user-attachments/assets/ec8cb4b9-3467-4599-b505-baa5557ade85)
  
   
   These layers do not affect the query itself (they are removed by 
`optimize_projections`), but they make each additional column rename, and the 
optimization pass itself, quite slow.  
   
   ## What changes are included in this PR?  
   
   I added an optimization for one edge case: the plan already has a 
projection on top and a rename would add another one. This is a common scenario, 
especially when renaming many columns (as in the `dataframe` benchmark). 
Instead of adding a new projection layer on top, we replace the existing one if 
possible.  
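
To illustrate the idea, here is a minimal, self-contained sketch using a toy plan type. The `Plan` enum and the functions `rename_naive`, `rename_merged`, and `build` are hypothetical names made up for this example; they are not DataFusion's API and not the code in this PR. The sketch also assumes every projection expression is a plain column reference, so the "if possible" check is reduced to "the top node is a projection".

```rust
// Toy model of a logical plan. Illustrative only: NOT DataFusion's types.
#[derive(Debug, Clone, PartialEq)]
enum Plan {
    Scan { columns: Vec<String> },
    // Each expression is (source column in the input, output name).
    Projection { exprs: Vec<(String, String)>, input: Box<Plan> },
}

impl Plan {
    /// Column names produced by this node.
    fn output_columns(&self) -> Vec<String> {
        match self {
            Plan::Scan { columns } => columns.clone(),
            Plan::Projection { exprs, .. } => {
                exprs.iter().map(|(_, out)| out.clone()).collect()
            }
        }
    }

    /// Number of projection layers stacked above the scan.
    fn depth(&self) -> usize {
        match self {
            Plan::Scan { .. } => 0,
            Plan::Projection { input, .. } => 1 + input.depth(),
        }
    }
}

/// Naive rename: always wraps the plan in a new projection layer.
fn rename_naive(plan: Plan, old: &str, new: &str) -> Plan {
    let exprs = plan
        .output_columns()
        .into_iter()
        .map(|c| {
            let out = if c == old { new.to_string() } else { c.clone() };
            (c, out)
        })
        .collect();
    Plan::Projection { exprs, input: Box::new(plan) }
}

/// Merged rename: if the top node is already a projection, rewrite its
/// output names in place; otherwise fall back to the naive behaviour.
fn rename_merged(plan: Plan, old: &str, new: &str) -> Plan {
    match plan {
        Plan::Projection { exprs, input } => {
            let exprs = exprs
                .into_iter()
                .map(|(src, out)| {
                    if out == old { (src, new.to_string()) } else { (src, out) }
                })
                .collect();
            Plan::Projection { exprs, input }
        }
        other => rename_naive(other, old, new),
    }
}

/// Renames a -> x, b -> y, c -> z using the given strategy.
fn build(rename: fn(Plan, &str, &str) -> Plan) -> Plan {
    let scan = Plan::Scan { columns: vec!["a".into(), "b".into(), "c".into()] };
    let plan = rename(scan, "a", "x");
    let plan = rename(plan, "b", "y");
    rename(plan, "c", "z")
}

fn main() {
    let naive = build(rename_naive);
    let merged = build(rename_merged);
    println!("naive depth = {}, merged depth = {}", naive.depth(), merged.depth());
}
```

In this toy model, three renames produce three stacked projections with the naive strategy and a single projection with the merged one, while the output column names are identical. A real implementation would also need to handle projections whose expressions are not simple column references, which this sketch does not model.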
   
   ## Are these changes tested?  
   
   Yes, the existing test case was extended.  
   
   ## Are there any user-facing changes?  
   
   No

