erratic-pattern commented on issue #9873:
URL: https://github.com/apache/datafusion/issues/9873#issuecomment-2096703188

   Hey, I'm interested in helping with this. Maybe @peter-toth and I can divide 
our efforts here? Let me know what you've worked on so far, and I can figure 
out where to help.
   
   One thing I see in particular that's not directly cloning the `LogicalPlan`s 
and `Expr`s, but may be putting pressure on the global allocator, is the 
`Identifiers` in the 
[ExprSet](https://github.com/apache/datafusion/blob/2bbfbdf6699907fd8ec094f5f5600af7fe13946b/datafusion/optimizer/src/common_subexpr_eliminate.rs#L47)
 which are represented as `String`s produced by `Display`ing the `Expr`s. 
   
   Assuming that the new zero-copy implementation will continue using the 
`ExprSet`, maybe I could look into efficiently hashing subexpressions to 
produce numeric identifiers that are easier to copy.
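
   To make the idea concrete, here is a rough sketch of what I mean by 
numeric identifiers. Everything in it is hypothetical: `ToyExpr` is a toy 
stand-in and not DataFusion's `Expr`, and `ExprId`, `expr_id` and 
`count_subexprs` are names made up for illustration. It assumes the expression 
type implements `Hash` and `Eq`, and it keeps the first expression seen per id 
so that a hash collision is not counted as a duplicate.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Toy stand-in for an expression tree (assumption: NOT DataFusion's `Expr`).
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
enum ToyExpr {
    Column(String),
    Literal(i64),
    Add(Box<ToyExpr>, Box<ToyExpr>),
    Mul(Box<ToyExpr>, Box<ToyExpr>),
}

/// Hypothetical numeric identifier: a 64-bit structural hash of a subtree.
/// A `u64` is `Copy`, so passing it around never touches the allocator.
type ExprId = u64;

/// Hashes the whole subtree rooted at `expr` into a numeric id.
fn expr_id(expr: &ToyExpr) -> ExprId {
    let mut hasher = DefaultHasher::new();
    expr.hash(&mut hasher);
    hasher.finish()
}

/// Counts how often each subexpression occurs, keyed by numeric id.
/// The map stores a reference to the first expression seen for each id,
/// so a colliding but different expression is not counted as a repeat.
fn count_subexprs<'a>(
    expr: &'a ToyExpr,
    counts: &mut HashMap<ExprId, (&'a ToyExpr, usize)>,
) {
    let entry = counts.entry(expr_id(expr)).or_insert((expr, 0));
    if entry.0 == expr {
        entry.1 += 1;
    }
    match expr {
        ToyExpr::Add(l, r) | ToyExpr::Mul(l, r) => {
            count_subexprs(l, counts);
            count_subexprs(r, counts);
        }
        ToyExpr::Column(_) | ToyExpr::Literal(_) => {}
    }
}
```

   For `(a + b) * (a + b)`, `count_subexprs` records `a + b` twice under a 
single `u64` key. Note that this sketch re-hashes every subtree at each level 
of the walk, so a real version would probably want to compute the hashes 
bottom-up in one pass.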


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
